State-of-the-Art Natural Language Processing
A unified analytics engine for large-scale data processing
Spark-TTS Inference Code
A free, open-source, and cross-platform big data analytics framework
Docker image used to run data processing workloads
Apache Spark to Apache Cassandra connector
Jupyter magics and kernels for working with remote Spark clusters
A unified interface for distributed computing
Apache Kyuubi is a distributed and multi-tenant gateway
R interface for Apache Spark
Simple and distributed Machine Learning
Deequ is a library built on top of Apache Spark
Command-line tool from the Alire project and supporting library
A Spark library for Amazon SageMaker
A Scala kernel for Jupyter
Cluster computing framework for processing large-scale geospatial data
Distributed DataFrame for Python designed for the cloud
An end-to-end, real-time, and cloud-native Lakehouse framework
Scalable and Flexible Gradient Boosting
Monitor the stability of a Pandas or Spark dataframe
Python Stream Processing
Apache Iceberg
Mirror of Apache Phoenix
Apache Polaris, the interoperable, open source catalog
A Cloud Native Batch System (Project under CNCF)