17 projects for "spark" with 2 filters applied:

  • 1
    Apache Spark

    A unified analytics engine for large-scale data processing

    ...With Spark Streaming (microbatches) and Structured Streaming, it delivers low-latency event processing suitable for real-time analytics. The built-in MLlib library provides scalable machine learning algorithms, while GraphX enables graph computations integrated with data pipelines. Spark supports multiple languages—Scala, Java, Python, R—and connects with many storage systems like HDFS, S3, Cassandra, and streaming platforms like Kafka, making it a versatile choice for big data workloads in analytics, ETL, and data science.
    Downloads: 3 This Week
  • 2
    Deequ

    Deequ is a library built on top of Apache Spark

    Deequ is a library built on top of Apache Spark for defining “unit tests for data”: formal constraints or checks that verify data quality along dimensions such as completeness, uniqueness, value ranges, and correlations. It scales to large datasets (billions of rows) by translating those checks into Spark jobs. Deequ also provides a metrics repository for storing computed statistics over time, anomaly detection on data quality metrics, and automatic suggestion of likely constraints for new datasets. ...
    Downloads: 9 This Week
  • 3
    Apache Sedona

    Cluster computing framework for processing large-scale geospatial data

    Apache Sedona™ is a cluster computing system for processing large-scale spatial data. Sedona extends existing cluster computing systems, such as Apache Spark and Apache Flink, with a set of out-of-the-box distributed Spatial Datasets and Spatial SQL that efficiently load, process, and analyze large-scale spatial data across machines. According to the project's own benchmark and third-party research papers, Sedona runs 2x to 10x faster than other Spark-based geospatial data systems on computation-intensive query workloads. ...
    Downloads: 1 This Week
  • 4
    Laravel Lang

    List of 126 languages for Laravel Framework, Laravel Jetstream, etc.

    List of 126 languages for Laravel Framework, Laravel Jetstream, Laravel Fortify, Laravel Breeze, Laravel Cashier, Laravel Nova, Laravel Spark and Laravel UI. This package is recommended because it lets you quickly update all the dependencies needed for application localization.
    Downloads: 5 This Week
  • 5
    Soot

    Soot - A Java optimization framework

    Soot is a Java optimization framework. It provides four intermediate representations for analyzing and transforming Java bytecode. Baf: a streamlined representation of bytecode which is simple to manipulate. Jimple: a typed 3-address intermediate representation suitable for optimization. Shimple: an SSA variation of Jimple. Grimp: an aggregated version of Jimple suitable for decompilation and code inspection.
    Downloads: 7 This Week
  • 6
    Serverless Java container

    A Java wrapper to run Spring, Spring Boot, Jersey, and other apps

    The AWS Serverless Java Container library is a framework that allows developers to run existing or new Java web applications—built with frameworks such as Spring, Jersey, Spark, Struts—inside AWS Lambda with minimal modifications. It bridges the gap between traditional servlet or web-framework models and serverless functions by mapping HTTP events from API Gateway into requests your framework understands and routing responses back appropriately. This means you can keep much of your familiar Java-based architecture (controllers, filters, dependency injection) and deploy it in a serverless environment without rewriting everything from scratch. ...
    Downloads: 0 This Week
  • 7
    Kedro

    A Python framework for creating reproducible, maintainable code

    Kedro is an open-source Python framework for creating maintainable and modular data science code. It provides the scaffolding to build more complex data and machine-learning pipelines, and it aims to reduce the time spent on the tedious "plumbing" required to maintain data science code, leaving more time to solve new problems. It also standardises team workflows: the modular structure of Kedro facilitates a higher level of collaboration when teams solve problems...
    Downloads: 9 This Week
  • 8
    Smallpond

    A lightweight data processing framework built on DuckDB and 3FS

    ...The idea is to preserve DuckDB’s fast analytics engine but lift it from single-node to multi-node settings, giving you the ability to operate on large datasets (e.g. petabyte scale) without moving to a heavyweight system like Spark. Users write Python-like code (via DataFrame APIs or SQL strings) to express their transformations; behind the scenes, tasks are scheduled (often via Ray) and pushed into DuckDB instances operating on partitioned data. Because the storage layer (3FS) is optimized for random access and high throughput, smallpond can shuffle data, repartition, and manage intermediate results across nodes.
    Downloads: 0 This Week
  • 9
    WTFJS

    A list of funny and tricky JavaScript examples

    ...It’s designed as both a fun read and a serious learning aid, helping developers build an intuition for how JavaScript evaluates expressions. By highlighting common misconceptions, it encourages safer coding patterns and more reliable mental models. Teachers, interviewers, and learners use it to spark discussion and deepen understanding of JavaScript’s semantics.
    Downloads: 8 This Week
  • 10
    Albedo

    A recommender system for discovering GitHub repos

    Albedo is an open-source recommender system aimed at helping developers discover GitHub repositories by learning from activity signals. It treats repositories and developers as a graph of interactions and applies large-scale matrix factorization to model affinities, with Apache Spark providing the distributed data processing. The project focuses on implicit feedback—stars, watches, and other engagement metrics—so it can build useful recommendations without explicit ratings. A reproducible setup and Makefile-driven workflow streamline tasks like spinning up services, loading datasets, training models, and generating candidate lists. ...
    Downloads: 0 This Week
  • 11
    PixieDust

    Python Helper library for Jupyter Notebooks

    PixieDust is an open source Python helper library that works as an add-on to Jupyter notebooks to improve the user experience of working with data. It also fills a gap for users who have no access to configuration files when a notebook is hosted on the cloud.
    Downloads: 0 This Week
  • 12
    benchm-ml

    A benchmark of commonly used open source implementations

    ...The benchmarks cover algorithms like logistic regression, random forest, gradient boosting, and deep neural networks, and they compare across toolkits such as scikit-learn, R packages, xgboost, H2O, Spark MLlib, etc. The repository is structured in logical folders, each corresponding to algorithm categories.
    Downloads: 0 This Week
  • 13
    JUDO
    JUDO is a Java IDE for children and beginning programmers. JUDO is designed to be an educational tool to teach programming concepts and to spark excitement and interest in programming.
    Downloads: 0 This Week
  • 14
    Spark is an expressive, dynamic programming language. It is a multi-paradigm language with strong support for explorative programming. It is easy to learn and has an extensive library. Spark runs on GNU/Linux and Windows (Cygwin).
    Downloads: 0 This Week
  • 15
    Spark is a Java library that converts data in Macromedia's SWF ("Flash") data format to XML conforming to a specialized DTD, and vice versa. The primary goal of Spark is to make it easier to work with SWF in a Java- and XML-based server environment.
    Downloads: 0 This Week
  • 16
    SPARKUnit is a unit test framework for the SPARK programming language. It enables developers to create unit tests which can be analysed by the SPARK Examiner. This allows for testing of operations with preconditions and flow analysis of test cases.
    Downloads: 0 This Week
  • 17
    Spark is an s-expression-based language (like Lisp or Scheme) that will be built to be closer to the hardware (like C).
    Downloads: 0 This Week