30 projects for "simd" with 1 filter applied:

  • 1
    HighwayHash

    Fast strong hash functions: SipHash/HighwayHash

    HighwayHash is a fast, keyed hash function intended for scenarios where you need strong, DoS-resistant hashing without the full overhead of a general-purpose cryptographic hash. It’s designed to defeat hash-flooding attacks by mixing input with wide SIMD operations and a branch-free inner loop, so adversaries can’t cheaply craft many colliding keys. The implementation targets multiple CPU families with vectorized code paths while keeping a portable fallback, yielding high throughput across platforms. It exposes simple one-shot and streaming APIs, so you can hash short keys or long byte streams with the same function. ...
    Downloads: 0 This Week
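    The keyed-hash idea described above can be sketched as follows. This is only an illustration of keyed hashing: it uses BLAKE2b from Python's standard library as a stand-in, not HighwayHash itself, and the helper name `keyed_hash64` is hypothetical.

```python
import hashlib
import os

def keyed_hash64(key: bytes, data: bytes) -> int:
    """Return a 64-bit keyed hash of data (BLAKE2b stand-in, not HighwayHash)."""
    digest = hashlib.blake2b(data, key=key, digest_size=8).digest()
    return int.from_bytes(digest, "little")

# A secret per-process key: without it, an attacker cannot predict
# which inputs land in the same hash-table bucket.
secret_key = os.urandom(32)
other_key = os.urandom(32)

h1 = keyed_hash64(secret_key, b"user-supplied key")
h2 = keyed_hash64(secret_key, b"user-supplied key")
h3 = keyed_hash64(other_key, b"user-supplied key")
```

    The same input hashes to the same value under one key and, with overwhelming probability, to a different value under another key.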
  • 2
    Polars

    Dataframes powered by a multithreaded, vectorized query engine

    Polars is a high-performance, multi-language DataFrame library built in Rust using Apache Arrow. It delivers blazing-fast, vectorized, and parallel data manipulation with both eager and lazy execution, making it an excellent tool for data processing in Python, Rust, Node.js, R, and SQL contexts.
    Downloads: 1 This Week
  • 3
    wllama

    WebAssembly binding for llama.cpp - Enabling on-browser LLM inference

    ...Built as a binding for the llama.cpp inference engine, the project allows developers to run LLM models locally without requiring a server backend or dedicated GPU hardware. The library leverages WebAssembly SIMD capabilities to achieve efficient execution within modern browsers while maintaining compatibility across platforms. By running models locally on the user’s device, wllama enables privacy-preserving AI applications that do not require sending data to remote servers. The framework provides both high-level APIs for common tasks such as text generation and embeddings, as well as low-level APIs that expose tokenization, sampling controls, and model state management.
    Downloads: 6 This Week
  • 4
    Zerocopy

    Zerocopy makes zero-cost memory manipulation effortless

    Zerocopy is a Rust library designed to make zero-cost memory manipulation both safe and effortless. It allows developers to reinterpret or convert raw byte sequences into structured types—and vice versa—without writing unsafe code directly. The crate provides safe abstractions for transmuting data while preserving Rust’s strict safety guarantees, removing the need for manual memory manipulation. Zerocopy introduces a suite of conversion traits such as TryFromBytes, FromBytes, IntoBytes, and...
    Downloads: 9 This Week
  • 5
    sleef

    Vectorized libm

    SLEEF stands for SIMD Library for Evaluating Elementary Functions. It implements vectorized versions of all C99 math functions, using the SIMD instructions of modern processors to make computation more efficient. The library also includes vectorized DFT subroutines.
    Downloads: 4 This Week
  • 6
    Armadillo

    fast C++ library for linear algebra & scientific computing

    * Fast C++ library for linear algebra (matrix maths) and scientific computing
    * Easy to use functions and syntax, deliberately similar to Matlab / Octave
    * Uses template meta-programming techniques to increase efficiency
    * Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
    * Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
    * Downloads:...
    Downloads: 2,730 This Week
  • 7

    Fosite - advection problem solver

    numerical simulation code for solving transport equations in 1D/2D/3D

    ...Fosite is written with object-oriented patterns in Fortran 2003 and follows the Structure of Arrays (SoA) layout, operating on generic field datatypes. This allows for high performance on modern architectures (SIMD). It is parallelized and vectorized. The software is thereby optimized for the NEC SX-Aurora TSUBASA Vector Engine.
    Downloads: 2 This Week
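    The Structure of Arrays (SoA) layout mentioned above can be sketched as follows. This is a layout illustration only, written in Python rather than Fosite's Fortran 2003; the field names `density` and `velocity` are hypothetical, and Python itself will not vectorize the loop.

```python
from array import array

# Array of Structures (AoS): one record per cell, fields interleaved.
cells_aos = [
    {"density": 1.0, "velocity": 2.0},
    {"density": 3.0, "velocity": 4.0},
    {"density": 5.0, "velocity": 6.0},
]

# Structure of Arrays (SoA): one contiguous array per field, so a
# compiled kernel can load neighbouring values into one SIMD register.
density = array("d", (c["density"] for c in cells_aos))
velocity = array("d", (c["velocity"] for c in cells_aos))

def momentum_soa(density, velocity):
    """Element-wise product over the two contiguous field arrays."""
    return array("d", (d * v for d, v in zip(density, velocity)))

momentum = momentum_soa(density, velocity)
```

    In a compiled language, the element-wise loop over `density` and `velocity` is the kind of loop a vectorizing compiler can map onto SIMD instructions, because each field is stored contiguously.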
  • 8

    libsombrero

    Astronomical object/structure detection from 1D and 2D data sets.

    Sombrero is a fast wavelet image processing and object detection C library for astronomical images. It is named after the "Mexican Hat" shape of the wavelet masks used in image convolution and is released under the GNU LGPL.
    Downloads: 0 This Week
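    For readers unfamiliar with the "Mexican Hat" shape, here is a minimal sketch of the standard continuous Mexican hat (Ricker) wavelet. The function name and the `sigma` parameter are assumptions for illustration; the discrete masks libsombrero actually uses may differ.

```python
import math

def mexican_hat(t: float, sigma: float = 1.0) -> float:
    """Standard Mexican hat (Ricker) wavelet evaluated at t, with width sigma."""
    x = t / sigma
    norm = 2.0 / (math.sqrt(3.0 * sigma) * math.pi ** 0.25)
    return norm * (1.0 - x * x) * math.exp(-0.5 * x * x)

# Sample the wavelet on a small symmetric grid, as a 1D convolution mask would.
mask = [mexican_hat(t) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
```

    The curve has a positive central peak, crosses zero at t = ±sigma, and dips negative on either side, which gives the hat-like profile.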
  • 9
    libjpeg-turbo

    SIMD-accelerated libjpeg-compatible JPEG codec library

    libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression on x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is generally 2-6x as fast as libjpeg, all else being equal. On other types of systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by virtue of its highly-optimized Huffman coding routines.
    Downloads: 44,917 This Week
  • 10
    TurboPFor

    Fastest Integer Compression

    All functions are available for AMD/Intel, 64-bit ARMv8 NEON (Linux and macOS/M1) and Power9 AltiVec. 100% C (C++ headers), as simple as memcpy. OS: Linux (amd64, arm64, Power9) and macOS (AMD/Intel and Apple M1).
    Downloads: 0 This Week
  • 11

    dispy

    Distributed and Parallel Computing with/for Python.

    dispy is a generic and comprehensive, yet easy to use, framework for creating and using compute clusters to execute computations in parallel across multiple processors in a single machine (SMP), or among many machines in a cluster, grid or cloud. dispy is well suited to the data-parallel (SIMD) paradigm, where a computation (a Python function or standalone program) is evaluated with different (large) datasets independently. dispy supports public / private / hybrid cloud computing and fog / edge computing.
    Downloads: 12 This Week
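    The data-parallel pattern described above, one computation evaluated independently on several datasets, can be sketched as follows. This sketch deliberately uses `concurrent.futures` from the Python standard library on a single machine, not dispy's own API; the `compute` function and the datasets are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def compute(dataset):
    """The same computation applied to each dataset independently."""
    return sum(x * x for x in dataset)

datasets = [[1, 2, 3], [4, 5], [6]]

# Each dataset is submitted as an independent job; results come back
# in the same order as the inputs.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(compute, datasets))
```

    Because the jobs share no state, the same structure carries over to a framework that distributes them across processors or machines.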
  • 12

    LightPCC

    Parallel pairwise correlation computation on Intel Xeon Phi clusters

    The first parallel and distributed library for pairwise correlation/dependence computation on Intel Xeon Phi clusters. This library is written in C++ template classes and achieves high speed by exploring the SIMD-instruction-level and thread-level parallelism within Xeon Phis as well as accelerator-level parallelism among multiple Xeon Phis. To facilitate balanced workload distribution, we have proposed a general framework for symmetric all-pairs computation by building provable bijective functions between job identifier and coordinate space for the first time.
    Downloads: 0 This Week
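    The idea of a bijection between job identifiers and pair coordinates can be sketched as follows. This is one well-known triangular-number mapping, assumed here for illustration; it is not necessarily the mapping LightPCC uses, and the function names are hypothetical.

```python
from math import isqrt

def pair_to_job(i: int, j: int) -> int:
    """Map a pair (i, j) with 0 <= j <= i to a unique job id."""
    return i * (i + 1) // 2 + j

def job_to_pair(k: int) -> tuple:
    """Inverse mapping: recover (i, j) from job id k >= 0."""
    # i is the largest integer with i * (i + 1) / 2 <= k.
    i = (isqrt(8 * k + 1) - 1) // 2
    j = k - i * (i + 1) // 2
    return i, j

# For n variables there are n * (n + 1) / 2 symmetric pairs (diagonal included).
n = 5
num_jobs = n * (n + 1) // 2
pairs = [job_to_pair(k) for k in range(num_jobs)]
```

    With such a mapping, a contiguous range of job ids can be handed to each worker, and every worker derives its own pairs without a shared lookup table.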
  • 13

    smartIDS

    Lightweight intrusion detection for IoT and embedded devices.

    The aim of the project is a lightweight intrusion detection library for embedded devices which supports MSP430 and ARM Cortex based devices. Features include DSP/SIMD support, IoT and embedded protocols, distributed operation, event and history management, and tool-supported configuration and visualization. There is a Java port that supports fewer features.
    Downloads: 0 This Week
  • 14

    RandomLib

    Random number library

    RandomLib is a C++ interface to the Mersenne Twister random number generator MT19937 and to the SIMD-oriented Fast Mersenne Twister random number generator, SFMT19937. For documentation, visit http://randomlib.sf.net
    Downloads: 1 This Week
  • 15

    SWAPHI-LS: Alignment on Xeon Phi Cluster

    Smith-Waterman long DNA sequence alignment on Xeon Phi clusters

    The first parallel Smith-Waterman algorithm exploiting Intel Xeon Phi clusters to accelerate the alignment of long DNA sequences. This algorithm is written in C++ (with a set of SIMD intrinsic extensions), OpenMP and MPI. The performance evaluation revealed that our algorithm achieves very stable performance, and yields a performance of up to 30.1 GCUPS on a single Xeon Phi and up to 111.4 GCUPS on four Xeon Phis sharing a host.
    Downloads: 0 This Week
  • 16
    Mathematical library utilising SIMD features of common processors to accelerate many commonly-used algorithms where compilers fear to tread.
    Downloads: 1 This Week
  • 17
    Virtual Lighttable and Darkroom
    Darktable is a virtual lighttable and darkroom for photographers: it manages your digital negatives in a database and lets you view them through a zoomable light table. It also enables you to develop raw images and enhance them.
    Downloads: 35 This Week
  • 18

    Block Matrix library

    Highly efficient implementation of BLAS for sparse block matrices.

    Highly efficient implementation of BLAS for sparse block matrices. Accelerated using heavy-duty C++ meta-programming, SIMD instructions and GPU.
    Downloads: 0 This Week
  • 19
    Vector3D SSE
    A C++ header library for fast operations on 3D vectors and 3x3 matrices using Streaming SIMD Extensions (SSE, SSE2, SSE3, SSE4). It tends to be used in 3D graphics applications and game development.
    Downloads: 0 This Week
  • 20
    A cross-platform library that computes fast and accurate SIFT image features. libsiftfast provides Octave/Matlab scripts, a command line interface, and a Python interface (siftfastpy). Optimized with SIMD instructions and OpenMP.
    Downloads: 0 This Week
  • 21
    An efficient implementation of the Smith-Waterman algorithm that takes advantage of SIMD instruction sets in modern CPUs. The Smith-Waterman algorithm is used for sequence alignment in bioinformatics.
    Downloads: 0 This Week
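    As background, the Smith-Waterman recurrence can be sketched as follows. This is a plain scalar version with a linear gap penalty, not the SIMD implementation the project provides; the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for ca in a:
        curr = [0]  # first column is always zero
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # Clamping at zero lets a local alignment restart anywhere.
            cell = max(0, diag, up, left)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best

score = smith_waterman_score("TTACGTT", "GGACGGG")
```

    Each cell depends only on its left, upper and upper-left neighbours, so cells along an anti-diagonal are independent of one another; that independence is what SIMD implementations typically exploit.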
  • 22
    SSEPlus is a SIMD function library. It provides optimized emulation for newer SSE instructions. It also provides a rich set of high performance routines for common operations such as arithmetic, bitwise logic, and data packing and unpacking.
    Downloads: 0 This Week
  • 23
    A data-parallel scientific programming model. Compiles efficiently to different platforms such as distributed memory (MPI), shared-memory multiprocessors (pthreads), the Cell BE processor, NVIDIA CUDA, SIMD vectorization (SSE, AltiVec), and sequential C++ code.
    Downloads: 0 This Week
  • 24
    GENIAL is a C++ library for signal and image processing. It uses template-generic techniques, multithreading, cache optimization and SIMD instructions for Pentium (MMX, SSE, SSE2, SSE3) to achieve high performance: FFT, DCT, convolution, linear algebra...
    Downloads: 0 This Week
  • 25
    Low-level processor benchmark for x86 and amd64 processors: measures exact latency and throughput for each assembly instruction, and automatically finds execution units. Special focus on SIMD (MMX, SSE) instructions.
    Downloads: 0 This Week