Performance meets Productivity
The CUDA target for Numba
Thin, unified, C++-flavored wrappers for the CUDA APIs
CUDA Core Compute Libraries
A NumPy-compatible array library accelerated by CUDA
Lightning-fast C++/CUDA neural network framework
Build an automated pipeline that converts CUDA APIs into Numba
A Python framework for accelerated simulation, data generation
Development repository for the Triton language and compiler
OpenCV wrapper for .NET
A language for fast, portable data-parallel computation
C++ library for high performance inference on NVIDIA GPUs
The official SuiteSparse library: a suite of sparse matrix algorithms
GPU DataFrame Library
Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun
Geometric deep learning extension library for PyTorch
Jittor is a high-performance deep learning framework
A fast compiler cache
Multi-platform high-performance compute language extension for Rust
NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
2D and 3D face alignment library built using PyTorch
ArrayFire, a general purpose GPU library
A data-parallel functional programming language
Library for efficient similarity search and clustering of dense vectors
A set of Docker images for training and serving models in TensorFlow