Showing 198 open source projects for "transformer"

  • 1
    Transformer Engine

    A library for accelerating Transformer models on NVIDIA GPUs

    Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference. TE provides a collection of highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that can be used seamlessly with your framework-specific code.
    Downloads: 30 This Week
    See Project
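    As a rough illustration of TE's PyTorch API, the sketch below runs a single te.Linear layer under fp8_autocast. It assumes an FP8-capable NVIDIA GPU (e.g. Hopper); the layer sizes and recipe settings are arbitrary placeholders, and recipe options can vary between TE versions.

    ```python
    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common import recipe

    # A TE-provided layer that can execute its GEMM in FP8.
    layer = te.Linear(768, 768, bias=True).cuda()
    x = torch.randn(16, 768, device="cuda", dtype=torch.bfloat16)

    # Delayed-scaling FP8 recipe; E4M3 is one of the supported formats.
    fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.E4M3)
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        y = layer(x)  # forward pass runs in FP8 inside the autocast region
    ```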
  • 2
    Transformer Explainer

    Learn How LLM Transformer Models Work with Interactive Visualization

    Transformer Explainer is an interactive visualization tool created to help users understand how transformer-based language models operate internally. The platform runs a lightweight GPT-2 model directly in the user’s browser and allows users to experiment with text prompts while observing the model’s internal operations. Through visual diagrams and interactive interfaces, the tool reveals how tokens are processed through layers such as embeddings, attention mechanisms, and feed-forward networks. ...
    Downloads: 2 This Week
    See Project
  • 3
    Transformer Debugger

    Tool for exploring and debugging transformer model behaviors

    Transformer Debugger (TDB) is a research tool developed by OpenAI’s Superalignment team to investigate and interpret the behaviors of small language models. It combines automated interpretability methods with sparse autoencoders, enabling researchers to analyze how specific neurons, attention heads, and latent features contribute to a model’s outputs.
    Downloads: 0 This Week
    See Project
  • 4
    Vision Transformer Pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA

    ...It’s widely used as an educational reference for people learning transformers in vision and as a lightweight baseline for research prototypes. The project encourages experimentation—swap optimizers, change augmentations, or plug the transformer backbone into downstream tasks.
    Downloads: 8 This Week
    See Project
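    As a hedged sketch, the snippet below assumes the package follows the familiar vit-pytorch constructor (a ViT class taking image/patch sizes and transformer hyperparameters); all values are illustrative toy settings, not recommendations.

    ```python
    import torch
    from vit_pytorch import ViT  # assumed package and entry point

    model = ViT(
        image_size=256,   # input resolution
        patch_size=32,    # image is split into 32x32 patches
        num_classes=1000,
        dim=1024,         # transformer embedding width
        depth=6,          # number of transformer blocks
        heads=16,
        mlp_dim=2048,
    )

    img = torch.randn(1, 3, 256, 256)
    logits = model(img)  # -> shape (1, 1000)
    ```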
  • 5
    Transformers.jl

    Julia Implementation of Transformer models

    Transformers.jl is a Julia library that implements Transformer models for natural language processing tasks. Inspired by architectures like BERT, GPT, and T5, the library offers a modular and flexible interface for building, training, and using transformer-based deep learning models. It supports training from scratch and fine-tuning pretrained models, and integrates with Flux.jl for automatic differentiation and optimization.
    Downloads: 7 This Week
    See Project
  • 6
    CTranslate2

    Fast inference engine for Transformer models

    CTranslate2 is a C++ and Python library for efficient inference with Transformer models. The project implements a custom runtime that applies many performance optimizations, such as weight quantization, layer fusion, padding removal, batch reordering, in-place operations, and caching, to accelerate Transformer models and reduce their memory usage on CPU and GPU. On supported models and tasks, execution is significantly faster and requires fewer resources than general-purpose deep learning frameworks (a minimal usage sketch follows this entry). ...
    Downloads: 9 This Week
    See Project
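    A minimal sketch of the Python inference API: the model directory is a placeholder, since a checkpoint (e.g. an OpenNMT or Hugging Face Transformers model) must first be converted with CTranslate2's converter tools.

    ```python
    import ctranslate2

    # Load a converted model directory (placeholder path).
    translator = ctranslate2.Translator("ct2_model_dir", device="cpu")

    # translate_batch expects pre-tokenized input: one token list per example.
    results = translator.translate_batch([["▁Hello", "▁world", "!"]])
    print(results[0].hypotheses[0])  # best hypothesis, as a list of tokens
    ```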
  • 7
    RF-DETR

    RF-DETR is a real-time object detection and segmentation model

    RF-DETR is an open-source computer vision framework that implements a real-time object detection and instance segmentation model based on transformer architectures. Developed by Roboflow, the project builds upon modern vision transformer backbones such as DINOv2 to achieve strong accuracy while maintaining efficient inference speeds suitable for real-time applications. The model is designed to detect objects and segment them within images or video streams using a unified detection pipeline. ...
    Downloads: 6 This Week
    See Project
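    A hedged sketch assuming the rfdetr Python package's documented entry point; the image path and confidence threshold are placeholders.

    ```python
    from PIL import Image
    from rfdetr import RFDETRBase  # assumed entry point

    model = RFDETRBase()  # loads pretrained COCO weights
    image = Image.open("example.jpg")  # placeholder image

    detections = model.predict(image, threshold=0.5)
    print(detections)  # boxes, class ids, and confidence scores
    ```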
  • 8
    Megatron

    Ongoing research training transformer models at scale

    Megatron is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This repository is for ongoing research on training large transformer language models at scale. It provides efficient model-parallel (tensor, sequence, and pipeline) and multi-node pre-training of transformer-based models such as GPT, BERT, and T5 using mixed precision.
    Downloads: 0 This Week
    See Project
  • 9
    Curated Transformers

    PyTorch library of curated Transformer models and their components

    State-of-the-art transformers, brick by brick. Curated Transformers is a transformer library for PyTorch. It provides state-of-the-art models that are composed of a set of reusable components. Supports state-of-the-art transformer models, including LLMs such as Falcon, Llama, and Dolly v2. Implementing a feature or bugfix benefits all models. For example, all models support 4/8-bit inference through the bitsandbytes library and each model can use the PyTorch meta device to avoid unnecessary allocations and initialization.
    Downloads: 9 This Week
    See Project
  • 10
    Pyreft

    ReFT: Representation Finetuning for Language Models

    PyreFT is a tool by Stanford NLP for fine-tuning transformer models with an emphasis on efficient, resource-conserving training and customizability for NLP tasks.
    Downloads: 6 This Week
    See Project
  • 11
    Detoxify

    Trained models & code to predict toxic comments

    Detoxify is a deep learning-based tool for detecting and filtering toxic language in online conversations, leveraging Transformer models for high accuracy.
    Downloads: 2 This Week
    See Project
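    A minimal usage sketch; "original" selects the model trained on the Jigsaw toxic-comment data, and the input string is a placeholder.

    ```python
    from detoxify import Detoxify

    model = Detoxify("original")
    scores = model.predict("your comment text here")
    print(scores)  # dict of per-label probabilities, e.g. 'toxicity'
    ```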
  • 12
    model2Vec

    Fast State-of-the-Art Static Embeddings

    ...The resulting models can be used for a wide range of tasks, including semantic search, clustering, classification, and retrieval-augmented generation systems. One of its key advantages is its simplicity, as it requires minimal dependencies and can generate embeddings extremely quickly compared to traditional transformer-based approaches.
    Downloads: 5 This Week
    See Project
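    A minimal inference sketch; the checkpoint name is one of the publicly released Model2Vec models and serves only as an example.

    ```python
    from model2vec import StaticModel

    # Load a small pretrained static-embedding model.
    model = StaticModel.from_pretrained("minishlab/potion-base-8M")
    embeddings = model.encode(["static embeddings are fast", "hello world"])
    print(embeddings.shape)  # (2, embedding_dim)
    ```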
  • 13
    ACE-Step 1.5

    The most powerful local music generation model

    ACE-Step 1.5 is an advanced open-source foundation model for AI-driven music generation that pushes beyond traditional limitations in speed, musical coherence, and controllability through innovations in architecture and training design. It combines diffusion-based synthesis with compressed autoencoders and lightweight transformer elements to produce high-quality full-length music tracks with rapid inference: it can generate a complete song in seconds on modern GPUs, while remaining efficient enough to run on consumer-grade hardware with minimal memory requirements. Beyond straightforward text-to-music synthesis, ACE-Step 1.5 enables flexible creative workflows, including cover generation, editing existing tracks, transforming vocals into background accompaniment, and stylistic personalization via low-rank adaptation from just a few example songs.
    Downloads: 123 This Week
    See Project
  • 14
    MoBA

    MoBA: Mixture of Block Attention for Long-Context LLMs

    ...The approach allows language models to scale to significantly longer input contexts without the quadratic computational cost normally associated with transformer attention mechanisms.
    Downloads: 0 This Week
    See Project
  • 15
    SeedVR

    Repo for SeedVR2 & SeedVR

    ...SeedVR’s transformer-based design allows it to handle variable frame resolutions and lengths, and its architecture is optimized to overcome traditional limitations of windowed attention in high-resolution contexts.
    Downloads: 2 This Week
    See Project
  • 16
    Z-Image

    Image generation model with single-stream diffusion transformer

    Z-Image is an efficient, open-source image generation foundation model built to make high-quality image synthesis more accessible. With just 6 billion parameters — far fewer than many large-scale models — it uses a novel “single-stream diffusion Transformer” architecture to deliver photorealistic image generation, demonstrating that excellence does not always require extremely large model sizes. The project includes several variants: Z-Image-Turbo, a distilled version optimized for speed and low resource consumption; Z-Image-Base, the full-capacity foundation model; and Z-Image-Edit, fine-tuned for image editing tasks. ...
    Downloads: 27 This Week
    See Project
  • 17
    FlashAttention

    Fast and memory-efficient exact attention

    FlashAttention is a high-performance deep learning optimization library that reimplements the attention mechanism used in transformer models to be significantly faster and more memory-efficient than standard implementations. It achieves this by using IO-aware algorithms that minimize memory reads and writes, reducing the quadratic memory overhead typically associated with attention operations. The project provides implementations of FlashAttention, FlashAttention-2, and newer iterations optimized for modern GPU architectures such as NVIDIA Hopper and AMD accelerators. ...
    Downloads: 46 This Week
    See Project
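    A minimal sketch of the functional interface; it assumes a CUDA GPU and fp16/bf16 tensors, with shapes (batch, seqlen, num_heads, head_dim).

    ```python
    import torch
    from flash_attn import flash_attn_func

    q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
    k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
    v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

    # Exact (not approximate) attention, computed tile-by-tile in SRAM.
    out = flash_attn_func(q, k, v, causal=True)
    print(out.shape)  # (2, 1024, 8, 64)
    ```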
  • 18
    xFormers

    Hackable and optimized Transformers building blocks

    xFormers is a modular, performance-oriented library of transformer building blocks, designed to allow researchers and engineers to compose, experiment, and optimize transformer architectures more flexibly than monolithic frameworks. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap optimized kernels easily.
    Downloads: 5 This Week
    See Project
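    A minimal sketch of the memory-efficient attention operator; shapes are (batch, seqlen, num_heads, head_dim), and a CUDA GPU with fp16 tensors is assumed.

    ```python
    import torch
    import xformers.ops as xops

    q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
    k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
    v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

    # Dispatches to an optimized kernel chosen for the hardware and shapes.
    out = xops.memory_efficient_attention(q, k, v)
    print(out.shape)  # (2, 1024, 8, 64)
    ```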
  • 19
    TabPFN

    Foundation Model for Tabular Data

    TabPFN is an open-source machine learning system that introduces a foundation model designed specifically for tabular data analysis. The model is based on transformer architectures and implements a prior-data fitted network that can perform supervised learning tasks such as classification and regression with minimal configuration. Unlike many traditional machine learning workflows that require extensive hyperparameter tuning and training cycles, TabPFN is pre-trained to perform inference directly on tabular datasets. ...
    Downloads: 10 This Week
    See Project
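    A minimal sketch of the scikit-learn style interface on a small public dataset; no hyperparameter tuning is involved.

    ```python
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from tabpfn import TabPFNClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = TabPFNClassifier()  # pre-trained; fit() mostly conditions on the data
    clf.fit(X_train, y_train)
    print(clf.score(X_test, y_test))  # prediction is a single forward pass
    ```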
  • 20
    Intel LLM Library for PyTorch

    Accelerate local LLM inference and finetuning

    Intel LLM Library for PyTorch is an open-source acceleration library developed to optimize large language model inference and fine-tuning on Intel hardware platforms. Built as an extension of the PyTorch ecosystem, the library enables developers to run modern transformer models efficiently on Intel CPUs, GPUs, and specialized AI accelerators. The framework provides hardware-aware optimizations and low-precision computation techniques that significantly improve the performance of large language models while reducing memory consumption. IPEX-LLM supports a wide range of popular models, including architectures such as LLaMA, Mistral, Qwen, and other transformer-based systems. ...
    Downloads: 1 This Week
    See Project
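    A hedged sketch of the drop-in Transformers-style API; the checkpoint id is a placeholder, and load_in_4bit enables the library's low-precision weight format.

    ```python
    from ipex_llm.transformers import AutoModelForCausalLM  # drop-in class
    from transformers import AutoTokenizer

    model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
    model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer("The capital of France is", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=16)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    ```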
  • 21
    HunyuanDiT

    Diffusion Transformer with Fine-Grained Chinese Understanding

    HunyuanDiT is a high-capability text-to-image diffusion transformer with bilingual (Chinese/English) understanding and multi-turn dialogue capability. It trains a diffusion model in latent space using a transformer backbone and integrates a Multimodal Large Language Model (MLLM) to refine captions and support conversational image generation. It supports adapters such as ControlNet, IP-Adapter, and LoRA, and can run under constrained VRAM via distilled versions.
    Downloads: 1 This Week
    See Project
  • 22
    TRIBE v2

    A multimodal model for brain response prediction

    ...It is designed for in-silico neuroscience, enabling researchers to model how the brain responds to complex real-world inputs. The system integrates state-of-the-art encoders—including LLaMA for text, V-JEPA for video, and Wav2Vec-BERT for audio—into a unified Transformer architecture. This combined representation is mapped onto the cortical surface to predict fMRI responses across thousands of brain regions. TRIBE v2 allows researchers to simulate and analyze brain activity without requiring direct human experiments. Overall, it provides a powerful tool for studying perception, cognition, and multimodal processing in the brain.
    Downloads: 7 This Week
    See Project
  • 23
    SageAttention

    [NeurIPS 2025 Spotlight] Quantized Attention

    SageAttention is an open-source optimization library designed to accelerate the attention mechanism used in transformer-based neural networks. Since attention operations are often the most computationally expensive component of modern AI models, SageAttention introduces quantization techniques that significantly reduce computational overhead while preserving model accuracy. The system achieves this by using low-precision numerical formats such as INT4, FP8, or INT8 to represent key matrices within the attention computation. ...
    Downloads: 0 This Week
    See Project
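    A hedged sketch assuming the library's drop-in attention kernel; tensor_layout "HND" here means (batch, heads, seqlen, head_dim), and CUDA fp16 tensors are assumed.

    ```python
    import torch
    from sageattention import sageattn  # assumed entry point

    q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
    k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
    v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

    # Quantized attention, intended as a drop-in replacement for standard kernels.
    out = sageattn(q, k, v, tensor_layout="HND", is_causal=False)
    print(out.shape)  # (2, 8, 1024, 64)
    ```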
  • 24
    x-transformers

    A simple but complete full-attention transformer

    A simple but complete full-attention transformer with a set of promising experimental features from various papers. One such feature adds learned memory key/values that are attended to alongside the input: the original authors were able to remove feedforwards altogether and attain performance similar to the original transformers, and the library's author has found that keeping the feedforwards while adding the memory key/values performs even better (see the sketch after this entry).
    Downloads: 6 This Week
    See Project
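    A toy decoder configuration using the memory key/values option mentioned above (attn_num_mem_kv); all sizes are illustrative.

    ```python
    import torch
    from x_transformers import TransformerWrapper, Decoder

    model = TransformerWrapper(
        num_tokens=20000,
        max_seq_len=1024,
        attn_layers=Decoder(
            dim=512,
            depth=6,
            heads=8,
            attn_num_mem_kv=16,  # 16 learned memory key/value pairs per layer
        ),
    )

    tokens = torch.randint(0, 20000, (1, 1024))
    logits = model(tokens)  # -> (1, 1024, 20000)
    ```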
  • 25
    BitNet

    BitNet: Scaling 1-bit Transformers for Large Language Models

    BitNet is a machine learning research implementation that explores extremely low-precision neural network architectures designed to dramatically reduce the computational cost of large language models. The project implements the BitNet architecture described in research on scaling transformer models using extremely low-bit quantization techniques. In this approach, neural network weights are quantized to approximately one bit per parameter, allowing models to operate with far lower memory usage than traditional 16-bit or 32-bit neural networks. The architecture introduces specialized layers such as BitLinear, which replace standard linear projections in transformer networks with quantized operations. ...
    Downloads: 0 This Week
    See Project
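    For intuition only, here is a toy plain-PyTorch sketch of the BitLinear idea (not the project's implementation): weights are binarized to ±1 scaled by their mean magnitude, with a straight-through estimator so the latent full-precision weights still receive gradients.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyBitLinear(nn.Linear):
        """Toy illustration of 1-bit weights; NOT the project's code."""
        def forward(self, x: torch.Tensor) -> torch.Tensor:
            scale = self.weight.abs().mean()         # per-tensor scale factor
            w_bin = torch.sign(self.weight) * scale  # ~1-bit weights, rescaled
            # Straight-through estimator: binarized forward pass,
            # gradients flow to the latent full-precision weights.
            w = self.weight + (w_bin - self.weight).detach()
            return F.linear(x, w, self.bias)

    layer = ToyBitLinear(64, 64)
    print(layer(torch.randn(4, 64)).shape)  # (4, 64)
    ```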