Access to Anthropic's safety-first language model APIs
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Repo for SeedVR2 & SeedVR
4M: Massively Multimodal Masked Modeling
A Powerful Native Multimodal Model for Image Generation
Block Diffusion for Ultra-Fast Speculative Decoding
Designed for text embedding and ranking tasks
Large Multimodal Models for Video Understanding and Editing
Repo of Qwen2-Audio chat & pretrained large audio language model
Collection of Gemma 3 variants that are trained for performance
Long-form streaming TTS system for multi-speaker dialogue generation
MiniMax-M2, a model built for Max coding & agentic workflows
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
LTX-Video Support for ComfyUI
OCR expert VLM powered by Hunyuan's native multimodal architecture
LLM-based Reinforcement Learning audio editing model
Inference script for Oasis 500M
New set of lightweight, state-of-the-art open foundation models
Global weather forecasting model using graph neural networks and JAX
The official PyTorch implementation of Google's Gemma models
Instructions on how to use the Realtime API on Microcontrollers
Implementation of the Surya Foundation Model for Heliophysics
A SOTA open-source image editing model
Pretrained time-series foundation model developed by Google Research
Official implementation of DreamCraft3D