ChatGPT interface with better UI
Generate Any 3D Scene in Seconds
Multimodal large language model designed for audio understanding
Controllable & emotion-expressive zero-shot TTS
A GPT-4o-Level MLLM for Vision, Speech and Multimodal Live Streaming
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Hunyuan Translation Model Version 1.5
High-resolution models for human tasks
Genome modeling and design across all domains of life
General-purpose image editing model that delivers high-fidelity
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
This repository contains the official implementation of FastVLM
ICLR2024 Spotlight: curation/training code, metadata, distribution
Diffusion Transformer with Fine-Grained Chinese Understanding
LLM-based reinforcement learning audio editing model
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Unified Multimodal Understanding and Generation Models
Language modeling in a sentence representation space
The ChatGPT Retrieval Plugin lets you easily find personal documents
Pushing the Limits of Mathematical Reasoning in Open Language Models
Large Multimodal Models for Video Understanding and Editing
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Open-source, high-performance Mixture-of-Experts large language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Open Multilingual Multimodal Chat LMs