An Open Source text-to-speech system built by inverting Whisper
Bailing is a voice dialogue robot similar to GPT-4o
Production-ready toolkit to run AI locally
Qwen3-ASR is an open-source series of ASR models
Large Audio Language Model built for natural interactions
A nearly-live implementation of OpenAI's Whisper
Use Microsoft Edge's online text-to-speech service from Python
Towards Human-Sounding Speech
Framework for building realtime multimodal voice AI agent apps
Virtual AI anchor that combines state-of-the-art technology
MARS5 speech model (TTS) from CAMB.AI
StreamSpeech is a seamless model for offline speech recognition
Open source AI VTuber platform with voice chat and Live2D avatars
Open Source Speech Language Model
Textream is a free macOS teleprompter app for streamers, interviewers
Lightning-fast, on-device TTS, running natively via ONNX
A TTS model capable of generating ultra-realistic dialogue
A single Gradio + React WebUI with extensions for ACE-Step
The media player for language learning, with dual subtitles
One-click deployment (including offline integration package)
C++ inference library for multiple SVC/TTS
Free & Easy AI Voice Accounting Software For Blind & Speechless People
Framework for building AI-powered interactive digital humans and agents
Interface for OuteTTS models
A Conversational Speech Generation Model