TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
PersonaPlex code
Fast multimodal LLM for real-time voice interaction and AI apps
One-click deployment (including offline integration package)
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Automated Music Discovery and Collection Manager
Generate audiobooks from e-books, voice cloning & 1107+ languages
Translate a video from one language to another and embed dubbing
Large language model for a livestream sales anchor
An Open Source implementation of NotebookLM with more flexibility
Generate high-definition story short videos with one click using AI
Code and models for ICML 2024 paper, NExT-GPT
A Python tool that uses GPT-4, FFmpeg, and OpenCV
AudioMuse-AI is an Open Source Dockerized environment
A Web UI for easy subtitle generation using the Whisper model
Official MiniMax Model Context Protocol (MCP) server
Toolkit for audio, music, and speech generation
Towards Human-Sounding Speech
Speakr is a personal, self-hosted web application
Multi-lingual large voice generation model, providing inference
A lightweight text-to-speech model with zero-shot voice cloning
Instill Core is a full-stack AI infrastructure tool for data
Multimodal-Driven Architecture for Customized Video Generation
Interface for OuteTTS models
Capable of understanding text, audio, vision, video