python music free download

Showing 9 open source projects for "python music"

View related business solutions

AI Models Python Clear Filters & Widen Search

Comet Backup - Fast, Secure Backup Software for MSPs
Fast, Secure Backup Software for Businesses and IT Providers

Comet is a flexible backup platform, giving you total control over your backup environment and storage destinations.

Learn More
The AI-powered unified PSA-RMM platform for modern MSPs.
Trusted PSA-RMM partner of MSPs worldwide

SuperOps.ai is the only PSA-RMM platform powered by intelligent automation and thoughtfully crafted for the new-age MSP. The platform also helps MSPs manage their projects, clients, and IT documents from a single place.

Learn More
1

ACE-Step 1.5

The most powerful local music generation model

ACE-Step 1.5 is an advanced open-source foundation model for AI-driven music generation that pushes beyond traditional limitations in speed, musical coherence, and controllability by innovating in architecture and training design. It integrates cutting-edge generative techniques—such as diffusion-based synthesis combined with compressed autoencoders and lightweight transformer elements—to produce high-quality full-length music tracks with rapid inference times, capable of generating a...

Downloads: 80 This Week

Last Update: 2026-04-02
See Project
2

HeartMuLa

A Family of Open Sourced Music Foundation Models

HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, designed to support both music generation and music-related understanding tasks in a cohesive stack. At the center is HeartMuLa, a music language model that generates music conditioned on inputs like lyrics and tags, with multilingual support that broadens the range of lyric-driven use cases. The project also includes HeartCodec, a music codec optimized for high...

Downloads: 13 This Week

Last Update: 2026-04-10
See Project
3

LTX-2.3

Official Python inference and LoRA trainer package

LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...

Downloads: 189 This Week

Last Update: 2026-04-13
See Project
4

HunyuanVideo-Foley

Multimodal Diffusion with Representation Alignment

HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional...

Downloads: 3 This Week

Last Update: 2025-09-28
See Project
Network Management Software and Tools for Businesses and Organizations | Auvik Networks
Mapping, inventory, config backup, and more.

Reduce IT headaches and save time with a proven solution for automated network discovery, documentation, and performance monitoring. Choose Auvik because you'll see value in minutes, and stay with us to improve your IT for years to come.

Learn More
5

Qwen-Audio

Chat & pretrained large audio language model proposed by Alibaba Cloud

Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and output text. There is also an instruction-tuned version called Qwen-Audio-Chat which supports conversational interaction (multi-round), audio + text input, creative tasks and reasoning over audio. It uses multi-task training over many different audio tasks (30+), and achieves strong multi-benchmarks performance...

Downloads: 1 This Week

Last Update: 2025-09-23
See Project
6

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one...

Downloads: 2 This Week

Last Update: 2026-01-27
See Project
7

Qwen-2.5-VL

Qwen2.5-VL is the multimodal large language model series

Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation...

Downloads: 5 This Week

Last Update: 2026-01-30
See Project
8

DiffRhythm

Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation

DiffRhythm is an open-source, diffusion-based model designed to generate full-length songs. Focused on music creation, it combines advanced AI techniques to produce coherent and creative audio compositions. The model utilizes a latent diffusion architecture, making it capable of producing high-quality, long-form music. It can be accessed on Huggingface, where users can interact with a demo or download the model for further use. DiffRhythm offers tools for both training and inference, and its...

1 Review

Downloads: 5 This Week

Last Update: 2025-03-06
See Project
9

Demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Demucs (Deep Extractor for Music Sources) is a deep-learning framework for music source separation—extracting individual instrument or vocal tracks from a mixed audio file. The system is based on a U-Net-like convolutional architecture combined with recurrent and transformer elements to capture both short-term and long-term temporal structure. It processes raw waveforms directly rather than spectrograms, allowing for higher-quality reconstruction and fewer artifacts in separated tracks. The...

Downloads: 102 This Week

Last Update: 2025-10-12
See Project
Run applications fast and securely in a fully managed environment
Cloud Run is a fully-managed compute platform that lets you run your code in a container directly on top of scalable infrastructure.

Run frontend and backend services, batch jobs, deploy websites and applications, and queue processing workloads without the need to manage infrastructure.

Try for free