Search Results for "audio source separation"

3798 projects for "audio source separation" with 1 filter applied: BSD

1

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...

Downloads: 1 This Week

Last Update: 2026-01-27
2

Fun Audio Chat

Large Audio Language Model built for natural interactions

Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. The system...

Downloads: 0 This Week

Last Update: 2026-02-27
3

Whisper-WebUI

A web UI for easy subtitle generation using the Whisper model

Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools.

Downloads: 18 This Week

Last Update: 2026-03-18
4

TTS WebUI

A single Gradio + React WebUI with extensions for ACE-Step

TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all...

Downloads: 11 This Week

Last Update: 2026-04-05
5

NeuralNote

Audio Plugin for Audio to MIDI transcription using deep learning

NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. ...

Downloads: 83 This Week

Last Update: 2026-03-12
6

Caliban

Functional GraphQL library for Scala

Caliban is a purely functional library for building GraphQL servers and clients in Scala. The design principles behind the library are the following. Minimal amount of boilerplate: no need to manually define a schema for every type in your API. Pure interface: errors and effects are returned explicitly (no exceptions thrown), all returned types are referentially transparent (no Future). Clean separation between schema definition and implementation: schema is defined and validated at compile...

Downloads: 4 This Week

Last Update: 2026-01-11
7

LTX-2.3

Official Python inference and LoRA trainer package

LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes.

Downloads: 138 This Week

Last Update: 2026-03-30
8

Writer Framework

No-code in the front, Python in the back. An open-source framework

Writer Framework is an open source platform designed to help developers build AI-powered applications by combining a visual interface builder with a Python-based backend architecture. It follows a hybrid approach where user interfaces are created using a drag-and-drop editor while business logic is implemented in Python, allowing teams to balance speed and flexibility without sacrificing control.

Downloads: 1 This Week

Last Update: 2026-04-09
9

Shairport Sync

AirPlay audio player

Shairport Sync is an AirPlay 1 audio player that adds multi-room capability with audio synchronization. Switch to the development branch for a version with limited AirPlay 2 functionality. Shairport Sync plays audio streamed from iTunes, iOS, Apple TV and macOS devices and AirPlay sources such as QuickTime Player and OwnTone, among others. Audio played by a Shairport Sync-powered device stays synchronized with the source and hence with similar devices playing the same source. ...

Downloads: 5 This Week

Last Update: 2026-03-28
10

Laravel Framework

Elegant PHP web application framework with expressive syntax

Laravel is a popular open-source PHP web framework designed for building modern web applications with an elegant and readable syntax. It emphasizes developer productivity, offering features like routing, Eloquent ORM, Blade templating, and an extensive ecosystem of packages. Laravel makes common tasks like authentication, caching, and session management simple and intuitive, making it a top choice for PHP developers.

Downloads: 12 This Week

Last Update: 3 days ago
11

Pothos GraphQL

Pothos GraphQL is a library for creating GraphQL schemas in TypeScript

Pothos is a plugin-based GraphQL schema builder for TypeScript. It makes building GraphQL schemas in TypeScript easy, fast and enjoyable. The core of Pothos adds 0 overhead at runtime, and has graphql as its only dependency. Pothos is the most type-safe way to build GraphQL schemas in TypeScript, and by leveraging type inference and TypeScript's powerful type system Pothos requires very few manual type definitions and no code generation. Pothos has a unique and powerful plugin system that...

Downloads: 4 This Week

Last Update: 3 days ago
12

frame.js

JavaScript Sequence Editor

frame.js is a tiny utility for orchestrating frame-based animations with requestAnimationFrame while keeping code clean and predictable. It abstracts the boilerplate of setting up a render loop, tracking elapsed time, and updating callbacks at the right cadence. By providing a simple lifecycle—start, stop, tick—it encourages separation between state updates and rendering, which is essential for smooth visuals. The library aims to be unobtrusive: you can drop it into demos or prototypes...

Downloads: 0 This Week

Last Update: 2025-10-24
13

AudioCraft

Audiocraft is a library for audio processing and generation

AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...

Downloads: 6 This Week

Last Update: 2025-10-13
14

MusicPlayer2

Audio player that can play common audio formats

MusicPlayer2 is a simple music player application intended to give users a clean, functional interface for managing and playing audio files. The project likely implements basic playlist management, playback controls (play, pause, skip), and possibly UI features to browse or organize music. Because many smaller music-player projects aim for simplicity, MusicPlayer2 may focus on providing a lightweight, minimal-dependency audio player compared to larger, heavy multimedia suites. ...

Downloads: 5 This Week

Last Update: 2025-12-27
15

HLS.js

HLS.js is a JavaScript library that plays HLS in browsers

HLS.js is a JavaScript library that implements an HTTP Live Streaming client. It relies on HTML5 video and MediaSource Extensions for playback. It works by transmuxing MPEG-2 Transport Stream and AAC/MP3 streams into ISO BMFF (MP4) fragments. Transmuxing is performed asynchronously using a Web Worker when available in the browser. HLS.js also supports HLS + fmp4, as announced during WWDC2016. HLS.js works directly on top of a standard HTML <video> element. HLS.js is written in ECMAScript6...

Downloads: 19 This Week

Last Update: 4 days ago
16

HeartMuLa

A Family of Open Sourced Music Foundation Models

...The project also includes HeartCodec, a music codec optimized for high reconstruction fidelity, enabling efficient tokenization and reconstruction workflows that are critical for training and generation pipelines. For text extraction from audio, it provides HeartTranscriptor, a Whisper-based model tuned specifically for lyrics transcription, which helps bridge generated or recorded audio back into structured text. It also introduces HeartCLAP, which aligns audio and text into a shared embedding space.

Downloads: 16 This Week

Last Update: 2026-04-10
17

AudioNotes

Extract audio and video content and organize it into a Markdown note

...As an open-source repository, AudioNotes provides developers or power users the opportunity to customize how audio is captured, stored, annotated, and replayed — e.g. adding playback speed control, export to standard formats, or synchronization between notes and audio timeline. It may support simple UI for starting/stopping recordings, writing or editing notes, and navigating through recorded sessions.

Downloads: 2 This Week

Last Update: 2025-12-04
18

NovaSR

A lightning fast audio upsampler

NovaSR is an extremely lightweight and high-performance audio upsampling model that transforms low-quality 16 kHz audio into clearer, high-fidelity 48 kHz audio with remarkable speed and efficiency. At only about 50 KB in size, the model is orders of magnitude smaller than typical audio super-resolution networks, yet it achieves high quality and realtime performance thanks to its compact architecture and efficient convolutional design. NovaSR is especially valuable for post-processing tasks...

Downloads: 2 This Week

Last Update: 2026-02-26
19

Oboe

Oboe is a C++ library that makes it easy to build high-performance

Oboe is a C++ library for building high-performance audio apps on Android, providing a unified, low-latency API over AAudio and OpenSL ES. It abstracts device and API-version differences so developers can focus on audio processing instead of platform quirks. The library emphasizes minimal latency and glitch-free playback/recording via tuned buffer strategies and callback-driven I/O. It supports features like floating-point audio, channel configuration, sample-rate negotiation, and stream...

Downloads: 7 This Week

Last Update: 2025-10-09
20

Spring Integration

Provides an extension of the Spring programming model

Extends the Spring programming model to support the well-known Enterprise Integration Patterns. Spring Integration enables lightweight messaging within Spring-based applications and supports integration with external systems via declarative adapters. Those adapters provide a higher level of abstraction over Spring’s support for remoting, messaging, and scheduling. Spring Integration’s primary goal is to provide a simple model for building enterprise integration solutions while maintaining...

Downloads: 8 This Week

Last Update: 2026-03-17
21

VoxCPM2

Tokenizer-Free TTS for Multilingual Speech Generation

...The system is trained on massive multilingual datasets, enabling support for dozens of languages and dialects while maintaining high fidelity and realism in generated audio. VoxCPM stands out for its ability to perform voice cloning with minimal input, capturing not only the speaker’s timbre but also nuanced features such as rhythm, accent, and emotional delivery. It also introduces voice design capabilities, allowing users to generate entirely new voices from natural language descriptions without requiring reference audio.

Downloads: 12 This Week

Last Update: 4 days ago
22

SFBAudioEngine

A powerhouse of audio functionality for macOS, iOS, and tvOS

SFBAudioEngine is an advanced audio engine designed for macOS and iOS, focusing on high-quality playback, precise audio control, and support for a wide range of audio formats. Built for modern Apple platforms, it provides developers with a robust tool for integrating sophisticated audio functionalities into their applications. It emphasizes extensibility, performance, and clean API design.

Downloads: 0 This Week

Last Update: 2026-02-27
23

WhisperJAV

Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. ...

Downloads: 20 This Week

Last Update: 2026-04-09
24

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification

pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. ...

Downloads: 0 This Week

Last Update: 2026-03-10
25

EeveeSpotify

A tweak to enhance Spotify experience

EeveeSpotifyReborn is an unofficial modification for the Spotify mobile application that alters client-side behavior to unlock premium-like features without requiring a paid subscription. It operates by injecting changes into the Spotify app, making it interpret the user account as having premium access and enabling functionalities that are normally restricted. The project was developed through reverse engineering techniques, including analyzing application behavior and intercepting requests...

Downloads: 57 This Week

Last Update: 2026-03-23