Search Results for "audio source separation"

3798 projects for "audio source separation" with 1 filter applied:

  • 1
    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...
    Downloads: 1 This Week
  • 2
    Fun Audio Chat

    Large Audio Language Model built for natural interactions

    Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. The system...
    Downloads: 0 This Week
  • 3
    Whisper-WebUI

    A Web UI for easy subtitle generation using the Whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools.
    Downloads: 18 This Week
  • 4
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all...
    Downloads: 11 This Week
  • 5
    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. ...
    Downloads: 83 This Week
  • 6
    Caliban

    Functional GraphQL library for Scala

    Caliban is a purely functional library for building GraphQL servers and clients in Scala. The design principles behind the library are the following. Minimal amount of boilerplate: no need to manually define a schema for every type in your API. Pure interface: errors and effects are returned explicitly (no exceptions thrown), all returned types are referentially transparent (no Future). Clean separation between schema definition and implementation: schema is defined and validated at compile...
    Downloads: 4 This Week
  • 7
    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes.
    Downloads: 138 This Week
  • 8
    Writer Framework

    No-code in the front, Python in the back. An open-source framework

    Writer Framework is an open source platform designed to help developers build AI-powered applications by combining a visual interface builder with a Python-based backend architecture. It follows a hybrid approach where user interfaces are created using a drag-and-drop editor while business logic is implemented in Python, allowing teams to balance speed and flexibility without sacrificing control.
    Downloads: 1 This Week
  • 9
    Shairport Sync

    AirPlay audio player

    Shairport Sync adds multi-room capability with audio synchronization. Shairport Sync is an AirPlay 1 audio player. Switch to the development branch for a version with limited AirPlay 2 functionality. Shairport Sync plays audio streamed from iTunes, iOS, Apple TV and macOS devices and AirPlay sources such as QuickTime Player and OwnTone, among others. Audio played by a Shairport Sync-powered device stays synchronized with the source and hence with similar devices playing the same source. ...
    Downloads: 5 This Week
  • 10
    Laravel Framework

    Elegant PHP web application framework with expressive syntax

    Laravel is a popular open-source PHP web framework designed for building modern web applications with an elegant and readable syntax. It emphasizes developer productivity, offering features like routing, Eloquent ORM, blade templating, and an extensive ecosystem of packages. Laravel makes common tasks like authentication, caching, and session management simple and intuitive, making it a top choice for PHP developers.
    Downloads: 12 This Week
  • 11
    Pothos GraphQL

    Pothos GraphQL is a library for creating GraphQL schemas in TypeScript

    Pothos is a plugin-based GraphQL schema builder for TypeScript. It makes building GraphQL schemas in TypeScript easy, fast and enjoyable. The core of Pothos adds 0 overhead at runtime, and has graphql as its only dependency. Pothos is the most type-safe way to build GraphQL schemas in TypeScript, and by leveraging type inference and TypeScript's powerful type system Pothos requires very few manual type definitions and no code generation. Pothos has a unique and powerful plugin system that...
    Downloads: 4 This Week
  • 12
    frame.js

    JavaScript Sequence Editor

    frame.js is a tiny utility for orchestrating frame-based animations with requestAnimationFrame while keeping code clean and predictable. It abstracts the boilerplate of setting up a render loop, tracking elapsed time, and updating callbacks at the right cadence. By providing a simple lifecycle—start, stop, tick—it encourages separation between state updates and rendering, which is essential for smooth visuals. The library aims to be unobtrusive: you can drop it into demos or prototypes...
    Downloads: 0 This Week
  • 13
    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...
    Downloads: 6 This Week
  • 14
    MusicPlayer2

    Audio player that can play common audio formats

    MusicPlayer2 is a simple music-player application (or prototype), presumably implemented for a web or desktop environment, intended to give users a clean, functional interface for managing and playing audio files. The project likely implements basic playlist management, playback controls (play, pause, skip), and possibly UI features to browse or organize music. Because many smaller music-player projects aim for simplicity, MusicPlayer2 may focus on providing a lightweight, minimal-dependency audio player compared to larger, heavy multimedia suites. ...
    Downloads: 5 This Week
  • 15
    HLS.js

    HLS.js is a JavaScript library that plays HLS in browsers

    HLS.js is a JavaScript library that implements an HTTP Live Streaming client. It relies on HTML5 video and MediaSource Extensions for playback. It works by transmuxing MPEG-2 Transport Stream and AAC/MP3 streams into ISO BMFF (MP4) fragments. Transmuxing is performed asynchronously using a Web Worker when available in the browser. HLS.js also supports HLS + fmp4, as announced during WWDC2016. HLS.js works directly on top of a standard HTML <video> element. HLS.js is written in ECMAScript 6...
    Downloads: 19 This Week
  • 16
    HeartMuLa

    A Family of Open Sourced Music Foundation Models

    ...The project also includes HeartCodec, a music codec optimized for high reconstruction fidelity, enabling efficient tokenization and reconstruction workflows that are critical for training and generation pipelines. For text extraction from audio, it provides HeartTranscriptor, a Whisper-based model tuned specifically for lyrics transcription, which helps bridge generated or recorded audio back into structured text. It also introduces HeartCLAP, which aligns audio and text into a shared embedding space.
    Downloads: 16 This Week
  • 17
    AudioNotes

    Extract audio and video content and organize it into a Markdown note

    ...As an open-source repository, AudioNotes provides developers or power users the opportunity to customize how audio is captured, stored, annotated, and replayed — e.g. adding playback speed control, export to standard formats, or synchronization between notes and audio timeline. It may support simple UI for starting/stopping recordings, writing or editing notes, and navigating through recorded sessions.
    Downloads: 2 This Week
  • 18
    NovaSR

    A lightning fast audio upsampler

    NovaSR is an extremely lightweight and high-performance audio upsampling model that transforms low-quality 16 kHz audio into clearer, high-fidelity 48 kHz audio with remarkable speed and efficiency. At only about 50 KB in size, the model is orders of magnitude smaller than typical audio super-resolution networks, yet it achieves high quality and realtime performance thanks to its compact architecture and efficient convolutional design. NovaSR is especially valuable for post-processing tasks...
    Downloads: 2 This Week
  • 19
    Oboe

    Oboe is a C++ library that makes it easy to build high-performance

    Oboe is a C++ library for building high-performance audio apps on Android, providing a unified, low-latency API over AAudio and OpenSL ES. It abstracts device and API-version differences so developers can focus on audio processing instead of platform quirks. The library emphasizes minimal latency and glitch-free playback/recording via tuned buffer strategies and callback-driven I/O. It supports features like floating-point audio, channel configuration, sample-rate negotiation, and stream...
    Downloads: 7 This Week
  • 20
    Spring Integration

    Provides an extension of the Spring programming model

    Extends the Spring programming model to support the well-known Enterprise Integration Patterns. Spring Integration enables lightweight messaging within Spring-based applications and supports integration with external systems via declarative adapters. Those adapters provide a higher level of abstraction over Spring’s support for remoting, messaging, and scheduling. Spring Integration’s primary goal is to provide a simple model for building enterprise integration solutions while maintaining...
    Downloads: 8 This Week
  • 21
    VoxCPM2

    Tokenizer-Free TTS for Multilingual Speech Generation

    ...The system is trained on massive multilingual datasets, enabling support for dozens of languages and dialects while maintaining high fidelity and realism in generated audio. VoxCPM stands out for its ability to perform voice cloning with minimal input, capturing not only the speaker’s timbre but also nuanced features such as rhythm, accent, and emotional delivery. It also introduces voice design capabilities, allowing users to generate entirely new voices from natural language descriptions without requiring reference audio.
    Downloads: 12 This Week
  • 22
    SFBAudioEngine

    A powerhouse of audio functionality for macOS, iOS, and tvOS

    SFBAudioEngine is an advanced audio engine designed for macOS and iOS, focusing on high-quality playback, precise audio control, and support for a wide range of audio formats. Built for modern Apple platforms, it provides developers with a robust tool for integrating sophisticated audio functionalities into their applications. It emphasizes extensibility, performance, and clean API design.
    Downloads: 0 This Week
  • 23
    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. ...
    Downloads: 20 This Week
  • 24
    pyAudioAnalysis

    Python Audio Analysis Library: Feature Extraction, Classification

    pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. ...
    Downloads: 0 This Week
  • 25
    EeveeSpotify

    A tweak to enhance Spotify experience

    EeveeSpotifyReborn is an unofficial modification for the Spotify mobile application that alters client-side behavior to unlock premium-like features without requiring a paid subscription. It operates by injecting changes into the Spotify app, making it interpret the user account as having premium access and enabling functionalities that are normally restricted. The project was developed through reverse engineering techniques, including analyzing application behavior and intercepting requests...
    Downloads: 57 This Week