Search Results for "transcribe audio to srt"

Showing 54 open source projects for "transcribe audio to srt"

  • 1
    FFsubsync

    Automagically synchronize subtitles with video

    ...Make sure ffmpeg is on your path and can be referenced from the command line! Next, grab the script. It should work with both Python 2 and Python 3. There may be occasions where you have a correctly synchronized srt file in a language you are unfamiliar with, as well as an unsynchronized srt file in your native language. In this case, you can use the correctly synchronized srt file directly as a reference for synchronization, instead of using the video as the reference. ffsubsync uses the file extension to decide whether to perform voice activity detection on the audio or to directly extract speech from an srt file. ffsubsync usually finishes in 20 to 30 seconds, depending on the length of the video.
    Downloads: 50 This Week
  • 2
    Transcripciones con Whisper (Transcriptions with Whisper). This web-based desktop application lets you transcribe, or transcribe and translate into English, audio or video files using OpenAI's Whisper model.
    Downloads: 1 This Week
  • 3
    Whishper

    Transcribe any audio to text, translate and edit subtitles 100% locally

    Open-source, local-first audio transcription and subtitling suite with a simple web UI. Thanks to open-source technologies, Whishper can run 100% offline. Your data never leaves your computer. Whishper allows you to translate your transcriptions to and from more than 60 languages thanks to Argos Translate and LibreTranslate. Download the transcriptions in many formats (json, txt, vtt, srt).
    Downloads: 21 This Week
  • 4
    stt

    Voice Recognition to Text Tool

    stt is a standalone speech recognition tool that locally converts spoken content in audio or video files into textual formats without requiring internet access, giving users control over their data and reducing reliance on external APIs. It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. ...
    Downloads: 6 This Week
  • 5
    Buzz

    Transcribe and translate audio offline on your personal computer

    Buzz transcribes and translates audio to text offline using OpenAI's Whisper. Import audio and video files into Buzz and export them as TXT, SRT, or VTT files. Buzz supports Whisper, Whisper.cpp, Faster Whisper, Whisper-compatible models from the Hugging Face repository, and the OpenAI Whisper API. Linux versions are available on Flathub (https://flathub.org/apps/io.github.chidiwilliams.Buzz) and Snapcraft (https://snapcraft.io/buzz). The Buzz home page is on GitHub (https://github.com/chidiwilliams/buzz). Note for Windows: the app is not signed, so you will get a warning when you install it. ...
    Downloads: 4,891 This Week
  • 6
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. ...
    Downloads: 18 This Week
  • 7
    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    ...It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common formats like MP3 or WAV. It also supports generating subtitle files (such as SRT or VTT) alongside the speech, which is handy for video narration, e-learning, or accessibility workflows. From the CLI you can adjust parameters such as speaking rate, volume, and pitch, giving you some control over prosody without diving into SSML. ...
    Downloads: 38 This Week
  • 8
    Handy STT

    A free, open source, and extensible speech-to-text application

    Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. Its backend leverages OpenAI’s Whisper models for GPU-accelerated speech recognition and Parakeet V3 for efficient CPU-only transcription with automatic language detection. ...
    Downloads: 124 This Week
  • 9
    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and then merges that speech back into the video, creating a fully localized media file. The tool supports both command-line and GUI modes, making it accessible to developers and creatives needing batch or automated processing.
    Downloads: 30 This Week
  • 10
    Translate-Subtitle-File

    Subtitle Creation Assistant

    Machine translation assistant for subtitle groups. Function 1: translate subtitle files (.srt, .ass, .vtt). Function 2: voice to text (drag in a video or audio file to recognize subtitles). Latest version: v4.1.0 (update time 2021 2 May 23). 12 translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, and Amazon. 6 voice service providers can be configured: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, and Amazon. Advantages: 1.
    Downloads: 3 This Week
  • 11
    Whisper-WebUI

    A Web UI for easy subtitle using whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription...
    Downloads: 23 This Week
  • 12
    BasedHardware

    Open source AI wearable platform for recording and summarizing speech

    ...It combines hardware, firmware, mobile applications, and backend services to create a complete ecosystem for voice-driven interaction. Users can connect the wearable device to a mobile phone and automatically record and transcribe meetings, conversations, and voice memos. Omi includes firmware for wearable hardware, a Flutter-based mobile companion application, backend services built with Python and FastAPI, and various SDKs for developers. These components work together to process audio, perform speech recognition, and integrate AI features such as summaries and automated actions. ...
    Downloads: 7 This Week
  • 13
    Groq TypeScript / Node.js

    The official Node.js / TypeScript library for the Groq API

    Groq TypeScript / Node.js (often referred to as "groq-sdk" on npm) is the official Node.js / TypeScript client library for Groq's REST API, enabling JavaScript/TypeScript developers to integrate LLM and AI-powered services into web backends, serverless functions, or frontend apps. It exports strongly typed interfaces for models, chat completions, file uploads (e.g. for audio transcription), and other endpoints, improving type safety and developer experience when using Groq from TypeScript. The library also accepts different input types (file streams, blobs, fetch responses) for media-related endpoints, making it flexible across environments (backend, browser, serverless). With this SDK, developers can call Groq's models, transcribe audio, and perform file uploads with minimal boilerplate, which streamlines building AI-enabled applications in the JavaScript/TypeScript ecosystem.
    Downloads: 7 This Week
  • 14
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    AutoSubs is an open-source, AI-powered subtitle generation tool that enables users to automatically transcribe audio and video content into accurate, editable subtitles directly on their device. It supports both standalone usage and integration with professional video editing software such as DaVinci Resolve, allowing creators to generate and edit subtitles within their existing workflows. The tool leverages speech-to-text models, including OpenAI Whisper, to produce high-quality transcriptions and can differentiate between speakers using diarization techniques. ...
    Downloads: 15 This Week
  • 15
    ChatGPT Telegram Bot

    A Telegram bot that integrates with OpenAI's official ChatGPT APIs

    A Telegram bot that integrates with OpenAI's official ChatGPT, DALL·E and Whisper APIs to provide answers. Ready to use with minimal configuration required.
    Downloads: 1 This Week
  • 16
    Insanely Fast Whisper

    An opinionated CLI to transcribe Audio files w/ Whisper on-device

    Insanely Fast Whisper is a high-performance command-line tool designed to dramatically accelerate speech-to-text transcription using OpenAI’s Whisper models on local hardware. It leverages modern optimizations such as batch processing, mixed precision, and advanced attention mechanisms like Flash Attention to significantly reduce inference time while maintaining high transcription accuracy. The project is built on top of the Transformers ecosystem and integrates with libraries such as...
    Downloads: 0 This Week
  • 17
    Amical

    Open Source AI Dictation App

    Amical is an open source, AI-powered desktop dictation and note-taking application that enables users to dictate hands-free, transcribe meetings, and capture notes effortlessly with unmatched speed, accuracy, and privacy. It leverages both local and cloud-based AI models, letting users seamlessly switch between providers for the ideal balance of speed, precision, and control, and understands the context of each app in use to automatically format text in a tone and style appropriate to the...
    Downloads: 13 This Week
  • 18
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken segment and synthesizes audio via neural TTS services, producing one audio clip per subtitle entry. ...
    Downloads: 0 This Week
  • 19
    Bootleg Text Slicer

    Text transcription & slicing tool with visual timeline and WAV output.

    - Transcribe an audio file into individual words.
    - Display and interact with each word's start and end positions on a timeline or within the "Review Dashboard."
    - Adjust timing offsets for the beginning and end of each word either globally or individually.
    - Play full audio or specific words directly from within the app.
    - Export words as separate `.wav` audio files.
    Downloads: 1 This Week
  • 20

    SoundTranscriber

    SoundTranscriber can be used to generate automatic transcription / aut
    Downloads: 4 This Week
  • 21
    Whisper Batch Transcriber

    Unlimited, private and free Speech-To-Text program

    About: Automatically transcribe all of your voice recordings into clean, organized text files. It is free, fully automated, and unlimited, and uses state-of-the-art speech-to-text technology. It works 100% offline on your computer, privately and locally. Use cases: Convert speeches, podcasts, webinars, monologues, storytelling, and other spoken audio into a formatted .txt file.
    Downloads: 14 This Week
  • 22
    blackvideo-mini-player

    A standalone lightweight auxiliary CLI video player for BlackVideo.

    Lightweight cross-platform video player (Ada + SDL2 + FFmpeg). Support player for the BlackVideo. Works standalone via CLI or right-click on any video file. Usage Method 1 — Command Line Step 1. Unzip blackvideo-mini-player-v2.3.0.win.zip Step 2. Open the build\ folder, then type cmd directly in the address bar and press Enter — this opens a terminal already in that folder. Alternatively: open Command Prompt anywhere and use cd with the copied path: cd...
    Downloads: 4 This Week
  • 23
    Quick Subtitles

    HTML5 Based Subtitle Creation Tool

    Quick Subtitles is an HTML5-based solution for rapid creation and syncing of subtitles while playing your video. It is designed around the concept that you should minimize the need to take your hands off the keyboard while performing data entry.
    Downloads: 5 This Week
  • 24
    VATSG

    Video automatic transcribe and translated subtitle generator

    It generates an SRT subtitle file from a video file in any source language that Whisper supports, and then creates a translated subtitle file in any target language that DeepL supports. VATSG uses [moviepy](https://github.com/Zulko/moviepy) to generate an MP3, then [faster-whisper](https://github.com/guillaumekln/faster-whisper) for text recognition, and then the DeepL API to generate the subtitle file (SRT format) in your target language. If you...
    Downloads: 4 This Week
  • 25
    OpenAI Web Application

    A web application that allows users to interact with OpenAI's models

    ...Take advantage of DALL·E models to generate AI images. Utilize Whisper Model to transcribe audio into text.
    Downloads: 0 This Week