Search Results for "speech recognition machine learning"

63 projects for "speech recognition machine learning" with 1 filter applied: BSD

1

Hugging Face - Speech To Speech

Open speech-to-speech models and pipelines by Hugging Face

This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. ...

Downloads: 2 This Week

Last Update: 2026-03-18
2

Interactive Machine Learning Experiments

Interactive Machine Learning experiments

Interactive Machine Learning Experiments is a collection of interactive demonstrations that showcase how various machine learning models can be trained and used in real applications. The project combines Jupyter or Colab notebooks with browser-based visual demos that allow users to see trained models operating in real time. Many experiments involve tasks such as image classification, object detection, gesture recognition, and simple generative models. ...

Downloads: 0 This Week

Last Update: 2026-03-12
3

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification

...It also includes utilities for visualizing audio features and analyzing patterns within sound recordings, which can be useful in applications such as speech recognition, music classification, and acoustic event detection. Because the library integrates machine learning algorithms with signal processing tools, it enables researchers to develop complete audio analysis pipelines using a single framework.

Downloads: 0 This Week

Last Update: 2026-03-10
4

whisper.cpp

Port of OpenAI's Whisper model in C/C++

whisper.cpp is a lightweight C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model, designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp; the rest of the code is part of the ggml machine learning library. The project's example command (not shown in this excerpt) downloads the base.en model converted to a custom ggml format and runs inference on all .wav samples in the samples folder. whisper.cpp also supports integer quantization of the Whisper ggml models. ...

Downloads: 358 This Week

Last Update: 2026-03-19
5

SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow

SimpleHTR is an open-source implementation of a handwritten text recognition system based on deep learning techniques. The project focuses on converting images of handwritten text into machine-readable digital text using neural networks. The system combines convolutional neural networks and recurrent neural networks to extract visual features and model sequential character patterns in handwriting.

Downloads: 0 This Week

Last Update: 2026-03-12
6

SCAIL

Towards Studio-Grade Character Animation via In-Context Learning of 3D

...While specific documentation about SCAIL’s exact goals and implementation is limited from the repository context alone, the project appears to be part of a collection of machine learning and AI research tools that facilitate scalable model development, evaluation, or application workflows. Given its listing alongside other ZAI projects like speech recognition and text-to-speech systems, SCAIL likely emphasizes scalable, composable AI learning frameworks that support researchers and practitioners in experimenting with learning algorithms, datasets, and model components. ...

Downloads: 0 This Week

Last Update: 2026-01-30
7

face.evoLVe

High-Performance Face Recognition Library on PaddlePaddle & PyTorch

face.evoLVe is a high-performance face recognition library designed for research and real-world applications in computer vision. The project provides a comprehensive framework for building and training modern face recognition models using deep learning architectures. It includes components for face alignment, landmark localization, data preprocessing, and model training pipelines that allow developers to construct end-to-end facial recognition systems. The repository supports multiple neural...

Downloads: 0 This Week

Last Update: 2026-03-11
8

Advanced NLP with spaCy

Advanced NLP with spaCy: A free online course

Advanced NLP with spaCy is an open-source educational repository that provides the materials for an interactive course on advanced natural language processing using the spaCy library. The course is designed to teach developers how to build real-world NLP systems by combining rule-based techniques with machine learning models. The repository includes lessons, exercises, and examples that guide learners through tasks such as tokenization, named entity recognition, text classification, and training custom NLP models. It also demonstrates how spaCy pipelines work and how developers can extend them with custom components and training data. ...

Downloads: 0 This Week

Last Update: 2026-03-12
9

MediaPipe Solutions

Cross-platform, customizable ML solutions

MediaPipe is an open-source framework developed by Google for building cross-platform machine learning pipelines that process audio, video, and other streaming data in real time. The system provides developers with tools and reusable components that allow them to combine multiple machine learning models with preprocessing and postprocessing logic into efficient perception pipelines. These pipelines can run on a wide variety of platforms including mobile devices, desktop systems, web browsers, and embedded edge devices. ...

Downloads: 1 This Week

Last Update: 2026-03-15
10

ESPnet

End-to-end speech processing toolkit

ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes.

Downloads: 1 This Week

Last Update: 2026-04-07
11

Open Model Zoo

Pre-trained Deep Learning models and demos

Open Model Zoo is a large repository of high-quality pre-trained deep learning models and demonstration applications designed to work with the OpenVINO™ toolkit, offering a comprehensive starting point for a wide range of AI and computer vision workloads. It includes hundreds of models covering object detection, classification, segmentation, pose estimation, speech recognition, text-to-speech, and more, many of which are already converted into formats optimized for inference on CPUs, GPUs, VPUs, and other accelerators supported by OpenVINO. ...

Downloads: 1 This Week

Last Update: 2026-01-10
12

LLPlayer

The media player for language learning, with dual subtitles

LLPlayer is an open-source media player designed specifically for language learning through video content. Unlike traditional media players, the application focuses on advanced subtitle-related features that help learners understand and interact with foreign language media more effectively. The player supports dual subtitles so users can simultaneously view text in both the original language and their native language while watching videos. It can also automatically generate subtitles in real...

Downloads: 42 This Week

Last Update: 2026-03-05
13

Lingvo

Framework for building neural networks

...It has been used to implement state-of-the-art architectures such as recurrent neural networks, Transformer models, variational autoencoder hybrids, and multi-task systems. Lingvo includes reference models and configurations for domains like machine translation, automatic speech recognition, language modeling, image understanding, and 3D object detection. Centralized hyperparameter configuration files allow researchers to share exact experiment setups so others can retrain and compare results reliably.

Downloads: 0 This Week

Last Update: 2025-11-28
14

Text2Code for Jupyter notebook

A proof-of-concept Jupyter extension which converts English queries

The Text2Code for Jupyter notebook project is a proof-of-concept extension for Jupyter Notebook that allows users to generate Python code directly from natural language queries written in English. The tool is designed to simplify data analysis workflows by enabling users to describe their intended operation in plain language instead of manually writing code. When a user enters a textual command, the extension interprets the request and generates a corresponding Python code snippet that can be...

Downloads: 0 This Week

Last Update: 2026-03-12
15

docext

An on-premises, OCR-free unstructured data extraction

docext is a document intelligence toolkit that uses vision-language models to extract structured information from documents such as PDFs, forms, and scanned images. The system is designed to operate entirely on-premises, allowing organizations to process sensitive documents without relying on external cloud services. Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual...

Downloads: 2 This Week

Last Update: 2026-03-12
16

Scriberr

Self-hosted AI audio transcription

Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. The application includes a polished user interface that simplifies the management of recordings, transcripts, and annotations, making it suitable for both casual users and professionals handling large volumes of audio. ...

Downloads: 12 This Week

Last Update: 2026-03-19
17

PRML

PRML algorithms implemented in Python

The PRML repository is a respected and well-maintained project that implements the foundational algorithms from the textbook Pattern Recognition and Machine Learning by Christopher M. Bishop, providing a practical and accessible Python reference for both students and professionals. Rather than just summarizing concepts, the repository includes working code that demonstrates linear regression and classification, kernel methods, neural networks, graphical models, mixture models with EM algorithms, approximate inference, and sequential data methods, all following the book’s structure and notation. ...

Downloads: 0 This Week

Last Update: 2026-02-16
18

Armadillo

fast C++ library for linear algebra & scientific computing

* Fast C++ library for linear algebra (matrix maths) and scientific computing
* Easy to use functions and syntax, deliberately similar to Matlab / Octave
* Uses template meta-programming techniques to increase efficiency
* Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
* Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
* Downloads: http://arma.sourceforge.net/download.html
* Documentation: http://arma.sourceforge.net/docs.html
* Bug reports: http://arma.sourceforge.net/faq.html
* Git repo: https://gitlab.com/conradsnicta/armadillo-code

Downloads: 2,737 This Week

Last Update: 2 days ago
19

ARC-AGI

The Abstraction and Reasoning Corpus

...The dataset is structured as grid-based puzzles, where each task requires understanding transformations such as symmetry, counting, or spatial manipulation. Unlike traditional machine learning benchmarks, ARC emphasizes generalization and reasoning over statistical pattern recognition, making it particularly challenging for current AI systems. The repository also includes a browser-based interface that allows humans to attempt solving the tasks manually, providing a baseline for comparison.

Downloads: 1 This Week

Last Update: 2026-04-03
20

KubeEdge

Kubernetes Native Edge Computing Framework (project under CNCF)

...It consists of a cloud part and an edge part, and provides core infrastructure support for networking, application deployment, and metadata synchronization between the cloud and edge. It also supports MQTT which enables edge devices to access through edge nodes. With KubeEdge it is easy to get and deploy existing complicated machine learning, image recognition, event processing, and other high-level applications to the Edge. With business logic running at the Edge, much larger volumes of data can be secured & processed locally where the data is produced. With data processed at the Edge, the responsiveness is increased dramatically and data privacy is protected.

Downloads: 0 This Week

Last Update: 2026-03-11
21

Bandicoot

fast C++ library for GPU linear algebra & scientific computing

* Fast GPU linear algebra library (matrix maths) for the C++ language, aiming towards a good balance between speed and ease of use
* Provides high-level syntax and functionality deliberately similar to Matlab
* Provides an API that is aiming to be compatible with Armadillo for easy transition between CPU and GPU linear algebra code
* Useful for algorithm development directly in C++, or quick conversion of research code into production environments
* Distributed under the permissive Apache 2.0 license, useful for both open-source and proprietary (closed-source) software
* Can be used for machine learning, pattern recognition, computer vision, signal processing, bioinformatics, statistics, finance, etc.
* Downloads: http://coot.sourceforge.io/download.html
* Documentation: http://coot.sourceforge.io/docs.html
* Bug reports: http://coot.sourceforge.io/faq.html
* Git repo: https://gitlab.com/conradsnicta/bandicoot-code

Downloads: 5 This Week

Last Update: 2025-12-10
22

MediaPipe Face Detection

Detect faces in an image

The MediaPipe Face Detection model is a high-performance, real-time face detection solution that uses machine learning to identify faces in images and video streams. It is optimized for mobile and embedded platforms, offering fast and accurate face detection while maintaining a small memory footprint. This model supports multiple face detections and is highly efficient, making it suitable for a variety of applications such as augmented reality, user authentication, and facial expression analysis.

Downloads: 4 This Week

Last Update: 2025-03-19
23

pattern_classification

A collection of tutorials and examples for solving machine learning

The pattern_classification repository is an educational project that provides tutorials, examples, and reference materials related to machine learning and statistical pattern recognition. The project aims to help learners understand the process of building predictive models by presenting structured explanations and practical examples. It includes notebooks and guides that demonstrate data preprocessing, feature extraction, model training, and evaluation techniques used in machine learning workflows. ...

Downloads: 0 This Week

Last Update: 2026-03-11
24

Audio AI Timeline

A timeline of the latest AI models for audio generation

...Rather than functioning as a model training framework, it serves as an informational resource that maps key papers, systems, models, datasets, and milestones across areas such as speech synthesis, music generation, audio understanding, source separation, and general audio machine learning. The project helps users understand how major techniques and ideas evolved over time, making it especially useful for researchers, students, and practitioners who want a broad overview of the field without digging through scattered references. ...

Downloads: 0 This Week

Last Update: 2026-03-12
25

Paper-with-Code-of-Wireless-comm

Paper-with-Code-of-Wireless-communication-Based-on-DL

Paper-with-Code-of-Wireless-communication-Based-on-DL is a curated repository that collects research papers and corresponding code implementations related to the application of deep learning in wireless communication systems. The project aims to help researchers and graduate students quickly find reproducible implementations of algorithms used in modern communication research. Wireless communication research has increasingly adopted deep learning techniques to address complex tasks such as...

Downloads: 0 This Week

Last Update: 2026-03-12
© 2026 Slashdot Media. All Rights Reserved.
