Search Results for "speech recognition machine learning"

63 projects for "speech recognition machine learning" with 1 filter applied:

  • Get full visibility and control over your tasks and projects with Wrike. Icon
    Get full visibility and control over your tasks and projects with Wrike.

    A cloud-based collaboration, work management, and project management software

    Wrike offers world-class features that empower cross-functional, distributed, or growing teams take their projects from the initial request stage all the way to tracking work progress and reporting results.
    Learn More
  • Create stunning, professional email signatures in minutes Icon
    Create stunning, professional email signatures in minutes

    For companies looking to create, assign and manage all their employees email signatures and add targeted marketing banners.

    Create, assign and manage all your employees’ email signatures and add targeted marketing banners. Stop getting worked up about your signatures! Leverage a centralized interface to easily create and manage the email signatures of all your employees. Take advantage of each email to broadcast and amplify your brand. Letsignit helps you regain control over your digital identity. Harmonize 100% of your employee’s email signatures in just a few clicks! 121 professional emails are received and 40 are sent every day by an employee. With Letsignit, turn every email into a powerful communication opportunity: send the right message to the right person at the right time! Innovative more than tech, inspiring more than following. Authentic more than overrated, close more than "think big", trustworthy more than doubtful. Hands-on more than complex, available but yet premium, fun but yet expert.
    Learn More
  • 1
    Hugging Face - Speech To Speech

    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines by Hugging Face toolkit AI

    This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    Interactive Machine Learning Experiments

    Interactive Machine Learning Experiments

    Interactive Machine Learning experiments

    Interactive Machine Learning Experiments is a collection of interactive demonstrations that showcase how various machine learning models can be trained and used in real applications. The project combines Jupyter or Colab notebooks with browser-based visual demos that allow users to see trained models operating in real time. Many experiments involve tasks such as image classification, object detection, gesture recognition, and simple generative models. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    pyAudioAnalysis

    pyAudioAnalysis

    Python Audio Analysis Library: Feature Extraction, Classification

    ...It also includes utilities for visualizing audio features and analyzing patterns within sound recordings, which can be useful in applications such as speech recognition, music classification, and acoustic event detection. Because the library integrates machine learning algorithms with signal processing tools, it enables researchers to develop complete audio analysis pipelines using a single framework.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    whisper.cpp

    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    whisper.cpp is a lightweight, C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model—designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp. The rest of the code is part of the ggml machine learning library. The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples. whisper.cpp supports integer quantization of the Whisper ggml models. ...
    Downloads: 358 This Week
    Last Update:
    See Project
  • Intelligent testing agents | Checksum.ai Icon
    Intelligent testing agents | Checksum.ai

    Checksum generates, runs, and maintains end-to-end tests automatically so your team ships with confidence as code output grows.

    Coding agents write the code. Checksum runs it—continuously testing against real APIs, real data, real edge cases—before it ever reaches production.
    Learn More
  • 5
    SimpleHTR

    SimpleHTR

    Handwritten Text Recognition (HTR) system implemented with TensorFlow

    SimpleHTR is an open-source implementation of a handwriting text recognition system based on deep learning techniques. The project focuses on converting images of handwritten text into machine-readable digital text using neural networks. The system uses a combination of convolutional neural networks and recurrent neural networks to extract visual features and model sequential character patterns in handwriting.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    SCAIL

    SCAIL

    Towards Studio-Grade Character Animation via In-Context Learning of 3D

    ...While specific documentation about SCAIL’s exact goals and implementation is limited from the repository context alone, the project appears to be part of a collection of machine learning and AI research tools that facilitate scalable model development, evaluation, or application workflows. Given its listing alongside other ZAI projects like speech recognition and text-to-speech systems, SCAIL likely emphasizes scalable, composable AI learning frameworks that support researchers and practitioners in experimenting with learning algorithms, datasets, and model components. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    face.evoLVe

    face.evoLVe

    High-Performance Face Recognition Library on PaddlePaddle & PyTorch

    face.evoLVe is a high-performance face recognition library designed for research and real-world applications in computer vision. The project provides a comprehensive framework for building and training modern face recognition models using deep learning architectures. It includes components for face alignment, landmark localization, data preprocessing, and model training pipelines that allow developers to construct end-to-end facial recognition systems. The repository supports multiple neural...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Advanced NLP with spaCy

    Advanced NLP with spaCy

    Advanced NLP with spaCy: A free online course

    Advanced NLP with spaCy is an open-source educational repository that provides the materials for an interactive course on advanced natural language processing using the spaCy library. The course is designed to teach developers how to build real-world NLP systems by combining rule-based techniques with machine learning models. The repository includes lessons, exercises, and examples that guide learners through tasks such as tokenization, named entity recognition, text classification, and training custom NLP models. It also demonstrates how spaCy pipelines work and how developers can extend them with custom components and training data. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    MediaPipe Solutions

    MediaPipe Solutions

    Cross-platform, customizable ML solutions

    MediaPipe is an open-source framework developed by Google for building cross-platform machine learning pipelines that process audio, video, and other streaming data in real time. The system provides developers with tools and reusable components that allow them to combine multiple machine learning models with preprocessing and postprocessing logic into efficient perception pipelines. These pipelines can run on a wide variety of platforms including mobile devices, desktop systems, web browsers, and embedded edge devices. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Agentic AI SRE built for Engineering and DevOps teams. Icon
    Agentic AI SRE built for Engineering and DevOps teams.

    No More Time Lost to Troubleshooting

    NeuBird AI's agentic AI SRE delivers autonomous incident resolution, helping team cut MTTR up to 90% and reclaim engineering hours lost to troubleshooting.
    Learn More
  • 10
    ESPnet

    ESPnet

    End-to-end speech processing toolkit

    ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Open Model Zoo

    Open Model Zoo

    Pre-trained Deep Learning models and demos

    Open Model Zoo is a large repository of high-quality pre-trained deep learning models and demonstration applications designed to work with the OpenVINO™ toolkit, offering a comprehensive starting point for a wide range of AI and computer vision workloads. It includes hundreds of models covering object detection, classification, segmentation, pose estimation, speech recognition, text-to-speech, and more, many of which are already converted into formats optimized for inference on CPUs, GPUs, VPUs, and other accelerators supported by OpenVINO. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    LLPlayer

    LLPlayer

    The media player for language learning, with dual subtitles

    LLPlayer is an open-source media player designed specifically for language learning through video content. Unlike traditional media players, the application focuses on advanced subtitle-related features that help learners understand and interact with foreign language media more effectively. The player supports dual subtitles so users can simultaneously view text in both the original language and their native language while watching videos. It can also automatically generate subtitles in real...
    Downloads: 42 This Week
    Last Update:
    See Project
  • 13
    Lingvo

    Lingvo

    Framework for building neural networks

    ...It has been used to implement state of the art architectures such as recurrent neural networks, Transformer models, variational autoencoder hybrids, and multi task systems. Lingvo includes reference models and configurations for domains like machine translation, automatic speech recognition, language modeling, image understanding, and 3D object detection. Centralized hyperparameter configuration files allow researchers to share exact experiment setups so others can retrain and compare results reliably.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Text2Code for Jupyter notebook

    Text2Code for Jupyter notebook

    A proof-of-concept jupyter extension which converts english queries

    Text2Code for Jupyter notebook project is a proof-of-concept extension for Jupyter Notebook that allows users to generate Python code directly from natural language queries written in English. The tool is designed to simplify data analysis workflows by enabling users to describe their intended operation in plain language instead of manually writing code. When a user enters a textual command, the extension interprets the request and generates a corresponding Python code snippet that can be...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    docext

    docext

    An on-premises, OCR-free unstructured data extraction

    docext is a document intelligence toolkit that uses vision-language models to extract structured information from documents such as PDFs, forms, and scanned images. The system is designed to operate entirely on-premises, allowing organizations to process sensitive documents without relying on external cloud services. Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    Scriberr

    Scriberr

    Self-hosted AI audio transcription

    Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. The application includes a polished user interface that simplifies the management of recordings, transcripts, and annotations, making it suitable for both casual users and professionals handling large volumes of audio. ...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 17
    PRML

    PRML

    PRML algorithms implemented in Python

    PRML repository is a respected and well-maintained project that implements the foundational algorithms from the famous textbook Pattern Recognition and Machine Learning by Christopher M. Bishop, providing a practical and accessible Python reference for both students and professionals. Rather than just summarizing concepts, the repository includes working code that demonstrates linear regression and classification, kernel methods, neural networks, graphical models, mixture models with EM algorithms, approximate inference, and sequential data methods — all following the book’s structure and notation. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Armadillo

    Armadillo

    fast C++ library for linear algebra & scientific computing

    * Fast C++ library for linear algebra (matrix maths) and scientific computing * Easy to use functions and syntax, deliberately similar to Matlab / Octave * Uses template meta-programming techniques to increase efficiency * Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries * Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. * Downloads: http://arma.sourceforge.net/download.html * Documentation: http://arma.sourceforge.net/docs.html * Bug reports: http://arma.sourceforge.net/faq.html * Git repo: https://gitlab.com/conradsnicta/armadillo-code
    Leader badge
    Downloads: 2,737 This Week
    Last Update:
    See Project
  • 19
    ARC-AGI

    ARC-AGI

    The Abstraction and Reasoning Corpus

    ...The dataset is structured as grid-based puzzles, where each task requires understanding transformations such as symmetry, counting, or spatial manipulation. Unlike traditional machine learning benchmarks, ARC emphasizes generalization and reasoning over statistical pattern recognition, making it particularly challenging for current AI systems. The repository also includes a browser-based interface that allows humans to attempt solving the tasks manually, providing a baseline for comparison.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    KubeEdge

    KubeEdge

    Kubernetes Native Edge Computing Framework (project under CNCF)

    ...It consists of a cloud part and an edge part, and provides core infrastructure support for networking, application deployment, and metadata synchronization between the cloud and edge. It also supports MQTT which enables edge devices to access through edge nodes. With KubeEdge it is easy to get and deploy existing complicated machine learning, image recognition, event processing, and other high-level applications to the Edge. With business logic running at the Edge, much larger volumes of data can be secured & processed locally where the data is produced. With data processed at the Edge, the responsiveness is increased dramatically and data privacy is protected.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Bandicoot

    Bandicoot

    fast C++ library for GPU linear algebra & scientific computing

    * Fast GPU linear algebra library (matrix maths) for the C++ language, aiming towards a good balance between speed and ease of use * Provides high-level syntax and functionality deliberately similar to Matlab * Provides an API that is aiming to be compatible with Armadillo for easy transition between CPU and GPU linear algebra code * Useful for algorithm development directly in C++, or quick conversion of research code into production environments * Distributed under the permissive Apache 2.0 license, useful for both open-source and proprietary (closed-source) software * Can be used for machine learning, pattern recognition, computer vision, signal processing, bioinformatics, statistics, finance, etc * Downloads: http://coot.sourceforge.io/download.html * Documentation: http://coot.sourceforge.io/docs.html * Bug reports: http://coot.sourceforge.io/faq.html * Git repo: https://gitlab.com/conradsnicta/bandicoot-code
    Downloads: 5 This Week
    Last Update:
    See Project
  • 22
    MediaPipe Face Detection

    MediaPipe Face Detection

    Detect faces in an image

    The MediaPipe Face Detection model is a high-performance, real-time face detection solution that uses machine learning to identify faces in images and video streams. It is optimized for mobile and embedded platforms, offering fast and accurate face detection while maintaining a small memory footprint. This model supports multiple face detections and is highly efficient, making it suitable for a variety of applications such as augmented reality, user authentication, and facial expression analysis.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    pattern_classification

    pattern_classification

    A collection of tutorials and examples for solving machine learning

    The pattern_classification repository is an educational project that provides tutorials, examples, and reference materials related to machine learning and statistical pattern recognition. The project aims to help learners understand the process of building predictive models by presenting structured explanations and practical examples. It includes notebooks and guides that demonstrate data preprocessing, feature extraction, model training, and evaluation techniques used in machine learning workflows. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Audio AI Timeline

    Audio AI Timeline

    A timeline of the latest AI models for audio generation

    ...Rather than functioning as a model training framework, it serves as an informational resource that maps key papers, systems, models, datasets, and milestones across areas such as speech synthesis, music generation, audio understanding, source separation, and general audio machine learning. The project helps users understand how major techniques and ideas evolved over time, making it especially useful for researchers, students, and practitioners who want a broad overview of the field without digging through scattered references. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Paper-with-Code-of-Wireless-comm

    Paper-with-Code-of-Wireless-comm

    Paper-with-Code-of-Wireless-communication-Based-on-DL

    Paper-with-Code-of-Wireless-communication-Based-on-DL is a curated repository that collects research papers and corresponding code implementations related to the application of deep learning in wireless communication systems. The project aims to help researchers and graduate students quickly find reproducible implementations of algorithms used in modern communication research. Wireless communication research has increasingly adopted deep learning techniques to address complex tasks such as...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB