Search Results for "speech & image processing based project in matlab"

35 projects for "speech & image processing based project in matlab" with 1 filter applied: ChromeOS

1

AI Runner

Offline inference engine for art, real-time voice conversations

AI Runner is an offline inference engine designed to run a collection of AI workloads on your own machine, including image generation for art, real-time voice conversations, LLM-powered chatbots and automated workflows. It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. ...

Downloads: 13 This Week

Last Update: 2025-12-11
2

WhisperJAV

Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...

Downloads: 22 This Week

Last Update: 5 days ago
3

Orpheus TTS

Towards Human-Sounding Speech

Orpheus TTS is a state-of-the-art open-source text-to-speech system built on a Llama-3B backbone, treating speech synthesis as a large language model problem instead of a traditional TTS pipeline. It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research...

Downloads: 1 This Week

Last Update: 2025-12-05
4

comfyui-mixlab-nodes

Workflow and speech recognition app

comfyui-mixlab-nodes is a large collection of custom nodes for ComfyUI that turns workflows into interactive apps and adds real-time multimedia, LLM, and TTS capabilities. It introduces a “Workflow-to-APP” concept, where a ComfyUI graph can be transformed into a Web App through an AppInfo node, complete with categories, batch prompts, and editable configurations. The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that...

Downloads: 11 This Week

Last Update: 2025-11-28
5

Streamer-Sales

LLM Large Model of Selling Anchor

Streamer-Sales is an open-source large language model system designed specifically for e-commerce live streaming and automated product promotion. The project focuses on generating persuasive product descriptions and live presentation scripts that mimic the style of professional online sales hosts. By analyzing product characteristics and marketing information, the model can produce engaging explanations that emphasize benefits, features, and emotional appeal to encourage viewers to make...

Downloads: 7 This Week

Last Update: 2026-03-05
6

Luna AI

Virtual AI anchor that combines state-of-the-art technology

Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE),...

Downloads: 18 This Week

Last Update: 2025-11-28
7

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification

pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio...

Downloads: 3 This Week

Last Update: 2026-03-10
8

HivisionIDPhoto

HivisionIDPhotos: a lightweight and efficient AI ID photo tool

...The software analyzes portrait images, performs background removal, aligns the face according to ID photo standards, and produces images in various official size formats. It also allows the generation of layout sheets such as six-inch photo arrangements for printing multiple ID photos on a single page. The project focuses on building a practical pipeline for automated ID photo production using AI-based segmentation and image processing techniques.

Downloads: 7 This Week

Last Update: 2026-03-10
9

AI App Lab

Implementing large models into scenario-based applications

AI App Lab is an open-source platform developed by Volcengine that provides tools, SDKs, and example applications for building real-world AI applications powered by large language models. The project focuses on helping developers bridge the gap between AI models and practical business use cases by offering a structured environment for creating production-ready AI systems. It includes a high-level SDK called Arkitect, which provides workflows and tools for integrating models, plugins, and...

Downloads: 7 This Week

Last Update: 2026-03-17
10

Open Vision Agents by Stream

Build Vision Agents quickly with any model or video provider

Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio...

Downloads: 9 This Week

Last Update: 22 hours ago
11

React Native AI

Full stack framework for building cross-platform mobile AI apps

React Native AI is a full-stack framework designed to simplify the development of AI-powered mobile applications using React Native. The project provides a ready-to-use infrastructure for building cross-platform apps that integrate large language models and other AI services. It supports real-time streaming responses from multiple AI providers and enables developers to build chat interfaces, AI-driven image generation tools, and natural language features within mobile apps. The framework...

Downloads: 0 This Week

Last Update: 2026-03-09
12

Advanced AI explainability for PyTorch

Advanced AI Explainability for computer vision

pytorch-grad-cam is an open-source library that provides advanced explainable AI techniques for interpreting the predictions of deep learning models used in computer vision. The project implements Grad-CAM and several related visualization methods that highlight the regions of an image that most strongly influence a neural network’s decision. These visualization techniques allow developers and researchers to better understand how convolutional neural networks and transformer-based vision...

Downloads: 0 This Week

Last Update: 2026-03-29
13

Roadmap To Learn Generative AI In 2025

Basic Machine Learning Natural Language Processing Roadmap

Roadmap To Learn Generative AI In 2025 is a curated learning path focused on contemporary generative AI — covering large language models (LLMs), diffusion-based image generation, prompt engineering, multi-modal AI, fine-tuning techniques, and the practical considerations for deploying generative models. It’s aimed at learners and developers who already have some programming or ML basics and wish to specialize in generative AI, offering a modern, structured plan that reflects the state of the...

Downloads: 0 This Week

Last Update: 2025-12-02
14

Perfect Pixel

Refine and quantize messy AI pixel art into clean, perfect pixels

perfectPixel is a workflow tool for turning messy “pixel-style” images, especially those produced by generative models, into truly grid-aligned pixel art that reads cleanly at any scale. It tackles a common problem with AI pixel art: edges that look pixelated at first glance but are not actually aligned to a coherent pixel grid, which causes shimmer, blur, and uneven block sizes when you zoom in. The tool analyzes an image to infer the intended grid size, then refines and quantizes the...

Downloads: 1 This Week

Last Update: 2026-02-01
15

ProStack

ProStack - a platform for image processing and analysis

ProStack is a platform for image processing and analysis. It implements various image processing methods as separate modules that can be combined into complex image processing scenarios through a graphical user interface. RPMs are available at https://build.opensuse.org/project/repositories/home:mackoel:compbio

2 Reviews

Downloads: 0 This Week

Last Update: 2025-10-29
16

SigPack

SigPack - A signal processing library using Armadillo

SigPack is a C++ signal processing library built on the Armadillo library. The API will be familiar to those who have used IT++ or Octave/MATLAB.

2 Reviews

Downloads: 6 This Week

Last Update: 2026-02-27
17

GeoTools, the Java GIS toolkit

Toolkit for working with and mapping geospatial data

GeoTools is an open source (LGPL) Java code library that provides standards-compliant methods for manipulating geospatial data. GeoTools is an Open Source Geospatial Foundation project. The GeoTools library data structures are based on Open Geospatial Consortium (OGC) specifications.

38 Reviews

Downloads: 130 This Week

Last Update: 2026-03-19
18

EmotiVoice

Multi-Voice and Prompt-Controlled TTS Engine

EmotiVoice is a multi-voice, prompt-controlled text-to-speech engine designed to generate highly expressive speech across thousands of voices. It supports both English and Chinese and ships with over 2,000 preset voices, making it suitable for everything from characters and virtual anchors to narration and dialogue. The core idea is prompt-based emotional and style control: you can ask the engine to speak “happy,” “sad,” “excited,” or with other high-level style prompts that shape prosody,...

Downloads: 2 This Week

Last Update: 2025-11-30
19

ekho

Chinese text-to-speech engine

ekho is a project with relatively sparse documentation, but from the repository it appears to be a small-scale tool for audio processing and playback, possibly with features for speech synthesis or manipulation. The repo includes scripts and configuration files suggesting interactions with media/audio handling libraries. Because of limited README detail, it seems targeted at users comfortable reading and modifying code, rather than end users expecting polished UIs. ...

Downloads: 3 This Week

Last Update: 2025-11-28
20

NaveGo

NaveGo: an open source MATLAB/GNU Octave toolbox for processing integr

NaveGo is an open source MATLAB/GNU Octave toolbox designed for processing integrated navigation systems, simulating inertial sensors and GNSS receivers, and profiling inertial sensors using methods like Allan variance—providing a community-driven simulation framework for navigation system design and analysis. I am reaching out to share an important update regarding the NaveGo project. Due to a shift in both my professional career and personal interests away from navigation systems, I have...

Downloads: 0 This Week

Last Update: 2025-09-08
21

SVoice (Speech Voice Separation)

We provide a PyTorch implementation of the paper Voice Separation

SVoice is a PyTorch-based implementation of Facebook Research’s study on speaker voice separation as described in the paper “Voice Separation with an Unknown Number of Multiple Speakers.” This project presents a deep learning framework capable of separating mixed audio sequences where several people speak simultaneously, without prior knowledge of how many speakers are present. The model employs gated neural networks with recurrent processing blocks that disentangle voices over multiple...

Downloads: 0 This Week

Last Update: 4 days ago
22

Music Source Separation

Separate audio recordings into individual sources

Music Source Separation is a PyTorch-based open-source implementation for the task of separating a music (or audio) recording into its constituent sources — for example isolating vocals, instruments, bass, accompaniment, or background from a mixed track. It aims to give users the ability to take any existing song and decompose it into separate stems (vocals, accompaniment, etc.), or to train custom separation models on their own datasets (e.g. for speech enhancement, instrument isolation, or...

Downloads: 2 This Week

Last Update: 2025-12-02
23

VGGFace2

VGGFace2 Dataset for Face Recognition

VGGFace2 is a large-scale face recognition dataset developed to support research on facial recognition across variations in pose, age, illumination, and identity. It consists of 3.31 million images covering 9,131 subjects, with an average of over 360 images per subject. The dataset was collected from Google Image Search, ensuring a wide diversity in ethnicity, profession, and real-world conditions. It is split into a training set with 8,631 identities and a test set with 500 identities,...

Downloads: 28 This Week

Last Update: 4 days ago
24

QtiPlot

QtiPlot is a user-friendly, platform independent data analysis and visualization application similar to the non-free Windows program Origin.

9 Reviews

Downloads: 88 This Week

Last Update: 2020-06-04
25

mzitu

Python crawler that downloads image galleries and analyzes titles

mzitu is a Python-based web crawling project designed to automatically download and organize image galleries from a specific photography site. It demonstrates how to build a scraper that navigates gallery pages, retrieves image links, and saves the images locally in a structured directory layout. It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that processes downloaded folder names to generate statistics and visualizations. ...

Downloads: 1 This Week

Last Update: 6 days ago

© 2026 Slashdot Media. All Rights Reserved.
