A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Convert AI papers to GUI
OCR expert VLM powered by Hunyuan's native multimodal architecture
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Vision utilities for web interaction agents
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Easy Tools for PDF, Image, File, Network, Data, and Media
CCTV Footage Timestamp Search Tool
Free RPA tool by AI Singapore
Easy-OCR solution and Tesseract trainer for GNU/Linux
Free subtitle editor
A Tailored Small Linux for BeagleBoard-xM
Post-OCR correction tool for SRT subtitles