Search Results for "character recognition code"

Showing 456 open source projects for "character recognition code"

  • 1
    Tesseract OCR

    Open Source OCR Engine

    Tesseract is an open source optical character recognition (OCR) engine and command-line program. OCR is a technology that recognizes text characters within a digital image. The latest version of Tesseract focuses more on line recognition, but it still supports the legacy Tesseract OCR engine, which recognizes character patterns. Tesseract can recognize over 100 languages out of the box and can be trained to recognize other languages. ...
    Downloads: 3,117 This Week
  • 2
    SimpleHTR

    Handwritten Text Recognition (HTR) system implemented with TensorFlow

    ...It also employs connectionist temporal classification (CTC) to align predicted character sequences with input images without requiring character-level segmentation. The repository provides code for training models, performing inference on handwritten text images, and evaluating recognition accuracy. SimpleHTR is commonly used as an educational example for understanding how modern handwriting recognition systems operate.
    Downloads: 0 This Week
  • 3
    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines.
    Downloads: 54 This Week
  • 4
    PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle

    PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that can help users train better models and put them into practice. Inspired by PaddlePaddle, PaddleOCR is an ultra-lightweight OCR system with multilingual recognition, digit recognition, vertical text recognition, and long text recognition. It features a PPOCR series of high-quality pre-trained models, including ultra-lightweight ppocr_mobile series models, general ppocr_server series models, and ultra-lightweight compressed ppocr_mobile_slim series models. ...
    Downloads: 83 This Week
  • 5
    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...
    Downloads: 9 This Week
  • 6
    InsightFace

    State-of-the-art 2D and 3D Face Analysis Project

    InsightFace is an open-source 2D and 3D deep face analysis library for Python. It efficiently implements a wide variety of state-of-the-art algorithms for face recognition, face detection, and face alignment, optimized for both training and deployment. Research institutes and industrial organizations can benefit from the InsightFace library.
    Downloads: 495 This Week
  • 7
    Tesseract.js

    A pure JavaScript Multilingual OCR

    Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. The library supports more than 100 languages, automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js can run either in a browser or on a server with Node.js. Tesseract.js is a JavaScript library that gets words in almost any spoken language out of images. The main Tesseract.js functions (e.g. recognize, detect) take an image...
    Downloads: 18 This Week
  • 8
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    ...Open Semantic Search includes an ETL framework that can ingest documents, process them through analysis steps, and enrich the data with extracted information such as named entities and metadata. It also supports optical character recognition to extract text from images and scanned documents, including images embedded inside PDF files. It integrates text mining and analytics capabilities that allow users to examine relationships, topics, and structured data within document collections.
    Downloads: 5 This Week
  • 9
    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...
    Downloads: 10 This Week
  • 10
    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It is designed to unify the entire OCR pipeline (detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation) into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a wide variety of OCR tasks, outperforming many traditional OCR systems and even other multimodal models on benchmark suites. ...
    Downloads: 1 This Week
  • 11
    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    whisper.cpp is a lightweight, C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model—designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp. The rest of the code is part of the ggml machine learning library. The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples. whisper.cpp supports integer quantization of the Whisper ggml models. ...
    Downloads: 371 This Week
  • 12
    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...
    Downloads: 14 This Week
  • 13
    Concordia

    Crowdsourcing platform for full text transcription and tagging

    ...It was developed by the Library of Congress so that volunteers of all backgrounds could transcribe and tag digitized images of manuscripts and typed materials from the Library’s collections that could not otherwise be transcribed by optical character recognition.
    Downloads: 0 This Week
  • 14
    Self-Operating Computer

    A framework to enable multimodal models to operate a computer

    ...Notably, it was the first known project to implement a multimodal model capable of viewing and controlling a computer screen. The framework supports features like Optical Character Recognition (OCR) and Set-of-Mark (SoM) prompting to enhance visual grounding capabilities. It is designed to be compatible with macOS, Windows, and Linux (with X server installed), and is released under the MIT license.
    Downloads: 8 This Week
  • 15
    Pot Desktop

    A cross-platform software for text translation and recognition

    Pot-Desktop is a cross-platform productivity tool aimed at helping users quickly translate, perform OCR (optical character recognition), and synthesize speech for selected text or images — all with minimal friction. It supports picking text via mouse selection (“highlight-and-translate”), clipboard listening, or screenshot-based OCR; this makes it ideal for reading webpages, documents, images — or any on-screen text — and instantly getting translations or text extraction. ...
    Downloads: 7 This Week
  • 16
    SikuliX

    SikuliX version 2.0.0+ (2019+)

    SikuliX automates anything you see on the screen of your desktop computer running Windows, Mac or some Linux/Unix. It uses image recognition powered by OpenCV to identify GUI components and can act on them with mouse and keyboard actions. This is handy in cases when there is no easy access to a GUI's internals or the source code of the application or web page you want to act on.
    Downloads: 140 This Week
  • 17
    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 91 This Week
  • 18
    LLM-Aided OCR Project

    Enhances Tesseract OCR output using LLMs (local or API)

    LLM Aided OCR is an open-source system designed to improve optical character recognition accuracy by combining traditional OCR tools with large language models. The project addresses common OCR challenges such as distorted text, unusual fonts, historical documents, and complex layouts that often produce inaccurate results with standard OCR pipelines. The system first extracts raw text using OCR engines and then applies language models to analyze and correct recognition errors based on context. ...
    Downloads: 0 This Week
  • 19
    Open-LLM-VTuber

    Open source AI VTuber platform with voice chat and Live2D avatars

    Open-LLM-VTuber is an open source platform designed to create AI-powered VTuber characters that can interact with users through voice and animated avatars. It enables hands-free conversations with large language models by combining speech recognition, language processing, and text-to-speech synthesis into a single system. Users can speak directly to the AI character, and the system can respond with a generated voice while animating a Live2D avatar to simulate a talking virtual personality. Open-LLM-VTuber is modular, allowing developers to swap or configure different language models, speech recognition engines, and voice synthesis systems depending on their needs. ...
    Downloads: 14 This Week
  • 20
    Rapid LaTeX OCR

    Formula recognition based on LaTeX-OCR and ONNXRuntime

    Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool that converts formula images to LaTeX. The inference code in the repo is adapted from LaTeX-OCR; all models have been converted to ONNX format and the inference code has been simplified, so inference is faster and easier to deploy. The repo contains only inference code for ONNX-format models running on ONNXRuntime or OpenVINO, and does not include model training code.
    Downloads: 5 This Week
  • 21
    SCAIL

    Towards Studio-Grade Character Animation via In-Context Learning of 3D

    SCAIL is a project developed by the ZAI Organization, focusing on AI-driven research initiatives. While specific documentation about SCAIL’s exact goals and implementation is limited from the repository context alone, the project appears to be part of a collection of machine learning and AI research tools that facilitate scalable model development, evaluation, or application workflows. Given its listing alongside other ZAI projects like speech recognition and text-to-speech systems, SCAIL...
    Downloads: 0 This Week
  • 22
    Text2Code for Jupyter notebook

    A proof-of-concept Jupyter extension which converts English queries

    ...The system uses natural language processing techniques to identify the intent of the query, extract relevant variables, and map the request to predefined code templates. Technologies such as sentence embeddings and named entity recognition are used to interpret user instructions and construct appropriate code outputs.
    Downloads: 0 This Week
  • 23
    Agently

    AI Agent Application Development Framework

    Build AI-agent-native applications with very little code. Interact with AI agents in code using structured data and chained-call syntax. Enhance AI agents with plugins instead of rebuilding a whole new agent. Agently is a development framework that helps developers build AI-agent-native applications quickly. You can use and build AI agents in your code in an extremely simple way.
    Downloads: 0 This Week
  • 24
    Google AI Edge Gallery

    A gallery that showcases on-device ML/GenAI use cases

    ...The project bundles runnable samples that show how to run TensorFlow Lite/Edge TPU models (and similar lightweight runtimes) on mobile and embedded platforms, demonstrating common tasks like image classification, object detection, audio recognition, and pose estimation. Each sample is intended to be both a learning aid and a practical starting point: code is organized to show model loading, pre/post-processing, performance measurement, and common optimization knobs (quantization, NNAPI/Delegate usage, and hardware accelerators). The repo also collects small, well-documented models and conversion scripts so developers can reproduce a pipeline from a full-size model down to a device-friendly artifact.
    Downloads: 1,018 This Week
  • 25
    MBeautifier

    MBeautifier is a MATLAB source code formatter and beautifier

    MBeautifier is a lightweight M-Script-based MATLAB source code formatter usable directly in the MATLAB Editor.
    Downloads: 4 This Week