Showing 62 open source projects for "parser"

  • 1
    MegaParse

    File Parser optimised for LLM Ingestion with no loss

    MegaParse is a file parser optimized for Large Language Model (LLM) ingestion, ensuring no loss of information. It efficiently parses various document formats, such as PDFs, DOCX, and PPTX, converting them into formats ideal for processing by LLMs. This tool is essential for applications that require accurate and comprehensive data extraction from diverse document types.
    Downloads: 1 This Week
  • 2
    Qwen Code

    Qwen Code is a coding agent that lives in the digital world

    Qwen Code is a command-line AI workflow tool designed to enhance developer productivity by leveraging the power of Qwen3-Coder models. Adapted from the Google Gemini CLI, it features an enhanced parser optimized specifically for Qwen-Coder models, enabling deep code understanding and manipulation. The tool supports querying and editing large codebases beyond traditional context limits, making it ideal for modern, complex projects. Qwen Code automates various development workflows, including handling pull requests and performing complex git rebases. ...
    Downloads: 18 This Week
  • 3
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    ...It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with an accuracy within 1% of the best available. It's blazing fast, easy to install and comes with a simple and productive API.
    Downloads: 76 This Week
  • 4
    BudouX

    Standalone, small, language-neutral

    Standalone. Small. Language-neutral. BudouX is the successor to Budou, the machine learning-powered line break organizer tool. It is standalone: it works with no dependency on third-party word segmenters such as the Google Cloud Natural Language API. It is small: it takes only around 15 KB including its machine learning model, so it is reasonable to use even on the client side. It is language-neutral: you can train a model for any language by feeding a dataset to BudouX’s training...
    Downloads: 5 This Week
  • 5
    LiteParse

    A fast, helpful, and open-source document parser

    LiteParse is an open-source lightweight parsing library designed to extract structured data from unstructured text using large language models in an efficient and cost-effective manner. It focuses on simplifying the process of turning raw text into structured outputs such as JSON by providing a streamlined interface for prompt-based parsing. The system is designed to minimize overhead, making it suitable for applications where performance and cost are critical considerations. LiteParse...
    Downloads: 8 This Week
  • 6
    td

    Telegram client, in Go. (MTProto API)

    Telegram MTProto API client in Go for users and bots.
    Downloads: 4 This Week
  • 7
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available through the Tika REST Server. This makes Apache Tika available as a Python library that is easy to install via Setuptools or pip. To use this library, you need Java 7+ installed on your system, because tika-python starts the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
  • 8
    langrocks

    Tools like web browser, computer access and code runner for LLMs

    Langrocks is a collection of tools for LLMs, such as a web browser, computer access, and a code runner.
    Downloads: 0 This Week
  • 9
    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL database providers.
    Downloads: 2 This Week
  • 10
    Agents-Flex

    Agents-Flex is an elegant LLM Application Framework like LangChain

    ...Agents-Flex has a very flexible Function Calling component. It supports local method definitions, parsing, callbacks through LLMs, and executing local methods to obtain results. Agents-Flex offers Loader, Parser, and Splitter components for documents. Each component has multiple implementations, making it easy to load data from the web, local files, databases, and various data types.
    Downloads: 6 This Week
  • 11
    Dicio assistant

    Dicio assistant app for Android

    Dicio is a free and open source voice assistant for Android that focuses on strong privacy by running its understanding and response generation directly on the device whenever possible. It supports multiple input and output methods, including hotword-based voice input using the Vosk speech-to-text engine and a graphical interface for users who prefer to tap instead of talk. The assistant is built around a flexible “skills” system that lets it respond to a wide variety of requests such as...
    Downloads: 11 This Week
  • 12
    Qwen3-Coder

    Qwen3-Coder is the code version of Qwen3

    Qwen3-Coder is the latest and most powerful agentic code model developed by the Qwen team at Alibaba Cloud. Its flagship version, Qwen3-Coder-480B-A35B-Instruct, features a massive 480-billion-parameter Mixture-of-Experts architecture with 35 billion active parameters, delivering top-tier performance on coding and agentic tasks. This model sets new state-of-the-art benchmarks among open models for agentic coding, browser use, and tool use, with performance comparable to leading models...
    Downloads: 15 This Week
  • 13
    xgplayer

    A HTML5 video player with a parser that saves traffic

    xgplayer is a web-friendly, open-source media player library maintained by ByteDance, designed for playing audio/video streams in browsers or web applications with robust control, flexibility, and extensibility. It abstracts many of the lower-level complexities of HTML5 media, providing a consistent API for playback control, custom UI overlays, adaptive streaming, plugin hooks, and cross-browser compatibility. Because of its emphasis on modularity and extensibility, xgplayer can be embedded...
    Downloads: 1 This Week
  • 14
    OSS-Fuzz Gen

    LLM powered fuzzing via OSS-Fuzz

    OSS-Fuzz-Gen is a companion project that helps automatically create or improve fuzz targets for open-source codebases, aiming to increase coverage in OSS-Fuzz with minimal maintainer effort. It analyses a library’s APIs, examples, and tests to propose harnesses that exercise parsers, decoders, or protocol handlers—precisely the code where fuzzing pays off. The system integrates with modern LLM-assisted workflows to draft harness code and then iterates based on build errors or low coverage...
    Downloads: 0 This Week
  • 15
    pdf-extractor

    Node.js module for rendering pdf pages to images, svgs and HTML files

    ...A DOM Canvas is used to render and export the graphical layer of the PDF. Canvas exports *.png by default but can be extended to export other file types such as .jpg. PDF objects are converted to SVG using the SVGGraphics parser of pdf.js. PDF text is converted to HTML, which can be used as a (transparent) layer over the image to enable text selection. PDF text is also extracted to a text file for other uses (e.g. indexing the text). In its most basic form, this library is a Node.js wrapper for pdf.js. It has default renderers to generate a default output, but is easily extended to incorporate custom logic or to generate different output. ...
    Downloads: 1 This Week
  • 16
    LayoutParser

    A Unified Toolkit for Deep Learning Based Document Image Analysis

    With the help of state-of-the-art deep learning models, Layout Parser enables extracting complicated document structures using only several lines of code. This method is also more robust and generalizable as no sophisticated rules are involved in this process. A complete instruction for installing the main Layout Parser library and auxiliary components. Learn how to load DL Layout models and use them for layout detection.
    Downloads: 4 This Week
  • 17
    libpostal

    A C library for parsing/normalizing street addresses around the world

    libpostal is a C library for parsing/normalizing street addresses around the world, powered by statistical NLP and open geo data. The goal of this project is to understand location-based strings in every language, everywhere. Addresses and the locations they represent are essential for any application dealing with maps (place search, transportation, on-demand/delivery services,...
    Downloads: 0 This Week
  • 18
    Self-Attentive Parser

    High-accuracy NLP parser with models for 11 languages

    Self-Attentive Parser is a high-accuracy NLP parser with models for 11 languages.
    Downloads: 0 This Week
  • 19
    Leseratte is a Java parser for written German. It currently contains a German lexicon (based on Wiktionary), inflection rules, a grammar, and a parser; a semantics component is planned. It is usable as a Java library and also provides a graphical UI.
    Downloads: 0 This Week
  • 20
    Provides a GUI for the grammatical structure and relations (as parsed by the Stanford Parser) of any text. Contains a grammatical relation editor to modify, import, and export grammatical relation definitions (tregex patterns and features).
    Downloads: 1 This Week
  • 21
    Bangla TTS

    Bangla text to speech synthesis in python

    Multilingual (Bangla, English) text-to-speech library with (almost) real-time synthesis on a GPU. Installation:
      * Install Anaconda
      * conda create -n new_virtual_env python==3.6.8
      * conda activate new_virtual_env
      * pip install -r requirements.txt
      * While running for the first time, keep your internet connection on to download the weights of the speech synthesis models (>500 MB)
      * For...
    Downloads: 1 This Week
  • 22
    PyResParser

    A simple resume parser used for extracting information from resumes

    PyResParser is a simple resume parser that extracts information from resumes, aiding in the automation of resume-processing tasks.
    Downloads: 0 This Week
  • 23
    AIML_chung

    an AIML chatbot engine with 3D avatars, maths parser, speech and dll

    AIML chung is a full AIML 1.0-based standalone chatbot engine trial with a DLL, TTS/eSpeak speech voices, synonym substitutions, a maths parser, and photorealistic 3D OpenGL avatars, written in compiled FreeBASIC. It comes with GUI window and console examples, a 3D world mode, and a DLL version to use with other programming languages such as C++ or Liberty BASIC, or to easily embed in your applications. Talk with your A.I. computer.
    Downloads: 0 This Week
  • 24
    SLING

    A natural language frame semantics parser

    The aim of the SLING project is to learn to read and understand Wikipedia articles in many languages for the purpose of knowledge base completion, e.g. adding facts mentioned in Wikipedia (and other sources) to the Wikidata knowledge base. We use frame semantics as a common representation for both knowledge representation and document annotation. The SLING parser can be trained to produce frame semantic representations of text directly, without any explicit intervening linguistic representation. The SLING project is still a work in progress. We do not yet have a full system that can extract facts from arbitrary text, but we have built a number of the subsystems needed for such a system. The SLING frame store is our basic framework for building and manipulating frame semantic graph structures. ...
    Downloads: 0 This Week
  • 25
    Telegram::Bot

    Ruby gem for building Telegram Bot with optional Rails integration

    Tools for developing Telegram bots. Best used with Rails, but can be used in a standalone app. Intended to be used in webhook mode in production and poller mode in development, but you can use the poller in production if you want.
    Downloads: 0 This Week