Showing 195 open source projects for "web indexing"

  • 1
    Spotweb

    Decentralized community

    Spotweb is an open-source PHP-based web interface for the Usenet indexing service Spotnet, allowing users to search, browse, and download NZBs from Usenet content feeds. It provides a full-featured, self-hosted Usenet indexing system that supports user accounts, moderation, comments, and custom filtering. Spotweb makes it easy for users to run their own NZB indexing server and integrates with download clients like SABnzbd or NZBGet.
    Downloads: 6 This Week
  • 2
    fess

    Open source enterprise search server for websites, files, and data

    ...It also provides a web-based administrative interface that allows administrators to configure crawling targets, manage indexing tasks, and adjust search settings from a graphical dashboard.
    Downloads: 13 This Week
  • 3
    Search-Index

    A persistent, network resilient, full text search library

    Search-Index is a lightweight and fast JavaScript-based search engine that enables full-text search indexing and retrieval for web applications.
    Downloads: 11 This Week
  • 4
    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    Diskover Community Edition is an open source file system indexing and storage analytics platform designed to help organizations understand and manage large volumes of file data. It crawls file systems and indexes metadata using Elasticsearch, enabling fast search, analysis, and organization of files stored across different storage systems. It allows administrators and users to explore file structures, monitor storage usage, and gain insights into how data is distributed across...
    Downloads: 0 This Week
  • 5
    Anna’s Archive

    Comprehensive search engine for books, papers, comics, magazines

    Anna’s Archive is a large-scale open-source search engine and data aggregation platform designed to index and provide access to a vast collection of books, academic papers, comics, magazines, and other digital texts through a unified interface. The project includes all the infrastructure required to run a full instance locally or in production, combining web servers, databases, and search indexing systems into a scalable architecture. It relies heavily on technologies such as Elasticsearch for search functionality and MariaDB for structured data storage, enabling fast and efficient querying across massive datasets. The system is designed with redundancy and replication in mind, allowing distributed deployments and mirrored environments to handle high traffic and large data volumes. ...
    Downloads: 117 This Week
  • 6
    ROMM

    A beautiful, powerful, self-hosted rom manager and player

    ...It reimagines the home screen with adaptive layouts, predictive app recommendations, and dynamic organization so that frequently used tools are always within reach. The launcher includes a powerful universal search that combs through installed apps, contacts, messages, and web results to deliver quick answers without switching contexts. Romm also supports widgets, customization options, and theme choices so users can tailor the visual experience to their preferences while maintaining performance and responsiveness. Privacy is a highlight, with local indexing and search functions that operate without sending data to external servers unless explicitly permitted.
    Downloads: 10 This Week
  • 7
    mgrep

    A calm, CLI-native way to semantically grep everything, like code

    ...Built with a focus on calm CLI experiences, it lets you index and query your local files with semantic understanding, delivering results that are relevant to your intent rather than simple pattern matches, which is especially powerful in large or diverse projects. It also includes features such as background indexing to keep your search index up to date without interrupting your workflow and web search integration to expand the scope of queries beyond local files. Designed for both programmers and agents, it integrates naturally into development and research workflows while offering thoughtful defaults that keep output clean and informative.
    Downloads: 12 This Week
  • 8
    Text Search Engine

    A text search engine that supports mixed Chinese and English search

    Text-Search-Engine is a JavaScript-based lightweight search engine that enables full-text search functionality. It allows developers to implement fast search indexing and retrieval in web applications.
    Downloads: 0 This Week
  • 9
    Bazarr

    Bazarr is a companion application to Sonarr and Radarr

    ...Once you configure the languages and quality rules for your media library, Bazarr continuously monitors new episodes or releases indexed by those applications and automatically finds and fetches matching subtitle files, including external upgrades when better options appear. It offers both automatic and manual search capabilities through a modern web interface, letting you fine-tune what gets downloaded and stored alongside your video files. Bazarr supports a wide array of subtitle providers around the world and can track download history, manage multiple languages, and perform post-download cleanup if needed. Because it does not scan your disk libraries itself but relies on Sonarr or Radarr indexing, it fits cleanly into automated media stacks on NAS devices and servers.
    Downloads: 8 This Week
  • 10
    RavenDB

    ACID Document Database

    A NoSQL document database designed for high-performance, real-time applications with built-in distributed capabilities.
    Downloads: 6 This Week
  • 11
    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 50 This Week
  • 12
    Midarr Server

    Midarr, the minimal lightweight media server

    Midarr is a minimal, lightweight media server built to complement tools like Radarr or Sonarr. Instead of reinventing the media management stack, it leverages existing setups and metadata providers to serve media files "fresh off the metal" without re-indexing or transcoding by default. It offers a sleek web interface with authentication, user profiles, real-time statuses, and experimental support for remuxing/transcoding and Chromecast compatibility.
    Downloads: 8 This Week
  • 13
    Just the Class

    A modern, highly customizable, responsive Jekyll template

    A modern, highly customizable, responsive Jekyll template for course websites. Just the Class is a GitHub Pages template developed for the purpose of quickly deploying course websites. In addition to serving plain web pages and files, it provides a boilerplate for announcements, course calendar, etc. Just the Class is a template that extends the popular Just the Docs theme, which provides a robust and thoroughly-tested foundation for your website.
    Downloads: 5 This Week
  • 14
    OpenArchiver

    An open-source platform for legally compliant email archiving

    ...It’s designed for scenarios where reliable, tamper-proof archiving and full-text search across both emails and attachments are essential for legal discovery, compliance, or long-term records retention. The platform combines a modern web UI with powerful backend services, including fast indexing, deduplication, encryption at rest, and asynchronous ingestion workflows, making it suitable for both small teams and enterprise deployments. Beyond simply capturing email, it emphasizes security and auditability with features like secure storage formats, file integrity verification, and detailed audit trails of user interactions.
    Downloads: 7 This Week
  • 15
    GPT Crawler

    Crawl a site to generate knowledge files to create your own custom GPT

    GPT Crawler is an open-source tool designed to automatically crawl websites and generate structured knowledge that can be used to build AI assistants and retrieval systems. It focuses on extracting high-quality textual content from web pages and preparing it in formats suitable for embedding, indexing, or fine-tuning workflows. The project is especially useful for teams that want to turn documentation sites or knowledge bases into conversational AI backends without building custom scrapers from scratch. It includes configurable crawling logic, content filtering, and output pipelines that streamline the process of preparing data for large language models. ...
    Downloads: 6 This Week
  • 16
    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    ...The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. In addition to simple text extraction, Scribe.js supports writing or injecting a high-quality invisible text layer back into PDFs, effectively making them searchable and improving usability for indexing or accessibility. It is written in modern ECMAScript Modules (ESM), so it can be imported in both browser and Node.js environments without a build step, though browser usage requires same-origin hosting of the files.
    Downloads: 9 This Week
  • 17
    WeKnora

    LLM framework for document understanding and semantic retrieval

    WeKnora is an open source framework developed for deep document understanding and semantic information retrieval using large language models. It focuses on analyzing complex and heterogeneous documents by combining multiple processing stages such as multimodal document parsing, vector indexing, and intelligent retrieval. It follows the Retrieval-Augmented Generation (RAG) paradigm, where relevant document segments are retrieved and used by language models to generate accurate, context-aware...
    Downloads: 8 This Week
  • 18
    Brokk

    Brokk brings code intelligence to AI

    Brokk is a code intelligence assistant framework designed to let large language models (LLMs) understand code semantically (not just as raw text) so that they can work effectively on large codebases that don’t fit wholly in a prompt context. It helps bridge the gap between LLMs and real-world engineering code by offering tooling to index, analyze, query, and augment code context, so that AI can meaningfully reason about existing code, suggest edits, and navigate across projects. Modular...
    Downloads: 3 This Week
  • 19
    BestBlogs

    A collection of top programming

    BestBlogs is an open-source project designed to aggregate, organize, and surface high-quality blog content from across the web, helping users discover valuable articles in a structured and accessible way. The platform focuses on curating content based on relevance, quality, and usefulness rather than simply indexing large volumes of information, making it particularly useful for developers, researchers, and knowledge seekers. It typically integrates automated data collection and filtering mechanisms to gather blog posts from multiple sources, then categorizes and ranks them to improve discoverability. ...
    Downloads: 0 This Week
  • 20
    LangChain-ChatGLM-Webui

    Automatic question answering for local knowledge bases based on LLM

    LangChain-ChatGLM-Webui is an open-source web interface that integrates the ChatGLM large language model with the LangChain framework to create an interactive conversational AI platform. The project provides a graphical interface that allows users to interact with language models through chat sessions while also connecting those models to external knowledge sources. It supports retrieval-augmented generation workflows that enable the system to answer questions based on local documents or...
    Downloads: 0 This Week
  • 21
    Dozzle

    Realtime log viewer for containers. Supports Docker, Swarm and K8s

    Dozzle is a lightweight, self-hosted web application for real-time viewing and monitoring of container logs, focused on speed and simplicity rather than building a full log storage pipeline. Instead of indexing or storing logs, it connects to your container runtime and streams live output so you can diagnose issues as they happen. The interface includes practical quality-of-life features like fuzzy searching for containers, regex log search, split-screen viewing for multiple logs, and live stats such as CPU and memory usage. ...
    Downloads: 6 This Week
  • 22
    GooFuzz

    OSINT fuzzing tool using Google dorks to find exposed resources

    ...It is written in Bash and automates the use of Google Dorking queries to discover publicly accessible information related to a target domain. Instead of directly sending requests to the target server, GooFuzz gathers results through search engine indexing, allowing enumeration without leaving traces in the target’s server logs. This method enables the discovery of potentially sensitive files, directories, subdomains, and parameters that are already exposed on the web. By combining wordlists, search operators, and file extension filters, the tool helps security professionals locate misconfigured or unintentionally exposed resources. ...
    Downloads: 3 This Week
  • 23
    Forge Code

    AI enabled pair programmer for Claude, GPT, O Series, Grok, Deepseek

    Forge is a modern, open-source tool that brings AI-powered code assistance directly into your terminal workflow, effectively turning your shell into a “pair programmer”, without ever leaving your development environment. Written in Rust (with a command-line interface), Forge integrates with your existing shell (bash, zsh, fish, etc.) or IDE-agnostic workflows, allowing you to interact with your codebase, command-line tools, and version control as usual, but with the added support of large...
    Downloads: 7 This Week
  • 24
    ink-kit

    Onchain-focused SDK with ready-to-use templates and themes

    ink-kit is a developer toolkit for building applications on the INK blockchain ecosystem, bundling the pieces you typically need to go from a blank repo to a working dapp. It provides contract templates, deployment scripts, and client SDKs so you can iterate on on-chain logic and a frontend without stitching together disparate tools. The kit standardizes project layout and environment configuration, making local development, testing, and staging deploys predictable. Utilities for wallet...
    Downloads: 1 This Week
  • 25
    RBush

    High-performance JavaScript R-tree-based 2D spatial index

    RBush is a high-performance JavaScript library for 2D spatial indexing of points and rectangles. It is based on an optimized R-tree data structure with bulk insertion support. A spatial index is a special data structure for points and rectangles that lets you perform queries like "all items within this bounding box" very efficiently (e.g. hundreds of times faster than looping over all items). It is most commonly used in maps and data visualizations.
    Downloads: 0 This Week