Showing 175 open source projects for "pdf tool python"

  • 1
    XX-Net

    A web proxy tool

    XX-Net is an easy-to-use, anti-censorship web proxy tool from China. It includes GAE_proxy and X-Tunnel, with support for multiple platforms.
    Downloads: 63 This Week
  • 2
    changedetection.io

    The best free open source website change detection and restock service

    Loved by smart shoppers, data journalists, research engineers, data scientists, security researchers, and more. From simply monitoring website pages that have a change (such as watching prices, and restocking notifications), to deep inspection such as PDF text support, JSON and XML monitoring, and extensive text triggers. Monitor out-of-stock products and get alerts when those products are back in stock, get restock alerts via Discord, Slack, email, and many other platforms. Using the...
    Downloads: 8 This Week
  • 3
    theHarvester

    E-mails, subdomains and names

    theHarvester is a very simple to use, yet powerful and effective tool designed to be used in the early stages of a penetration test or red team engagement. Use it for open source intelligence (OSINT) gathering to help determine a company's external threat landscape on the internet. The tool gathers emails, names, subdomains, IPs and URLs using multiple public data sources.
    Downloads: 46 This Week
  • 4
    SimpDL

    A tool to scrape images from SimpCity

    SimpDL is an open-source media downloading tool designed to retrieve content from subscription-based or creator platforms, focusing on simplicity and ease of use. It enables users to download images, videos, and other media associated with specific creators or accounts, often through authenticated sessions. The project emphasizes a straightforward workflow where users provide login credentials or tokens, and the tool handles the retrieval and storage of content automatically. It is designed...
    Downloads: 1 This Week
  • 5
    videodl

    Lightweight Python tool for downloading videos from many platforms

    Videodl is a lightweight video downloader implemented entirely in Python that allows users to retrieve videos from a wide range of online media platforms. It focuses on providing a fast and simple way to parse video pages and download media files, often prioritizing high-definition versions without watermarks when available. It supports numerous video platforms across both Chinese and international streaming ecosystems, enabling users to fetch content from many popular services through a...
    Downloads: 10 This Week
  • 6
    FinalRecon

    All-in-one Python web reconnaissance tool for fast target analysis

    FinalRecon is an all-in-one web reconnaissance tool written in Python that helps security professionals gather information about a target website quickly and efficiently. It combines multiple reconnaissance techniques into a single command-line utility so users do not need to run several separate tools to collect similar data. FinalRecon focuses on providing a fast overview of a web target while maintaining accuracy in the collected results.
    Downloads: 2 This Week
  • 7
    Gobuster

    Directory/File, DNS and VHost busting tool written in Go

    Gobuster is a tool used to brute-force directories and files, DNS subdomains, and virtual host names. The project was born out of the necessity to have something that didn't have a fat Java GUI (console FTW), something that did not do recursive brute force, something that allowed me to brute force folders and multiple extensions at once, something that compiled to native on multiple platforms, something that was faster than an interpreted script (such as Python), and something that didn't require a runtime.
    Downloads: 39 This Week
  • 8
    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages. DouyinCrawler supports both automated scraping and batch operations to process multiple targets efficiently. ...
    Downloads: 5 This Week
  • 9
    Trafilatura

    Python & command-line tool to gather text on the Web

    Trafilatura is a Python package and command-line tool designed to gather text on the Web. It includes discovery, extraction and text-processing components. Its main applications are web crawling, downloads, scraping, and extraction of main texts, metadata and comments. It aims to stay handy and modular: no database is required, and the output can be converted to various commonly used formats.
    Downloads: 0 This Week
  • 10
    OpenWPM

    A web privacy measurement framework

    OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu. Although we...
    Downloads: 4 This Week
  • 11
    news-please

    Python tool for crawling and extracting structured data from news sites

    news-please is an open source news crawler and information extraction tool designed to collect and structure articles from online news websites. It provides an integrated pipeline that crawls news sites, retrieves article pages, and extracts structured information such as headlines, authors, publication dates, and article text. news-please can recursively follow internal links and read RSS feeds to gather both recent and archived articles from a news outlet when given only the root URL of a site. ...
    Downloads: 0 This Week
  • 12
    Symfony Panther

    A browser testing and web crawling library for PHP and Symfony

    Symfony Panther is a browser testing and web scraping tool that allows developers to interact with websites programmatically. It uses headless Chrome or Firefox to automate browser tasks, making it suitable for end-to-end testing and data extraction. Panther integrates well with Symfony and PHPUnit, allowing developers to write comprehensive tests for web applications.
    Downloads: 0 This Week
  • 13
    mitmproxy

    A free and open source interactive HTTPS proxy

    mitmproxy is an open source, interactive SSL/TLS-capable intercepting HTTP proxy, with a console interface fit for HTTP/1, HTTP/2, and WebSockets. It's the ideal tool for penetration testers and software developers, able to debug, test, and make privacy measurements. It can intercept, inspect, modify and replay web traffic, and can even prettify and decode a variety of message types. Its web-based interface mitmweb gives you an experience similar to Chrome's DevTools, with the addition of...
    Downloads: 14 This Week
  • 14
    Bili23 Downloader

    Cross platform GUI tool for downloading videos from Bilibili sites

    Bili23-Downloader is an open source desktop application designed for downloading video content from the Bilibili platform. It provides a graphical interface that allows users to download various types of media including user-uploaded videos, series episodes, movies, and other hosted content. It focuses on ease of use with a zero-configuration setup, making it accessible to both beginners and experienced users. It supports high performance downloads through multi-threading and includes resume...
    Downloads: 20 This Week
  • 15
    HTTPie Desktop

    Cross-platform API testing client for humans

    HTTPie Desktop is a graphical API client built on top of the popular HTTPie terminal tool, offering a user-friendly interface for testing and interacting with APIs. It combines the simplicity of HTTPie’s CLI with a modern desktop and web UI for a more visual workflow. Developers can easily build, send, and preview HTTP requests without needing to memorize commands or write scripts. The platform supports organizing work into spaces, collections, and tabs, making it ideal for managing multiple...
    Downloads: 13 This Week
  • 16
    Browserless

    The headless Chrome/Chromium driver on top of Puppeteer

    Browserless is an open-source headless browser automation library and service built on top of Puppeteer that simplifies the process of running and scaling Chromium-based browser tasks in production environments. It provides a high-level API for interacting with headless Chrome, allowing developers to perform operations such as generating PDFs, capturing screenshots, extracting text or HTML, and automating web navigation. The project is designed to act as a production-ready abstraction layer...
    Downloads: 4 This Week
  • 17
    Scout Suite

    Multi-cloud security auditing tool

    Scout Suite is an open-source multi-cloud security-auditing tool, which enables security posture assessment of cloud environments. Using the APIs exposed by cloud providers, Scout Suite gathers configuration data for manual inspection and highlights risk areas. Rather than going through dozens of pages on the web consoles, Scout Suite presents a clear view of the attack surface automatically. Scout Suite was designed by security consultants/auditors. It is meant to provide a point-in-time...
    Downloads: 1 This Week
  • 18
    Toot

    toot - Mastodon CLI & TUI

    Toot is a CLI and TUI tool for interacting with Mastodon instances from the command line.
    Downloads: 0 This Week
  • 19
    Linkwarden

    Self-hosted collaborative bookmark manager

    Linkwarden is a self-hosted, open-source bookmark manager built to help individuals and teams collect, organize, and preserve important web content in a way that stays useful long after the original pages change or disappear. Instead of saving only a URL, it captures durable archived formats so your saved knowledge remains accessible even when link rot happens. The experience is designed to feel like a modern “read-it-later” tool, with a reader view that makes long articles easier to consume...
    Downloads: 6 This Week
  • 20
    Hoverfly

    Lightweight service virtualization / API simulation / API mocking tool

    Hoverfly is a lightweight, open source API simulation tool. Using Hoverfly, you can create realistic simulations of the APIs your application depends on. Replace unreliable test systems and restrictive API sandboxes with high-performance simulations in seconds. Run on MacOS, Windows or Linux, or use native Java or Python language bindings to get started quickly. Simulate API latency or failure when required by writing custom scripts in the language of your choice.
    Downloads: 6 This Week
  • 21
    MDCx

    Movie metadata scraper and organizer for media libraries and NFO

    MDCx is an open source media metadata scraping and organization tool designed to automate the process of collecting detailed information for movie files. It retrieves metadata from multiple online sources and applies it to local media collections, helping users maintain structured and well-organized libraries. MDCx can download information such as titles, cast data, artwork, and other metadata, then generate standardized NFO files compatible with media management systems. It also supports...
    Downloads: 8 This Week
  • 22
    CyberScraper 2077

    A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 2 This Week
  • 23
    Weibo Crawler

    Python crawler for collecting and downloading Sina Weibo user data

    weibo-crawler is a Python-based data collection tool designed to retrieve information from Sina Weibo user accounts. It automates the process of gathering posts, user profile details, and engagement metrics from one or more target accounts. weibo-crawler can extract comprehensive information about users, including profile attributes such as nickname, follower count, following count, and account metadata.
    Downloads: 0 This Week
  • 24
    AWS SAM CLI

    CLI tool to build, test, debug, and deploy Serverless applications

    The AWS Serverless Application Model (SAM) CLI is an open-source CLI tool that helps you develop serverless applications containing Lambda functions, Step Functions, API Gateway, EventBridge, SQS, SNS and more. The AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications. It provides shorthand syntax to express functions, APIs, databases, and event source mappings. With just a few lines per resource, you can define the application you want and...
    Downloads: 3 This Week
  • 25
    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    ...It allows administrators and users to explore file structures, monitor storage usage, and gain insights into how data is distributed across infrastructure. By indexing file metadata from sources such as local file systems, network shares like NFS and SMB, and cloud storage, the tool provides a centralized way to analyze heterogeneous storage environments. Diskover also helps identify outdated or unused files, duplicate data, and inefficient storage usage that can waste resources or increase operational costs. A Python-based indexing engine performs the scanning and indexing tasks.
    Downloads: 1 This Week