Crawlee is a web scraping and browser automation library. It helps you build reliable crawlers. Fast.

Crawlee won't fix broken selectors for you (yet), but it helps you build and maintain your crawlers faster. When a website adds JavaScript rendering, you don't have to rewrite everything, only switch to one of the browser crawlers. When you later find a great API to speed up your crawls, flip the switch back.

It keeps your proxies healthy by rotating them smartly with good fingerprints that make your crawlers look human-like. It's not unblockable, but it will save you money in the long run.

Crawlee is built by people who scrape for a living and use it every day to scrape millions of pages. Meet our community on Discord.

We believe websites are best scraped in the language they're written in. Crawlee runs on Node.js and it's built in TypeScript to improve code completion in your IDE, even if you don't use TypeScript yourself.

Features

  • JavaScript & TypeScript
  • HTTP scraping
  • Headless browsers
  • Automatic scaling and proxy management
  • Queue and Storage
  • Helpful utils and configurability
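
To make the proxy management idea more concrete, here is a minimal TypeScript sketch of round-robin rotation that skips proxies after repeated failures. It is an illustration only and not Crawlee's implementation or API: the `ProxyRotator` class, its `maxErrors` threshold, and the `markBad` / `markGood` methods are hypothetical names chosen for this example, and it leaves out fingerprinting and session handling entirely.

```typescript
// Hypothetical sketch: NOT Crawlee's actual proxy management code or API.
type ProxyState = { url: string; errors: number };

class ProxyRotator {
  private readonly proxies: ProxyState[];
  private readonly maxErrors: number;
  private cursor = 0;

  // maxErrors is an assumed threshold: a proxy is skipped once it reaches it.
  constructor(urls: string[], maxErrors: number = 3) {
    if (urls.length === 0) {
      throw new Error("at least one proxy URL is required");
    }
    this.proxies = urls.map((url) => ({ url, errors: 0 }));
    this.maxErrors = maxErrors;
  }

  // Returns the next healthy proxy in round-robin order,
  // or undefined when every proxy has reached the error threshold.
  next(): string | undefined {
    for (let i = 0; i < this.proxies.length; i++) {
      const candidate = this.proxies[this.cursor];
      this.cursor = (this.cursor + 1) % this.proxies.length;
      if (candidate !== undefined && candidate.errors < this.maxErrors) {
        return candidate.url;
      }
    }
    return undefined;
  }

  // Record a failed request (for example a block page or a timeout).
  markBad(url: string): void {
    const proxy = this.proxies.find((p) => p.url === url);
    if (proxy !== undefined) proxy.errors += 1;
  }

  // Record a successful request and reset the proxy's error count.
  markGood(url: string): void {
    const proxy = this.proxies.find((p) => p.url === url);
    if (proxy !== undefined) proxy.errors = 0;
  }
}
```

A caller would ask `next()` for a proxy before each request and report the outcome with `markBad` or `markGood`. Resetting the counter on success is one possible design choice; a real crawler would more likely let errors decay over time and tie each proxy to a session.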

License

Apache License V2.0

Additional Project Details

Programming Language

TypeScript

Related Categories

TypeScript Web Scrapers, TypeScript Headless Browsers

Registered

2023-04-12