PHPScraper is a universal web-scraping utility for PHP, built with simplicity in mind. The goal is to make XPath selectors optional and to avoid the commonly needed boilerplate code. Just create an instance of PHPScraper, go to a website, and start collecting data. All scraping functionality can be accessed either as a function call or as a property. For example, the page title can be read either as a property or by calling the method of the same name.
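The "property or function call" idea can be illustrated with a small, self-contained sketch. Note the assumptions: this is Python rather than PHP, it is not PHPScraper's code or API, and the names `Page`, `CallableStr`, and `_TitleParser` are hypothetical, chosen only for this illustration. It parses an in-memory HTML string instead of fetching a website.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


class CallableStr(str):
    """A string that returns its own value when called (hypothetical helper)."""

    def __call__(self):
        return str(self)


class Page:
    """Hypothetical wrapper exposing the title as both attribute and call."""

    def __init__(self, html):
        parser = _TitleParser()
        parser.feed(html)
        self._title = parser.title.strip()

    @property
    def title(self):
        # Returning a callable string lets `page.title` and `page.title()`
        # yield the same value.
        return CallableStr(self._title)


page = Page("<html><head><title>Example Domain</title></head></html>")
print(page.title)    # property-style access
print(page.title())  # function-style access
```

Both access styles read the same parsed value, so callers can pick whichever reads better in context. A PHP library would more likely reach the same effect through its own language features; that detail is not covered by this sketch.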
Open source web scraping system for automated data collection tasks
SkyCaiji is an open source web scraping and data collection system designed to gather information from websites through configurable extraction rules. It focuses on simplifying the process of building crawlers by allowing users to visually define scraping rules rather than writing complex code. It can collect structured or unstructured data from many types of webpages and automate the extraction process for large datasets. SkyCaiji is designed to run on a variety of hosting environments...
Web Crawling, Web Testing, and Web Scraping application
Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraping application. It provides a DSL for crawling HTTP services, asserting on responses, and extracting data from HTML, XML, and JSON responses.
Some Blackfire Player use cases:
Crawl a website/API and check expectations -- aka Acceptance Tests;
Scrape a website/API and extract values;
Monitor a website;
Test code with unit test integration (PHPUnit, Behat, Codeception, ...);
Test code behavior from the outside thanks to the...