Smart Cache Loader is a very configurable pure Java web grabber with special support for integration with Smart Cache proxy server. It can perform different loading operations based on URL mask, content-type, ...
cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techinques. A command line executeable is shipped that allows to sort documents by codepage.
webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language to facilitate the collection, the extraction and the storage of information available on the web, including images. The scripting language uses elements of the Regular Expression and xPath syntax. The webStraktor scripting language has a small instruction set and its syntax is easy...
Vodcatcher Helper is a proxy server for media centers. It parses web pages for videos and provides them to the media center software. Supported media center softwares are VDR and XBMC.