| Datagrab Indexer - Web Crawler, Indexer and Search Engine. Extracts URLs from the web and builds a search index. Crawls 100,000 or more documents to produce a fast index that is searchable through a front-end web interface. Supports AND, OR, NOT, phrase and fuzzy search through advanced ruleset configuration. Packed with many other features, too numerous to list here. According to its description, the system has been proven to fulfill the needs of almost any website. |
| Hits: 60 |
| Ratings: 3.50 |
| Date Added: Feb-17-2006 |
|
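To make the AND, OR and NOT search modes listed for Datagrab Indexer concrete, here is a minimal sketch of boolean search over an inverted index. It is not taken from Datagrab Indexer itself; the function names (`build_index`, `search_and`, `search_or`, `search_not`) and the whitespace tokenization are assumptions made for illustration, and phrase and fuzzy search are not covered.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_and(index, *terms):
    """Documents that contain every term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Documents that contain at least one term."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


def search_not(index, all_ids, term):
    """Documents that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "web crawler in perl",
    2: "perl search engine",
    3: "web search",
}
index = build_index(docs)
```

With these sample documents, `search_and(index, "web", "perl")` selects only document 1, while `search_not(index, docs, "perl")` selects only document 3.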
| Set a starting point and let iuCrawler do the rest. Retrieve information from websites and build databases quickly, accurately and as often as you require. Completely customizable with HTML templates. Separate spider and crawler components for maximum performance and ease of use. Export your data to popular formats including MySQL, PostgreSQL, MS Excel and more. |
| Hits: 729 |
| Ratings: 0.00 |
| Date Added: Jul-26-2005 |
|
| Harvest-NG is a collection of Perl modules and scripts that provide a powerful web crawling and summarizing agent. The code aims to provide an open source, standards-compliant tool for fetching content from a wide variety of information sources, summarizing it into a set of resource descriptions, and storing these in an easily accessible database from which search services can be built and statistical information compiled. |
| Hits: 934 |
| Ratings: 4.00 |
| Date Added: Jul-19-2005 |
|
| I-Spy is a Perl script that identifies new files on remote FTP and web sites. It fetches and compares the contents of FTP directories and web pages, then compiles a report and either sends it by e-mail or saves it as a web page. You may also request both forms of delivery. E-mail reports can be sent as plain text or HTML. I-Spy logs its activity as it runs. You may specify the log directory, or I-Spy will try to find one automatically. For web page reports, I-Spy attempts to store the log in a location where it can be referenced by the report and served by the web server. |
| Hits: 979 |
| Ratings: 0.00 |
| Date Added: Aug-18-2005 |
|
| This is a proof of concept of a tool for automating web browsing and data collection. It works like AWK, except that instead of operating on files and lines it operates on HTML pages and hyperlinks. It is meant to be run as a command-line script and provides the following variables: base_url (the URL the script was initially invoked on), base_path (the root of the saved data tree), url (the URL currently being processed), linked_from (the parent of the current URL), and content (the data corresponding to the current URL). |
| Hits: 787 |
| Ratings: 0.00 |
| Date Added: Jul-29-2005 |
|
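The AWK-like model described for the proof-of-concept browsing tool above (pattern/action rules applied to pages and links rather than files and lines) can be sketched roughly as follows. This is only an illustration, not the tool's own code: the `crawl` function, the `fetch` callback and the `(pattern, action)` rule shape are assumptions, and `base_path` is left out because nothing is saved to disk here. The context keys reuse the variable names from the description.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(base_url, fetch, rules, max_pages=50):
    """Visit pages breadth-first and run every matching rule on each page.

    fetch(url) is a hypothetical callback returning the page text or None.
    rules is a list of (pattern, action) pairs; both receive a context dict.
    """
    seen = {base_url}
    queue = [(base_url, None)]
    processed = 0
    while queue and processed < max_pages:
        url, linked_from = queue.pop(0)
        content = fetch(url)
        processed += 1
        if content is None:
            continue
        ctx = {
            "base_url": base_url,
            "url": url,
            "linked_from": linked_from,
            "content": content,
        }
        # Like AWK: run the action of every rule whose pattern matches.
        for pattern, action in rules:
            if pattern(ctx):
                action(ctx)
        # Queue unseen links that stay under the starting URL.
        parser = LinkParser()
        parser.feed(content)
        for href in parser.links:
            target = urljoin(url, href)
            if target.startswith(base_url) and target not in seen:
                seen.add(target)
                queue.append((target, url))
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; a real script would supply a function that downloads the page.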
| Web Secretary is web page monitoring software that goes beyond the functionality such tools normally offer. It detects changes through content analysis (rather than date/time stamps or simple textual comparison) and emails the changed page to you with the new content highlighted. Web Secretary is written in Perl and should run on any Unix system that has the Perl interpreter and the LWP module installed. |
| Hits: 626 |
| Ratings: 5.00 |
| Date Added: Jun-17-2005 |
|
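As a rough illustration of the "email the changed page with the new content highlighted" idea in the Web Secretary entry, the sketch below compares two versions of a page line by line and wraps new or changed lines in `<mark>` tags. The listing does not describe Web Secretary's actual content analysis, so this uses Python's standard `difflib` module as a stand-in; the function name `highlight_changes` and the `<mark>` markup are assumptions.

```python
import difflib


def highlight_changes(old_text, new_text):
    """Return the lines of new_text, wrapping added or changed lines in <mark>."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    result = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "equal" spans are unchanged; "insert" and "replace" spans are new.
        # "delete" spans contribute nothing because j1 == j2 for them.
        for line in new_lines[j1:j2]:
            result.append(line if tag == "equal" else f"<mark>{line}</mark>")
    return result
```

A page counts as changed when the returned list differs from `new_text.splitlines()`, which is the point at which a monitoring script would send its report.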