Researcher access is currently not available pending redesign. This material has been retained for reference and was current information as of late 2002.

av_procarc
USAGE av_procarc [-E] [-nmd] [-b robotFile] [-x xreffile] [-r <redirect linkfile>] arcFileName [linkfile]
The input file name must be of the form *.arc.gz. The tool writes a dat.gz file to stdout.
-E stands for "evil" and means links are followed even when a nofollow meta tag is present.
The xreffile, if specified, is filled with (link, url) pairs, giving each link and the URL it was scraped from.
The redirect file gets all links from pages with 3xx response codes.
The linkfile gets all scraped links, except those that were written to the redirect file.
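
As an illustration of how the usage line above fits together, here is a minimal Python sketch that assembles an av_procarc command line and redirects its stdout to a file. The function names, parameter names, and file names are hypothetical and are not part of av_procarc. The -nmd option is left out because its meaning is not described here. The sketch assumes av_procarc is on the PATH when run_av_procarc is called.

```python
import subprocess
from typing import List, Optional


def build_av_procarc_command(
    arc_file: str,
    link_file: Optional[str] = None,
    xref_file: Optional[str] = None,
    redirect_file: Optional[str] = None,
    robot_file: Optional[str] = None,
    ignore_nofollow: bool = False,
) -> List[str]:
    """Build an argument list that follows the documented USAGE line."""
    # The usage text requires the input file to be of the form *.arc.gz.
    if not arc_file.endswith(".arc.gz"):
        raise ValueError("arc file name must be of the form *.arc.gz")

    cmd = ["av_procarc"]
    if ignore_nofollow:
        cmd.append("-E")               # follow links despite a nofollow meta tag
    if robot_file:
        cmd += ["-b", robot_file]      # -b robotFile
    if xref_file:
        cmd += ["-x", xref_file]       # receives (link, url) pairs
    if redirect_file:
        cmd += ["-r", redirect_file]   # receives links from 3xx pages
    cmd.append(arc_file)
    if link_file:
        cmd.append(link_file)          # receives the remaining scraped links
    return cmd


def run_av_procarc(cmd: List[str], dat_path: str) -> None:
    """Run the command and save what it writes to stdout (the dat.gz data)."""
    with open(dat_path, "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

For example, build_av_procarc_command("crawl.arc.gz", link_file="links.txt", redirect_file="redirects.txt") yields a command that sends links from 3xx pages to redirects.txt and all other scraped links to links.txt. Passing that list to run_av_procarc with a path such as "crawl.dat.gz" would save the output.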
