Poster: Brad
Date: December 24, 2003 07:29:40am
Forum: web
Subject: Re: Broken/truncated .zip files in Wayback Machine

Hi Mark,
It's definitely possible that the various crawlers that have accumulated the web archives over the years have had faults.
Another common cause of this kind of problem, which may be what's occurring here, is that many of the crawlers stopped downloading files at 1MB, and that's all we have in the collection. Quite a calamity, but that's where we are.
If a zip file ends at 1MB exactly (or within a few bytes: sometimes 1MB includes the HTTP header, for webnerds) then this may be the problem. Generally, there are internal consistency checks (CRCs) in the zip files themselves, so if your solution of adding a null byte to the end of the file seems to get you a usable zip file, then that's great! Please let me know if this is the case by responding here, as we may be able to auto-detect and fix this problem as we serve the files in the future.
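For anyone who wants to check their own downloads, here is a rough Python sketch of the two tests described above: is the file size at or near 1MB, and does the archive pass its own CRC checks? This is not what the Wayback Machine runs. The function names are made up for illustration, and both the byte count used for "1MB" (2**20) and the tolerance (1024 bytes, to allow for an HTTP header counted toward the limit) are assumptions.

```python
import zipfile

ONE_MB = 1024 * 1024   # assumption: "1MB" means 2**20 bytes
TOLERANCE = 1024       # assumption: slack for an HTTP header counted in the limit


def looks_truncated_at_1mb(size_bytes, limit=ONE_MB, tolerance=TOLERANCE):
    """Return True if the size is within `tolerance` bytes of the limit."""
    return abs(size_bytes - limit) <= tolerance


def zip_is_intact(path):
    """Return True if the archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the name of the
            # first one with a bad CRC or header, or None if all are fine.
            return zf.testzip() is None
    except zipfile.BadZipFile:
        # A truncated file usually fails here, because the central
        # directory at the end of the archive is missing.
        return False
```

A file for which `looks_truncated_at_1mb(os.path.getsize(path))` is true and `zip_is_intact(path)` is false is a likely victim of the 1MB cutoff.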
Thanks for posting your solution.
Brad