The Internet Archive seeks a Senior Engineer to lead technical program definition and implementation of IA's Web Wide Harvesting program.
Come help us archive the Internet and preserve it for future generations. You'll be responsible for driving our efforts to retrieve the highest quality content from the Web and for building the largest and most useful Web archive in existence.
Our Web Wide Harvesting program goals are: 1) to grow and augment the 150+ billion Web captures contained within IA's historic Web archive, using open source tools and platforms, through ongoing, regular captures of Web content of interest; 2) to analyze our Web collections to ensure we are harvesting a representative sample of what's happening on the Web in each calendar year; and 3) to experiment with harvest techniques and tools that enable the archival capture and re-rendering of rich media, streaming content, social media, and so forth, in addition to traditional Web page content, as publishing practices and interactive online media evolve.
In this role you will work with the Web Collections Manager to design the strategy and implementation of the program and lead the operation of ongoing Web-scale harvests. You will also assist the program by creating tools and services as needed to improve the crawl (such as analysis, reporting, QA, and data import).
Your responsibilities will include:
Experience Needed:
Education: Computer Science, Math BS/BA or equivalent work experience
The Internet Archive, based in San Francisco's Presidio, is an entrepreneurial and technologically innovative nonprofit that serves as a public repository for born-digital and digitized materials. IA works closely with libraries, archives, museums, and educational institutions from around the globe to promote web archiving best practices and to ensure collections include culturally significant and relevant materials. IA makes all data freely and publicly accessible from www.archive.org. Find out more about our organization and web archive at www.archive.org.
We are an equal opportunity employer. Please send your resume and cover letter to jobs at archive dot org with the subject line "WWW Crawl Tech Lead". The Archive thanks all applicants for their interest, but advises that only those selected for an interview will be contacted. No phone calls please.
The Internet Archive is seeking a Crawl Engineer to run large-scale crawls for our partners. Our crawl engineering team is responsible for capturing and managing the highest quality content from the web for our 90+ library and archive partners around the world. An ideal candidate demonstrates independence and initiative, is a problem solver, works well autonomously, and is technologically savvy. Additionally, the ideal candidate is open to being trained on best practices and standards around large-scale web harvests.
Find out more about our organization and web archiving at www.archive.org as well as our tools and services at http://wa.archive.org/
Your responsibilities include:
Experience Needed:
Education:
Computer Science, Math BS/BA or equivalent work experience
We are an equal opportunity employer. Please send your resume and cover letter to kristine at archive dot org with the subject line "Crawl Engineer". The Archive thanks all applicants for their interest, but advises that only those selected for an interview will be contacted. No phone calls please.