Tag Archives: Internet Archive

Obama’s Change.gov promise to protect whistleblowers? Scrubbed from the Web

Well, this pissed me off. Long-time readers of this site may recall my interest in the Internet Archive’s Wayback Machine, which aims to preserve the historical web. I’ve previously written to criticize the Bush administration for its lengthy robots.txt exclusion file (thousands of lines long), which could be viewed as an attempt to prevent the […]

Major expansion of Wayback Machine’s archive of the historical internet

The Next Web reports that the Internet Archive has vastly increased its historical database of the web: The Internet Archive has updated its Wayback Machine with a significant bump in coverage: the service has gone from 150,000,000,000 URLs to having 240,000,000,000 URLs, a total of about 5 petabytes of data. More specifically, the Wayback Machine […]

A presidential “legacy” via rewritten history

Web archiving is a topic of great interest to me and the subject of an article I’m writing. Part of the paper addresses the Bush administration’s questionable conduct regarding the content of the White House website. For example, the White House website’s robots exclusion file, a mechanism that can be used to ask search […]

Is Zoetrope the next-gen Internet Archive?

Although the Internet Archive’s Wayback Machine is a great research tool, its utility is hampered by a lack of basic search mechanisms. One can search by URL and archived links, but basic Google-style boolean searching isn’t available. The Archive once offered a beta boolean search tool, but it never worked and it was later withdrawn. […]
