April 9, 2013—At a recent talk, Brewster Kahle, digital librarian and founder of the Internet Archive, held a black box—about the size of a book—aloft, and shared with his audience that it held more than 165,000 books. The four-terabyte hard drive held digitized, scanned, searchable PDFs from the collection at the University of Toronto.
“The idea of building the Library of Alexandria, version two, to have everything ever published, everything that was ever meant for public distribution—books, music, video, lectures, software, web pages—available to anyone who’s curious enough is technologically, legally and economically within our grasp,” he said.
At the talk, hosted by Harvard Library Strategic Conversations, Kahle acknowledged that libraries are in the midst of some large-scale shifts. But he challenged his audience to think differently and to connect with other institutions, including the Internet Archive, to make knowledge both free and available.
Emphasizing the importance of digitizing and archiving collections, Kahle said that the technology to archive books already exists: it’s simply a matter of having the person-power, dedication and access to digitize them. Scanning a book from beginning to end, Kahle said, takes an hour on average. Pointing out that the Library of Congress holds about 28 million books, Kahle said that a scanned book “requires about a megabyte [of digital storage]. So if you buy seven hard drives, you could hold all the words in the Library of Congress.”
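The arithmetic behind the seven-drive figure can be checked with a short sketch. It assumes the drives in question are the same four-terabyte size as the one Kahle held up, and that a megabyte and a terabyte are counted in decimal units; the article states neither, and the function and constant names below are illustrative only.

```python
# Figures quoted in the talk.
BOOKS_IN_LIBRARY_OF_CONGRESS = 28_000_000
BYTES_PER_BOOK_TEXT = 1_000_000        # "about a megabyte" for the words of one book

# Assumption: the same 4 TB drive described at the start of the talk, decimal units.
DRIVE_CAPACITY_BYTES = 4 * 10**12


def drives_needed(books: int, bytes_per_book: int, drive_bytes: int) -> int:
    """Return the number of whole drives needed to hold the given books."""
    total_bytes = books * bytes_per_book
    # Integer ceiling division, so a partly filled drive still counts as one.
    return -(-total_bytes // drive_bytes)


def average_bytes_per_item(drive_bytes: int, items: int) -> float:
    """Return the average space available per item when a drive holds `items`."""
    return drive_bytes / items


if __name__ == "__main__":
    print(drives_needed(BOOKS_IN_LIBRARY_OF_CONGRESS,
                        BYTES_PER_BOOK_TEXT,
                        DRIVE_CAPACITY_BYTES))  # 7
    # Rough size of one scanned PDF if 165,000 of them fill a 4 TB drive.
    print(average_bytes_per_item(DRIVE_CAPACITY_BYTES, 165_000) / 1_000_000)
```

Under these assumptions, 28 million books at one megabyte each come to 28 terabytes, which fits on exactly seven drives. The second calculation suggests why the 165,000 scanned PDFs fill a whole drive on their own: page images take roughly 24 megabytes per book, far more than the words alone.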
Kahle said that Harvard could begin to archive its holdings and create greater access to knowledge through three steps: digitizing holdings for print-disabled readers; lending digital books across libraries and university campuses through services such as openlibrary.org; and participating in cooperative scanning, essentially a digital scan and swap with other libraries, which results in lower costs.
Kahle also backed a “scan one, get five” system with other libraries and collections: when the Harvard Library scans one book and makes it available to readers online, it would then get access to five books scanned and made available online by other libraries.
Kahle’s call to make all knowledge available is not limited to books: the Internet Archive also holds audio, such as its collection of Grateful Dead concerts. Archiving moving images, he said, helped millennials connect with previous generations through the television programs, advertisements and newsreels of earlier eras. Kahle’s expansive vision extends to the Internet itself: the archive also preserves web pages, which he said change every 100 days on average.
Kahle’s message included a plea for assistance, noting the value and importance of Harvard’s extensive collections. “Harvard’s position is tremendous,” he said. “Harvard has always been at the forefront of many organizations and structures. We can help digitally archive your collection. We can make those steps forward.”