Already facing a legal challenge for alleged copyright infringement, Google Inc.’s crusade to build a massive digital library is encountering stiff competition from an alternative project that promises better online access to the world’s books, art, and historical documents.
The conflict revolves around Google’s insistence on chaining the books it scans to its internet-leading search engine.
An alternative book-scanning effort called the Open Content Alliance (OCA) favors a less restrictive approach to prevent mankind’s accumulated knowledge from being controlled by a single commercial entity, even if it is a company such as Google, which has embraced “Don’t Be Evil” as its creed.
In September, the Boston Library Consortium–a group of 19 research and academic libraries in New England that includes the Boston Public Library, University of Massachusetts, University of Connecticut, Massachusetts Institute of Technology, and Brown University–said it would reject Google’s effort and instead work with the OCA to digitize books among its members’ 34 million volumes whose copyrights have expired.
“You are talking about the fruits of our civilization and culture. You want to keep it open and certainly don’t want any company to enclose it,” said Doron Weber, program director of public understanding of science and technology for the Alfred P. Sloan Foundation.
The New York-based Sloan Foundation last year gave a $1 million grant to the Internet Archive, an OCA leader, to help pay for digital copies of collections owned by the Boston Public Library, the Getty Research Institute, and the Metropolitan Museum of Art.
The works to be scanned with that grant include the personal library of John Adams, America’s second president, and thousands of images from the Metropolitan Museum.
The Boston Library Consortium deal represents a major coup for Internet Archive founder Brewster Kahle, a strident critic of the controls that Google has imposed on its book-scanning initiative.
“They don’t want the books to appear in anyone else’s search engine but their own, which is a little peculiar for a company that says its mission is to make information universally accessible,” Kahle said in an interview with the Associated Press last year.
Google’s restrictions stem in part from its decision to scan copyrighted material without explicit permission. Google wants to ensure that only small excerpts from the copyrighted material appear online–snippets the company believes fall under “fair use” protections of U.S. law.
A group of authors and publishers nevertheless has sued Google for copyright infringement in a two-year-old case that is slowly wending its way through federal court.
In contrast, the OCA will not scan copyrighted content unless it receives the permission of the copyright owner. Most of the books the alliance has scanned so far are works whose copyrights have expired.
Google has not said how many digital copies it has made since announcing its ambitious project three years ago.
The company will acknowledge only that it is scanning more than 3,000 books per day, a rate that translates into more than 1 million annually. Google also is footing a bill expected to exceed $100 million to make the digital copies, a commitment that appeals to many libraries.
Scanning each book costs the OCA as much as $30, an expense borne by the group’s members. Some libraries also have received grants from groups such as the Sloan Foundation to help with the cost.
The non-copyrighted material in Google’s search engine can be downloaded and printed out, a feature that the company believes mirrors the OCA’s goals.
Although the OCA depends on the Internet Archive to host its digital copies, other search engines are being encouraged to index the material, too.
Both Yahoo Inc. and Microsoft Corp., which run the two largest search engines behind Google, belong to the alliance. The group has more than 60 members, consisting mostly of libraries and universities.
None of Google’s contracts prevent participating libraries from making separate scanning arrangements with other organizations, said company spokeswoman Megan Lamb.
“We encourage the digitization of more books by more organizations,” Lamb said. “It’s good for readers, publishers, authors, and libraries.”
The motives behind Google’s own book-scanning initiative are not entirely altruistic. The company wants to stock its search engine with unique material to give people more reasons to visit its web site, the hub of an advertising network that generates billions of dollars each year.
Despite its ongoing support for the OCA, Microsoft last year launched a book-scanning project of its own to compete with Google. Like Google, Microsoft won’t allow its digital copies to be indexed by other search engines.
Earlier this month, Yale University announced that it had joined forces with Microsoft to digitize thousands of books from its library system. The move sparked controversy in the academic world, with some critics saying the university abandoned its principles in an effort to save money. Yale, which has one of the world’s largest university libraries with 13 million volumes, said it could not otherwise afford to scan so many books. (The Library of Congress, with 30 million volumes, is the largest library in the U.S., according to the American Library Association.)
Although Kahle says he was disappointed by Microsoft’s recent move, he remains more worried about Google’s book-scanning initiative because it has garnered so much attention and support.
Many of the libraries contributing content to Google so far are part of universities, including Harvard, Stanford, Michigan, Oxford, California, Virginia, Wisconsin-Madison, and Chicago. The New York Public Library also is relying on Google to scan some of its books.
The University of California, which also belongs to the OCA, has no regrets about allowing Google to scan at least 2.5 million of the books in its libraries. “We felt like we could get more from being a partner with Google than by not being a partner,” said university spokeswoman Jennifer Colvin.
But some of the participating libraries might have second thoughts if Google’s system isn’t set up to recognize some of their digital copies, said Gregory Crane, a Tufts University professor who is currently studying the difficulty of accessing some digital content.
For instance, Tufts worries that Google’s optical reader will not recognize some books written in classical Greek. If the same problem were to crop up with a digital book in the OCA, Crane thinks it would be more easily addressed because the group allows outside access to the material.
Google “may end up aiming for the lowest common denominator and not be able to do anything really deep” with the digital books, Crane said.