This week, Google made good on that promise.
The search engine giant said Tuesday that it was working with five major research and academic libraries to digitally scan books and make them searchable online.
The agreements -- which range from pilot programs to a plan to scan up to 7 million books owned by the University of Michigan -- are an extension of Google's previously announced Google Print test program, which makes printed material searchable online and makes it easier for users to buy books that pop up in search results.
But with Google's extension of the program Tuesday to include out-of-print, copyright-expired books more than 100 years old, the company spotlights the ongoing tension alluded to in its IPO documents: A laudable corporate policy of providing what is unquestionably a public good may not necessarily be the best use of capital as far as Google shareholders are concerned.
Or, as Google co-founder Larry Page wrote in an open letter to Google's potential shareholders, "Our goal is to develop services that significantly improve the lives of as many people as possible. In pursuing this goal, we may do things that we believe have a positive impact on the world, even if the near-term financial returns are not obvious."
"They want to be the source for all information everywhere," says Mark Mahaney of American Technology Research. "This is fitting right within the strategy and mission statement of the company. If you're long Google shares, you should not be surprised by this." Mahaney, whose firm hasn't done underwriting for Google, has a buy rating on the company and a price target of $210.
Google's shares rose $3.35 Tuesday to trade at $173.80.
On a public-service yardstick, it is hard to argue with Google's plans to scan millions of books in the possession of the University of Michigan, Harvard, Stanford, Oxford and the New York Public Library.
For example, books in Oxford's collection can reach an audience not hampered by the usual limitations of age, academic credentials or ability to travel to England.
And yet -- though the Google Print program paints a clear path for Google to make money by connecting book buyers and booksellers -- it's hard to imagine that the monetization opportunities embedded in 19th-century obscurities are as great as those for, say, The Da Vinci Code, a 21st-century thriller about unearthing ancient obscurities.
Susan Wojcicki, director of product management at Google, says the main reason for the libraries initiative is to add to the index of searchable information on Google. "We believe having a more comprehensive index will lead to a better search experience, which will lead to more searches," she says. "Right now, the amount of documents we have is small, so you will not see the benefits today. But the goal is long-term -- adding these books will increase the comprehensiveness."
Assessing the financial impact of Tuesday's announcement on Google is also difficult because of the lack of hard numbers from Google regarding the program with the libraries.
One assumes that Google will bear most, if not all, of the costs of scanning the material. And one also assumes that scanning books for digitization is a labor-intensive process -- one that requires a person somewhere to leaf through a book, page by page, to get it from the printed paper into a digitized form.
"Developing the technology to scan large libraries is something consistent with our strengths and interests," says Wojcicki. Asked whether the labor-intensive process of scanning in pages is truly scalable, Wojcicki says there are costs other than accessing books, such as storing information and serving it. Google already has experience in those technologies, she says.
But with Google silent about how many years the project will take, it's hard to know how great the impact will be on Google's bottom line.
Mahaney says both the costs and the payoffs of Tuesday's announcement remain unclear. If it's a long-term project -- say, extending over a decade -- "that's probably a pretty modest expense," he says. As for the demand for material that might be added, he says he goes by the general rule that 80% of searches are for 20% of online content. "Obviously, if you're doing stuff pre-copyright," Mahaney says, "you're dealing with a pretty small amount of information for where there's ... demand."
Says Wojcicki, "It's hard for us to know how any one document or how any one book will be used. ... We don't think about how each is used. We think about, 'How comprehensive is the index?'"
Adds Wojcicki, "We put a lot of energy into search algorithms, but if the content isn't even in the index, the best search algorithm in the world will not be able to locate that information."
Asked to discuss at what point costs diminish the returns of adding new material to the index, Wojcicki says, "We've found that having more information is always better."