Google already dominates the booming search market.
And it figures to extend that lead by exploiting a data advantage that allows it to deliver superior search results and attract even more users.
The growing share of searches run on its search engine is an important, and often overlooked, advantage it has over its competitors.
This advantage creates a self-reinforcing cycle that draws even more users to Google.
Google's data advantage, which derives from its large share of user searches, gives the Web giant superior insight into user behavior and intent, which results in superior search results.
This, in turn, attracts more users to the company's search engine, thereby providing it with even more data. The implications of this phenomenon -- the subject of much debate in the search community -- are profound for investors.
Of the 61 billion searches conducted worldwide in August, 37 billion were conducted on Google's search engine, according to researcher ComScore.
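To put the ComScore figures in proportion, the arithmetic can be sketched in a few lines of Python. The function name `market_share` is purely illustrative; only the two search counts come from the figures above.

```python
def market_share(part: float, total: float) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Figures quoted above from ComScore, in billions of searches for August.
google_searches_bn = 37
worldwide_searches_bn = 61

share = market_share(google_searches_bn, worldwide_searches_bn)
print(f"Google's share of worldwide searches: {share:.1f}%")  # 60.7%
```

By this measure, roughly six of every 10 searches worldwide ran through Google that month.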
"Because they have more data, Google can train their algorithms (the mathematical formulas computers run to come up with the results to search queries, among other things) better than any of their competitors," says Ani Kortikar, CEO of search marketing firm Netramind.
"And the more data you have to train your algorithms on, the better your predictive capabilities will become."
The role of something as mundane as data often takes a backseat to Google's other high-profile attributes: the reputed brilliance of its workforce, the power of its brand (a recent survey indicated it was the most powerful in the world) and the vast supercomputer it has managed to cobble together.
A Monopoly on the Search Market
But Google's data advantage is what has some analysts forecasting that the company could eventually monopolize the lucrative search market, which eMarketer predicts will total $42 billion by 2011.
"Google employs various algorithms to match search queries to relevant page links, including both a ranking and text-matching technology. The larger the database of past queries performed by users, the more accurate and timely the delivered search results," Credit Suisse analyst Heath Terry wrote in a research note in November.
"This virtuous cycle of Google attracting more users due to better search results leads to even better responses and more users. We believe that Google will continue to grow their share of the online search market until they have virtually 100% share." Credit Suisse makes a market in Google shares.
Indeed, many investors wrongly believe that Google's search-market dominance derives exclusively from its well-known search technology, its powerful infrastructure and the force of consumer habit.
Compared with companies such as
, "Google is much more vulnerable to competitive entry, because the technology is well-known, server farms are becoming less and less expensive, and users don't become more likely to use Google based on what other users do," John Hussman, president of the Hussman Investment Trust, wrote in a note to clients in October to explain why he doesn't own Google shares.
"Little prevents competitors from gradually sniping market share except the slight neuromotor conditioning created by repeatedly typing the company name."
In fact, far from seeing market share sniped away, Google has only gotten more popular. According to researcher HitWise, it commanded about 65% of the search market in October, up from about 62% a year ago.
Yahoo!, Microsoft and Ask.com, with shares of 21%, 7% and 5%, respectively, all saw their market share slip over the same period.
When it comes to the quality of search results, the power of Google's -- or Microsoft's and Yahoo!'s, for that matter -- algorithms is only part of the story. The amount of data the algorithms have to feed on also plays a critical role in the relevance of search results. Algorithms are fundamentally rooted in statistics and probability, where data plays a crucial role in predicting the quality of future results.
"Algorithms are like saw mills, and saw mills require lumber," says Steve Arnold, of Arnold IT, who consults for a bulge bracket Wall Street investment bank about Google. "Data is the lumber, and the more data you have, the more efficiently you can run the saw mill."
And data is one of the behind-the-scenes reasons for Google's domination of the search market.
The Secret to Better Search Results
Google's data advantage breaks down along two lines. The first is in its ability to crawl the Web and index Web sites. Here, the company's clever computer formulas, massive investment in technology infrastructure and well-documented use of distributed computing -- computer processing where different parts of a program can be run on many computers at the same time -- give it a vital edge.
"Microsoft and Yahoo! have tentacles crawling the Web too, but Google has the longest tentacles of them all," says Riza Berkan, CEO of search engine Hakia.
But how Google is able to deliver better search results as more queries are run on its search engine is key to understanding why data could allow it to extend its dominance.
For starters, Google can create better "information taxonomies" -- or groups of topics -- than other search engines, says Kortikar.
So while Google and a rival search engine may both receive queries for "acne", Google is better able to decipher whether the searcher is a teenager who wants a quick solution or a doctor browsing medical information. This is because Google has more data on prior and related groups of searches by users with similar profiles.
"As you have more data, you are able to move from just the keyword to what the true intent of the search was because you have more context, and that gives you better search results," says Kortikar.
Google's systems are also trained to respond to incoming search queries in real time and to shift computing resources on the fly to match emerging hotspots of requests on the Web. Any news event that causes a spike in query traffic automatically sends more of Google's processing in the direction of the incoming requests, says Arnold.
While Microsoft and Yahoo! also have similar, albeit less powerful, systems in place, Google's ability to deliver fast, relevant and powerful search results is further boosted by the large query volumes coming in, Arnold says.
Its tripwire is set off quicker than that of its competitors thanks to the vast amount of data it sees. "Microsoft and Yahoo! are running Corvettes around the track, and they're pretty happy about that," says Arnold. "But Google has the Ferrari."
A growing share of searches, users and data also gives the company more insight than its competitors have into "dwell time" -- or the amount of time users spend on the Web pages they reach from Google search results -- Arnold points out. That helps the company determine which pages are of more interest to certain types of users, thereby increasing the quality of search results.
Finally -- and somewhat counterintuitively -- Google's push towards increasingly personalized search will be a big beneficiary of the company's growing reams of large-scale aggregate data, says Kortikar. Individual users will benefit because personalized search attempts will lead to customized, rather than generic, results.
Personalized search is a part of major Google products like iGoogle, which was launched in 2006 and has seen one of the fastest adoption rates of any Google product, Google vice president of search products Marissa Mayer said in a prior interview.
It was also the centerpiece of a move announced by the company this fall that would take consecutive user queries in the same browser window into account when determining subsequent search results, rather than treat each query as if it were independent.
Fine Tuning the Search Results
While homing in on each user's preferences is the aim of personalized search, it is the enormous statistical power of such a large data set that allows Google to fine-tune its results so closely, says Kortikar.
Google can cross-reference each user's query patterns and result preferences with those of similar profiles. That allows it to serve up future results that each user finds more relevant to her individual needs. Still, while the appearance is one of narrow personal attention, it is actually the vast amount of large-scale data that allows Google to be so precise.
The same goes for the style and presentation of search results, says Arnold. When Google understands that a user is likely a medical professional instead of an enthusiast, the search engine can skew result pages in the direction of more serious medical journals instead of blogs, for example.
The entire process is automated: Google guesses the type of the user by statistically corroborating behavior among different population groups, and delivers results based on user preferences indicated by the data.
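A toy sketch can make the idea concrete. Nothing here reflects Google's actual systems: the group profiles, click counts, result list and the use of cosine similarity are all assumptions chosen for illustration. The sketch matches one user's clicks against aggregate group behavior, then reorders results by what that group tends to prefer.

```python
import math

# Hypothetical aggregate click counts by source type for two user groups.
GROUP_PROFILES = {
    "medical professional": {"journal": 80, "blog": 5, "retail": 15},
    "general consumer": {"journal": 10, "blog": 50, "retail": 40},
}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse count vectors."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def guess_group(user_clicks: dict) -> str:
    """Pick the group whose aggregate behavior most resembles the user's."""
    return max(GROUP_PROFILES, key=lambda g: cosine(user_clicks, GROUP_PROFILES[g]))


def rerank(results: list, group: str) -> list:
    """Order (title, source_type) results by the group's click preferences."""
    prefs = GROUP_PROFILES[group]
    return sorted(results, key=lambda r: prefs.get(r[1], 0), reverse=True)


# One hypothetical user's click history and a result set for the query "acne".
user_clicks = {"journal": 7, "blog": 1, "retail": 0}
results = [
    ("Acne tips blog", "blog"),
    ("Dermatology journal review", "journal"),
    ("Face wash store", "retail"),
]

group = guess_group(user_clicks)
print(group)                   # medical professional
print(rerank(results, group))  # journal result first
```

The individual user supplies only a handful of clicks; the ordering itself comes from the much larger aggregate profile, which is the point Kortikar and Arnold make.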
Is More Data Better?
Not everyone agrees about the importance of Google's data edge. While conceding that more data is usually better, Doug Leeds, senior vice president of products at Ask.com, maintains that its importance tends to diminish dramatically beyond a certain threshold. And all the major search players are already at the point where they have crossed that threshold, Leeds believes.
"More data does ultimately give you better results, but it doesn't help that much once you get beyond that point of statistical significance," Leeds says. "Once you reach that level, additional data mainly just reinforces the results you would have reached anyway."
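Leeds's argument echoes a textbook statistical result: the uncertainty in an estimated rate shrinks only with the square root of the sample size. The sketch below is an illustration of that general principle, not a description of any search engine's methods, and the 30% click rate is a made-up figure.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion p measured over n samples."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical: 30% of searchers click the top result for some query.
ASSUMED_CLICK_RATE = 0.30

for n in (1_000, 100_000, 10_000_000, 1_000_000_000):
    # Approximate 95% margin of error, in percentage points.
    margin = 100 * 1.96 * standard_error(ASSUMED_CLICK_RATE, n)
    print(f"{n:>13,} queries -> about +/- {margin:.4f} percentage points")
```

Each 100-fold increase in queries narrows the margin only 10-fold, so the absolute gain from extra data keeps shrinking. The same arithmetic also cuts the other way: if the data is split into many small groups of users, each group starts over with a far smaller sample.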
Instead, Leeds says Ask is focused on using its algorithm to create search results that are distinct from Google's and appealing in different types of ways.
Still, data will continue to play an increasingly vital role as the personalization of search gets into full swing. And that's likely to get a big boost from the mobile search market, which is just getting started.
Loaded with information about each user and able to draw on new types of searches like those based on location, mobile devices will likely put a host of new demands on data sets that were previously thought to be more than adequate.
The true might of Google's data, in other words, may not yet have been seen.