For much of its history, Google has been a widely admired company that could seemingly do no wrong. But in recent years, some observers have cast a suspicious eye at the search giant. From censoring content in China to accusations of invading user privacy at the behest of the U.S. government, the company with the motto "Don't be evil" has lost some of its luster.
If its image has been tarnished, much of the blame stems from Google's ability to intimately track users' Web browsing habits. Though Google is by far the most popular site for searching the Web, users are growing more uncomfortable with the notion they may be under the lens of Google's microscope.
But what if Google could use its considerable power for good? The company will tell you that's what it's always done. If you want proof, look no further than Flu Trends, a remarkably simple service Google devised to help the nation's health officials get an upper hand during flu season.
If advertisers can determine your shopping trends based on Web searches, health officials should be able to monitor health trends the same way. That's the underlying, albeit simplified, rationale behind "Detecting influenza epidemics using search engine query data," a paper that appeared in the November 2008 issue of Nature. The authors - Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel, Mark S. Smolinski and Larry Brilliant of Google and Lynnette Brammer of the Centers for Disease Control and Prevention (CDC) - analyzed years of search terms and concluded they could develop a model to quickly identify influenza outbreaks.
"By processing hundreds of billions of individual searches from five years of Google Web search logs, our system generates more comprehensive models for use in influenza surveillance, with regional and state-level estimates of ILI (influenza-like illness) activity in the United States," they wrote.
The authors gathered historical logs of Google search queries from 2003 to 2008. From that data they developed a formula to track the occurrence of common search queries among the 50 million most common searches in the U.S. during that time. The formula was then further refined to narrow the query tracking to ILI-related searches. The resulting search trends were then compared to the data gathered by the CDC across its nine public health regions. The CDC's influenza-surveillance data is gathered by 1,500 doctors who report to the CDC on 16 million annual physician visits concerning ILI - a process that can take several weeks. It turned out that the researchers' Web query analysis produced trends similar to those reported by the CDC.
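The article does not spell out the formula, but the general idea of relating a search-query fraction to a physician-visit percentage can be illustrated with a small Python sketch. It assumes a straight-line relationship between the log-odds of the two quantities. The weekly numbers and the names query_fraction, ili_visit_share and estimate_ili are made up for illustration and are not taken from Google or the CDC.

```python
import math

def logit(p):
    """Log-odds of a proportion p, where 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical weekly values, invented for this example only.
# query_fraction: share of all searches that are flu-related.
# ili_visit_share: share of physician visits that concern ILI.
query_fraction = [0.0010, 0.0015, 0.0030, 0.0060, 0.0040, 0.0020]
ili_visit_share = [0.010, 0.014, 0.026, 0.048, 0.033, 0.018]

# Fit the line on the log-odds scale (an assumption of this sketch).
b0, b1 = fit_line(
    [logit(q) for q in query_fraction],
    [logit(p) for p in ili_visit_share],
)

def estimate_ili(q):
    """Estimate the ILI visit share from a new query fraction q."""
    return inv_logit(b0 + b1 * logit(q))

print(estimate_ili(0.005))
```

Once the two coefficients are fitted against historical surveillance data, a new week's query fraction can be turned into an estimate as soon as the searches are counted, which is where the speed advantage over physician reports would come from.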
"Google Web search queries can be used to estimate ILI percentages accurately in each of the nine public health regions of the United States," according to the authors. "Because search queries can be processed quickly, the resulting ILI estimates were consistently one to two weeks ahead of CDC ILI surveillance reports. The early detection provided by this approach may become an important line of defense against future influenza epidemics in the United States, and perhaps eventually in international settings."
The authors are quick to note, however, that their model is not intended to replace the sort of on-the-ground surveillance conducted by the CDC. Instead, Google Flu Trends is designed to help public health officials spot an outbreak in its earliest stages. "This system is not designed to be a replacement for traditional surveillance networks or supplant the need for laboratory-based diagnoses and surveillance. Notable increases in ILI-related search activity may indicate a need for public health inquiry to identify the pathogen or pathogens involved. Demographic data, often provided by traditional surveillance, cannot be obtained using search queries," the authors said.
"In the event that a deadly strain of influenza emerges, accurate and early detection of ILI percentages may enable public health officials to mount a more effective early response."
They also point out that, during the process of tracking queries, no personal information is recorded, nor are user IP addresses or users' specific physical locations.
You can see just how accurate the gathered data is at www.google.org/flutrends. With the formulas in place, Google engineers can show flu trends just as easily as they show webmasters their sites' analytics. When the data is charted, the results are strikingly similar to those found by the CDC's surveillance system. In fact, from 2004 through 2008, the flu activity levels reported by Google and the CDC are almost identical. The Google numbers skew slightly higher, but that can be attributed to people searching the Web for flu information when they don't actually have the flu.
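One common way to put a number on how closely two charted series track each other is the Pearson correlation coefficient. The Python sketch below is only an illustration of that calculation. The two weekly series, search_estimate and cdc_reported, are invented for the example (with the search-based figures set slightly higher, as described above) and are not actual Google or CDC data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical weekly ILI percentages, invented for this example only.
search_estimate = [1.2, 1.6, 2.9, 5.1, 3.6, 2.0]
cdc_reported = [1.0, 1.4, 2.6, 4.8, 3.3, 1.8]

r = pearson(search_estimate, cdc_reported)

# Average amount by which the search-based figures exceed the reported ones.
mean_gap = sum(s - c for s, c in zip(search_estimate, cdc_reported)) / len(cdc_reported)

print(r, mean_gap)
```

A coefficient close to 1 means the two series rise and fall together, even if one sits consistently above the other, which is why a separate figure such as the average gap is needed to describe the skew.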
So what search terms give hints there may be a flu outbreak on the way? According to Google spokeswoman Katy Bacon, it could be something as mundane as "thermometer." When taken together, such search terms can give vital, advance notice to health officials.
"Maybe you're [searching] for where you can buy a thermometer or what the best chest congestion remedy is, or things like that," she explained. "By tracking the popularity of certain Web search queries, we can accurately estimate the level of flu in each state in near real time. The reason this is important is early detection is critical to helping health officials respond quickly. That's why the CDC tracks the disease. But Flu Trends can help inform the public and officials about flu levels one or two weeks before the traditional surveillance system."
With Flu Trends helping to inform the public about influenza, the obvious question is whether these sorts of analytics can be applied to fight other outbreaks.
"We have a product called Google Trends that lets you track the popularity of specific search queries," Bacon said. "I know the team is excited about where they can go next. But for right now they're just focused on making sure Flu Trends continues to work."
The team of researchers who gathered the data for Flu Trends wants to expand the capability to regions with inadequate medical care. They believe the tool can be particularly useful in developing nations.
"We hope to extend this system to enhance global influenza surveillance, especially in areas that currently lack the necessary resources, including laboratory diagnostic capacity," the researchers said. One problem, of course, is that many of the areas that could most benefit from this data have limited Internet access. But as Internet access continues to spread, Google is hoping Flu Trends will help ensure the flu doesn't.