As big data becomes more sophisticated, the question arises whether analytics tools, like those developed by Google, could someday replace data compiled by federal agencies.
A case in point is the Google Flu Trends website, which could someday become a suitable replacement for much of the work now performed by the Centers for Disease Control and Prevention (CDC).
That day, however, is not likely to come for at least a few years, according to a recent opinion article in the National Journal, although many agree there is great potential in such technology if it can be honed.
A comparison of Google's flu trend graph and the CDC's data shows a discrepancy in findings:
While Google shows the current flu outbreak as the worst of the past six years, the CDC shows that the current outbreak is bad, but not as bad as outbreaks in at least two past years. Furthermore, the CDC graph shows that the outbreak is already on the decline. The discrepancy can be explained by the simplistic nature of Google's data, according to the article.
The CDC uses a combination of data: reports of sickness from many disease control centers, along with adjustments made by public health experts with years of experience. Google's data is based on search queries, which are filtered to remove search noise, but with limited success. The thing that's missing, according to the article? The human factor. And Google admits the tool is still in an early phase of development; it has a long way to go before it can compete with human data analysis.
“We intend to update our model each year with the latest sentinel provider [influenza-like illness] data, obtaining a better fit and adjusting as online health-seeking behavior evolves over time,” Matt Mohebbi, a Google software engineer, recently wrote for Forbes. “With respect to the current flu season, it’s still too early to tell how the model is performing.”
But in the future, we could see a combination of the two, said Lynnette Brammer, a flu epidemiologist with the CDC. While there may never be a substitute for human decision-making, technology could save public health workers a lot of time, she said. “We want the data transmission to be as easy for the people providing it to us as possible,” she said. “But the thing we don’t want is to lose the connection we have with those people. Even if you have really good data coming in, you’re always going to have questions about what it means.”
When comparing the two systems, one primarily run by people and the other by a machine, it comes down to understanding complexity. “It’s really hard, certainly for us at CDC, to understand what’s causing that change,” Brammer said. “They’re seeing pretty much record levels of influenza-like illness. And while ours are high, they’re not at historical limits by any means. We just have a lot more flexibility and ability to track down and ask additional questions and find the answers to those questions.”