Monday, November 17, 2008

Google Flu Data Rivals CDC

Last week, internet search giant Google released Google Flu Trends, a tool that tracks possible flu outbreaks by compiling data on how often people use flu-related search terms such as "flu symptoms" and "chest congestion". Google Flu Trends provides raw data, but not context. Each year, close to 100 million Americans search for health information online, but not everyone who searches for health information is injured or ill. Since I write about medicine, for example, I usually search for health information for my writing projects, not for personal reasons. Other searches might be done out of idle curiosity, or even result from a keystroke error in the search bar.

Is Google Flu Trends just another odd little Google project that its employees tinker with at the Googleplex in Mountain View in between running the search engine and scanning in every book ever written? Apparently not. Google mapped five years' worth of its flu data against flu data from the Centers for Disease Control and Prevention (CDC), which the agency compiles from health care providers, emergency room visit statistics, and other sources. Data from Google correlated closely with CDC data, often signaling flu outbreaks a week or two before the CDC reported them. Google will publish a paper on its methodology in an upcoming issue of Nature.
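For readers curious what "correlated closely, a week or two ahead" could look like in practice, here is a minimal sketch of a lagged correlation check. It is not Google's methodology, which has not been published yet. The function names (`pearson`, `lagged_correlation`, `best_lag`) and the weekly numbers are invented purely for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(search, official, lag):
    """Correlate search[t] with official[t + lag], where lag >= 0 is in weeks."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(search, official)
    # Drop the last `lag` search weeks and the first `lag` official weeks
    # so that each search week is paired with the official week `lag` later.
    return pearson(search[:-lag], official[lag:])


def best_lag(search, official, max_lag):
    """Return (lag with the highest correlation, all scores by lag)."""
    scores = {
        lag: lagged_correlation(search, official, lag)
        for lag in range(max_lag + 1)
    }
    return max(scores, key=scores.get), scores


# Made-up weekly counts, NOT real Google or CDC data. The "official"
# series is simply the search series delayed by two weeks.
search_volume = [1, 2, 4, 8, 5, 3, 2, 1, 1, 1, 1, 1]
official_cases = [1, 1, 1, 2, 4, 8, 5, 3, 2, 1, 1, 1]

lead_weeks, scores_by_lag = best_lag(search_volume, official_cases, max_lag=3)
print(lead_weeks, scores_by_lag)
```

If the search series really does lead the official one, the correlation should peak at a lag greater than zero. A real analysis would also have to deal with the context problem described earlier, since not every flu-related search comes from someone who is sick.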

Health care researchers already search for trends in the anonymized electronic medical records (EMRs) that some practices use to record patient medical data and prescriptions. More digital data will become available as EMRs become more common, especially since the federal government will begin providing financial incentives for Medicare providers to adopt e-prescribing in 2009.

For various reasons, however, many patients are not entirely honest with their doctors about their symptoms and medical concerns, a problem health care providers have struggled with for years. A patient might be embarrassed about a medical problem, forget to mention a symptom, or simply not realize that a symptom is significant. As a result, search engine data might provide an even larger, and potentially more accurate, data pool than EMRs to indicate the actual incidence of conditions such as pre-diabetes or early heart disease. Public health officials could then use the data to create more effective screening and prevention campaigns.