Google Taps Searches to Track the Flu

November 13, 2008

Google recently turned its analytic eye on how to track the outbreak of infectious diseases using its own search technology. First up, influenza, which is responsible for half a million deaths around the globe each year.

Google found that certain aggregated search queries tend to be very common during flu season each year. It compared these aggregated queries against data provided by the U.S. Centers for Disease Control and Prevention (CDC), and found a very close relationship between the frequency of these search queries and the number of people who are experiencing flu-like symptoms each week. As a result, by tallying each day's flu-related search queries, Google can estimate how many people have a flu-like illness. Based on this discovery, it launched Google Flu Trends, where end users can find up-to-date estimates of influenza activity for each of the 50 U.S. states.
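To make the idea concrete, here is a minimal sketch of how query frequency could be compared against surveillance data and then used to produce an estimate. It is an illustration only, not Google's actual model: the numbers are made up, the names `query_share`, `ili_percent`, `fit_line`, `pearson` and `estimate_ili` are hypothetical, and a plain least-squares line stands in for whatever Google uses internally.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up weekly figures, for illustration only.
# query_share: percent of all searches that were flu-related that week.
# ili_percent: percent of doctor visits for flu-like illness, as a
#              surveillance agency might report it a week or two later.
query_share = [0.10, 0.15, 0.22, 0.30, 0.27, 0.18]
ili_percent = [1.1, 1.8, 2.3, 3.3, 2.8, 2.1]

slope, intercept = fit_line(query_share, ili_percent)
r = pearson(query_share, ili_percent)


def estimate_ili(share):
    """Estimate flu-like illness from the current flu-related query share."""
    return slope * share + intercept


print(f"correlation: {r:.3f}")
print(f"estimate for a 0.25% query share: {estimate_ili(0.25):.2f}%")
```

Once the line is fitted on past weeks, a new estimate needs only the latest query counts, which is why such a system could report sooner than one that waits for clinical data.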

Google software engineers Jeremy Ginsberg and Matt Mohebbi write, "The CDC does a great job of surveying real doctors and patients to accurately track the flu, so why bother with estimates from aggregated search queries? It turns out that traditional flu surveillance systems take 1-2 weeks to collect and release surveillance data, but Google search queries can be automatically counted very quickly. By making our flu estimates available each day, Google Flu Trends may provide an early-warning system for outbreaks of influenza."

For epidemiologists, this is an interesting development, because early detection of a disease outbreak can reduce the number of people affected. If a new strain of influenza virus emerges under certain conditions, a pandemic could follow and cause millions of deaths. Google's up-to-date influenza estimates may enable public health officials and health professionals to better respond to seasonal epidemics and pandemics.

Google's preliminary results had a strong correlation with real CDC surveillance data for the 2007-2008 flu season, though Google calls its system "experimental."

Ginsberg and Mohebbi continue, "We couldn't have created such good models without aggregating hundreds of billions of individual searches going back to 2003. Of course, we're keenly aware of the trust that users place in us and of our responsibility to protect their privacy. Flu Trends can never be used to identify individual users because we rely on anonymized, aggregated counts of how often certain search queries occur each week. The patterns we observe in the data are only meaningful across large populations of Google search users."

Google suggests that those who would like to avoid getting the flu should receive a vaccination.