In 2015, as the mosquito-borne virus Zika quickly spread through the Americas, travel bans and quarantines were issued, along with calls to cancel the 2016 Olympics in Brazil. As the World Health Organization declared an international public health emergency, governments in affected countries needed a way to accurately predict the rates and locations of new infections. Because only 20 percent of Zika cases are symptomatic, most infections go unreported, which makes the spread of the virus particularly difficult to predict.
In January 2016, the team at Northeastern University’s MoBS lab, with the support of the Center for Inference and Dynamics of Infectious Diseases, started the Zika Modeling Project to help public authorities and researchers better understand the evolution and spread of the virus.
“With the use of big data and massive computing power, we hope to help researchers and public health officials.” Matteo Chinazzi, Associate Research Scientist, Northeastern University
GCP: Providing essential prediction tools, analytic tools and more
Using a mathematical and computational approach powered by Google Cloud Platform (GCP), the team has studied different scenarios under which Zika could spread, projecting its impact on affected populations. The model is based on the initial spread of Zika in Brazil, where the virus broke out in 2015. The researchers are now able to predict the impact of new infections in other locations by introducing additional data layers, including temperature, number of mosquitoes, population size and people’s travel patterns.
GCP allows the team to run many simulations in parallel, and to analyze the terabytes of data generated by the modeled scenarios. “We use several GCP products,” says Matteo Chinazzi, Associate Research Scientist at Northeastern University. “Google Cloud Storage stores all of our modeling data and hosts the website. Google Compute Engine (GCE) and Preemptible Virtual Machines run the simulations of the disease’s spread. Google BigQuery examines the simulated scenarios, each of which involves variables such as dates and infection numbers. So far, we’ve churned through a tremendous amount of data, hundreds of terabytes in all. Google Cloud Storage stores all of it.”
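To make the analysis step concrete, here is a minimal Python sketch of the kind of per-date summary one might compute across many simulated runs of a scenario. It is an illustration only, not the MoBS team’s actual BigQuery analysis: the function name `summarize_by_date`, the row layout (simulation id, date, new infections) and the sample numbers are all invented for this example.

```python
from collections import defaultdict
from statistics import median


def summarize_by_date(rows):
    """Group simulated infection counts by date and summarize each group.

    `rows` is an iterable of (simulation_id, date, new_infections) tuples.
    This layout is a hypothetical stand-in for simulation output.
    """
    by_date = defaultdict(list)
    for _simulation_id, date, new_infections in rows:
        by_date[date].append(new_infections)

    summary = {}
    for date in sorted(by_date):  # ISO date strings sort chronologically
        counts = by_date[date]
        summary[date] = {
            "runs": len(counts),
            "min": min(counts),
            "median": median(counts),
            "max": max(counts),
        }
    return summary


# Made-up output from three simulated runs over two dates.
rows = [
    (1, "2016-02-01", 120),
    (2, "2016-02-01", 95),
    (3, "2016-02-01", 140),
    (1, "2016-02-08", 210),
    (2, "2016-02-08", 180),
    (3, "2016-02-08", 260),
]

summary = summarize_by_date(rows)
for date, stats in summary.items():
    print(date, stats)
```

At the scale described here, the same grouping would be expressed as a query in a data warehouse rather than run in memory, but the idea is the same: many independent runs are reduced to a range of likely outcomes for each date.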
Getting results to move quickly at scale
With GCE and Preemptible Virtual Machines, MoBS has run more than 10 million simulations. GCE and BigQuery have drastically reduced the time needed to perform simulations and analyze data: both processes now take hours rather than weeks. “We have the flexibility to scale up to several thousand independent virtual instances in parallel,” says Chinazzi, “so we can generate a full analysis for a single epidemic scenario, which may consist of up to 250,000 independent simulations, in less than a day.”
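Because the simulations are independent, a scenario can be divided evenly among however many instances are available. The Python sketch below shows one simple way to do that split. The helper `split_batch` and the figure of 2,000 workers are assumptions made for illustration (the article says only “several thousand” instances); this is not the team’s actual scheduling code.

```python
def split_batch(total_runs, workers):
    """Split run indices 0..total_runs-1 into `workers` contiguous ranges.

    Range sizes differ by at most one, so no worker is given noticeably
    more simulations than another.
    """
    if total_runs < 0 or workers <= 0:
        raise ValueError("total_runs must be >= 0 and workers must be > 0")

    base, extra = divmod(total_runs, workers)
    ranges = []
    start = 0
    for worker in range(workers):
        # The first `extra` workers each take one additional run.
        size = base + (1 if worker < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


# 250,000 simulations (from the article) over a hypothetical 2,000 workers.
chunks = split_batch(250_000, 2_000)
print(len(chunks), len(chunks[0]))
```

Since no run depends on another, a range that is interrupted, for example when a preemptible instance is reclaimed, can simply be run again without affecting the rest of the scenario.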
In addition to enabling researchers to understand the spread of Zika, this model may become a template for analyzing other epidemics, such as dengue. Although the World Health Organization no longer classifies Zika as an international emergency, there is still work to be done in preventing outbreaks of mosquito-borne diseases. With the use of big data and massive computing power, the team at MoBS hopes to help researchers and public health officials achieve that.
“Time is vital when confronting disease outbreaks,” says Chinazzi, “and GCP gives us the tools we need to move quickly at scale.”
To read more about the Zika research and analysis conducted by MoBS Lab, see “Spread of Zika virus in the Americas,” published in the Proceedings of the National Academy of Sciences of the United States of America.