Guest Column | July 27, 2020

For Effective Contact Tracing, Epidemiologists Must Embrace Advanced Analytics

By Steve Bennett, Ph.D., SAS

The year was 1854 and a cholera epidemic was ravaging London. The prevailing theory among physicians was that cholera spread through the air. But London physician John Snow wasn’t so sure. He suspected the disease was transmitted via water.

So, he decided to use data to investigate. Snow went door to door in a particularly hard-hit neighborhood, interviewing victims’ families, counting the cholera deaths in each location, and placing them on a (now very famous) map.

The data made it clear that the deaths were clustered around a single water pump that the victims and their families used.

Snow tested his theory by removing the handle on the pump, taking it out of service. This simple act effectively stopped the outbreak, proving the connection between the disease and the water from the pump.

With that discovery, modern epidemiology was born. Snow was the first to use geolocation and maps to spatially analyze disease transmission. He was the first to interview and compile detailed patient disease records. And he was the first to synthesize that information and take public health action to mitigate or slow the spread of disease.

Modernizing Epidemiology

Today, these same approaches are needed more than ever in the fight against COVID-19. Contact tracing is the modern version of what Snow was doing more than 160 years ago – interviewing patients and their close contacts, gathering and synthesizing data, and using it to limit further spread.

I spoke with former epidemiologists, now working in analytics and tech, who have experience countering infectious disease outbreaks. They all believe, just like John Snow, that data and analytics – particularly advanced analytics like artificial intelligence and machine learning – can play a crucial role in the overall health and policy response to today’s global pandemic.

“COVID-19 has brought the public health side of epidemiology back to the forefront,” said Mark Morreale, a Global Academic Program Manager at SAS. Morreale is a former epidemiologist who worked on initiatives to measure and limit the spread of Ebola, SARS, and other diseases. He feels COVID-19 has revealed the need for new approaches to studying, investigating, and analyzing population health.

“Analytical methods have accelerated at a blistering pace, but traditional epidemiology has been slow to accept them. Epidemiologists need to move beyond inference and causation and embrace new predictive methods like machine learning to help plan and act during a pandemic.”

Transforming Contact Tracing

The need for modernization becomes clear when looking at contact tracing, which operates much as it did a century ago. From smallpox to HIV to SARS, contact tracing has proven effective in slowing disease transmission. But it requires substantial resources and personnel: the CDC estimates that in the U.S. alone, an additional 100,000 contact tracers are needed to identify and quarantine exposed people.

A study from the NIH stated that manual contact tracing on the scale needed for COVID-19 is imperfect and recommended new technology-based approaches to assist in identifying contacts. Since traditional contact tracing is often done in person or over the phone, accuracy can be a challenge. Patients may not remember where they’ve been or with whom they’ve interacted. Or they may hold back information for fear of stigmatization.

“Traditional contact tracing is very personal,” said Pamela Hipp, Sr. Analytical Consultant at SAS. Hipp worked as an epidemiologist on public health initiatives in California, fighting outbreaks of food-borne illness, H1N1, and measles. “People talk about things they might not want to reveal. That’s why the most effective contact tracers are great listeners.”

Compounding the accuracy issue are the unknown elements of COVID-19, such as asymptomatic transmission rates. Ian Kramer is a Sr. Industry Consultant at SAS and former epidemiologist who worked on foodborne diseases, novel H1N1 influenza, and healthcare-associated infections. He thinks that if we had understood the disease better earlier on, our quarantine efforts could have been more targeted. “The lack of data about COVID-19 required us to take broad approaches.”

In other words, every minute counts. So does every fact. Those are areas where analytics shines. It can quickly transform massive amounts of data into a clearer picture that helps healthcare workers and scientists measure and act on fluid situations.

And in addition to accuracy, the scale of the COVID-19 contact tracing challenge is unprecedented for public health agencies. “All that data needs to be managed and reconciled. Analytics has a big role in the sheer volume of contact tracing for COVID-19,” said Hipp.

Understanding Population Movement And Protecting Privacy

Since the pandemic began, technology companies and organizations have focused on modernizing contact tracing by providing health organizations, governments, and policymakers access to analytics, data management, and data visualization capabilities.

But beyond using analytics to augment and scale contact tracing, these organizations have been applying it to give governments and public health agencies additional insights. Smartphone location and movement data are supporting public health decision making through an analytics-derived understanding of large-scale population movements, interactive network building, and intelligent alerting.
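To make "network building" concrete, here is a minimal sketch in Python. It assumes contact reports arrive as simple pairs of anonymized person IDs; the names build_contact_graph and contacts_within, and the sample IDs, are hypothetical and do not come from any particular product. It shows how reported contacts can be linked into a network and searched outward from an index case.

```python
from collections import defaultdict, deque


def build_contact_graph(contact_pairs):
    """Link each reported pair of people in both directions."""
    graph = defaultdict(set)
    for a, b in contact_pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def contacts_within(graph, index_case, max_hops):
    """Return {person: hops} for everyone within max_hops of index_case."""
    hops = {index_case: 0}
    queue = deque([index_case])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not look beyond the requested distance
        for neighbor in graph.get(person, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[person] + 1
                queue.append(neighbor)
    del hops[index_case]  # the index case is not their own contact
    return hops


# Hypothetical contact reports gathered from interviews or app data.
reported_contacts = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5")]
graph = build_contact_graph(reported_contacts)
exposed = contacts_within(graph, "P1", max_hops=2)
```

In this toy example, P2 and P5 are direct contacts of P1, P3 is a contact of a contact, and P4 falls outside the two-hop limit. A real system would also weigh timing, duration, and proximity, which this sketch leaves out.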

Yet with great data comes great responsibility, particularly when it comes to smartphone location data, which is why these technologies must include features like masking and de-identification to help ensure data privacy.
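As a rough illustration of what masking and de-identification can look like in practice, the Python sketch below replaces a device identifier with a keyed hash and rounds coordinates to a coarser grid. The record layout, the function names, and the two-decimal rounding are assumptions made for this example, not a description of any specific vendor's approach.

```python
import hashlib
import hmac


def pseudonymize(device_id, secret_key):
    """Replace a device ID with a keyed SHA-256 hash (HMAC)."""
    digest = hmac.new(secret_key, device_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def coarsen(lat, lon, decimals=2):
    """Round coordinates so an exact address cannot be read off the record."""
    return round(lat, decimals), round(lon, decimals)


def deidentify(record, secret_key):
    """Return a copy of the record with the raw ID dropped and location blurred."""
    lat, lon = coarsen(record["lat"], record["lon"])
    return {
        "pseudonym": pseudonymize(record["device_id"], secret_key),
        "lat": lat,
        "lon": lon,
    }


# Hypothetical location ping; the key would be held only by the data custodian.
secret_key = b"example-key-kept-by-the-health-agency"
ping = {"device_id": "device-123", "lat": 35.78432, "lon": -78.63819}
masked = deidentify(ping, secret_key)
```

Rounding to two decimal places blurs a location to roughly a kilometer. Pseudonymized data can still be re-identified when combined with other sources, so a step like this complements, rather than replaces, clear policies on access and use.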

“The world is much more data-focused than it was even ten years ago,” says Kramer. “As data sources expand, there will clearly be concern about collecting, connecting, and storing all those data sets. Clear privacy protections and appropriate use cases are a must.”

There are approximately 275 million smartphones in the U.S. These offer a powerful resource for tracking population movements and coronavirus risks and exposure rates.

But first, the public needs to accept the idea. And it’s a hard sell given that many people are wary of how their personal data is used. A recent Washington Post-University of Maryland poll found that 50 percent of smartphone users said they would not use Apple or Google’s new contact tracing app, and only 43 percent said they trusted the big tech companies with their information.

There have been legitimate reasons to be concerned about how private data is used. But the lack of trust goes deeper than that.

“Decreasing trust in public institutions is going to be a big challenge,” says Hipp. Trust in institutions is waning for myriad reasons. Some of it is driven by conspiracy thinking and disinformation, some by politics. And for some minority populations, the lack of trust is rooted in feelings of systemic marginalization.

Still, there are some bright spots. According to Pew Research, 43 percent of the public has a great deal of confidence in medical scientists to act in the best interests of the public (an increase of 8 percentage points from before the outbreak). And 79 percent of U.S. adults express a favorable opinion of the CDC.

“As devastating as COVID-19 has been, it’s had an unexpected benefit on the field of epidemiology,” says Hipp. “Think of how epidemiologists have influenced decisions about stay-at-home orders and social-distancing measures and have tempered misinformation and rumor with science and logic. The ripple effects will be greater participation in and support for epidemiology, which will make for more informed and safer communities.”

Moving forward, as the need for data-driven health insights grows, incorporating more analytics at the very large scale (population movement), at the very local scale (contact tracing), and across epidemiology as a whole will be essential for accurately understanding and limiting the spread of infectious diseases.

“To me, analytics and epidemiology are very intertwined, going back to the geocoding of cholera data by John Snow in 1854,” says Kramer. “The foundation of epidemiology is data analytics and data-driven decision making for the public's health.”

Morreale puts it another way: “Epidemiologists are the OG data scientists.”

Ultimately, analytics and technology will support – not supplant – the role of contact tracers and epidemiologists. But as the pandemic continues to surge in new locations and economic uncertainty remains, acting on data-driven insights is more essential than ever.

About The Author

Steve Bennett, Ph.D., is the director of SAS' global public sector practice. He is the former director of the National Biosurveillance Integration Center within the U.S. Department of Homeland Security, where he led the design and application of quantitative analysis to inform key United States security decisions. A scientist by training, Bennett holds a doctorate in Computational Biochemistry from Stanford, as well as undergraduate degrees in Biology and Chemistry from Caltech.