Turning to Twitter
Social media can provide some unexpected clues to help predict COVID-19 outbreaks.
By Ramona Dubose
NC State researchers in the Poole College of Management are mining data from Twitter, using keywords that correspond to symptoms, to predict COVID-19 outbreaks. They have also teamed up with the Clinton Health Access Initiative to predict hotspots in sub-Saharan Africa.
In the U.S., the team’s predictions have been accurate more often than those of 24 other models, most of which rely on reports of prior cases and deaths, along with additional medical surveillance and survey results. The predictions are collected and reported by the CDC. Associate Professor William Rand, executive director of the college’s Business Analytics Initiative, leads the team building the models, which analyze how often people mention two or more COVID-19 symptoms in tweets to predict future outbreaks.
The tweets must list at least two words that the World Health Organization has identified as COVID-19 symptoms, such as fever, headache or chill. To improve the search, the researchers included partial spellings and combinations of words, such as “los” AND “taste” or “los” AND “smell,” to pick up phrases like “lost ability to taste” and “losing my sense of smell.”
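The keyword rule described above can be sketched in a few lines of Python. This is only an illustration, not the team’s actual code: the function names, the short keyword list and the choice to count each word combination as a single symptom are all assumptions made for the example, and the real search terms are not given in this article.

```python
import re

# Hypothetical keyword stems; the article names only fever, headache and chill.
SYMPTOM_STEMS = ["fever", "headache", "chill"]

# Hypothetical combination rules: every stem in a tuple must appear
# somewhere in the tweet. Assumption: each combination counts as one symptom.
COMBINATION_RULES = {
    "loss of taste": ("los", "taste"),
    "loss of smell": ("los", "smell"),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def has_stem(tokens, stem):
    """True if any word starts with the stem, so 'los' matches 'lost' and 'losing'."""
    return any(token.startswith(stem) for token in tokens)


def matched_symptoms(text):
    """Return the set of symptom labels found in a tweet."""
    tokens = tokenize(text)
    found = {stem for stem in SYMPTOM_STEMS if has_stem(tokens, stem)}
    for label, stems in COMBINATION_RULES.items():
        if all(has_stem(tokens, stem) for stem in stems):
            found.add(label)
    return found


def is_symptom_tweet(text, minimum=2):
    """Keep a tweet only if it mentions at least `minimum` distinct symptoms."""
    return len(matched_symptoms(text)) >= minimum


if __name__ == "__main__":
    print(is_symptom_tweet("Fever all night and I'm losing my sense of smell"))
    print(is_symptom_tweet("Lost ability to taste"))
```

Matching on word beginnings is what lets a partial spelling like “los” catch both “lost” and “losing,” though it can also catch unrelated words such as “loser,” so a real filter would likely need more careful rules.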
Rand and his team used tweets from all 50 states for their predictions. In comparisons with 24 other models developed at other institutions, NC State’s models most closely predicted outcomes more than 30 percent of the time. The next most accurate models, which included those from Johns Hopkins and UCLA, were correct only about 12 percent of the time.
Two of Rand’s students, Aidan McCarthy and Trevor Ferree ’20, came up with the models at the outset of the pandemic, and they’re optimistic about the potential. “We’ve seen that social media data can provide results as good or better than conventional methods on a limited basis,” McCarthy says. He cautions that the model has not been tested widely, and most of the testing occurred when tweets about COVID-19 symptoms were common.
The partnership with the Clinton Health Access Initiative (CHAI) combines results from Twitter data with information gleaned from LexisNexis (news reports) and PubMed (biomedical and life sciences journal articles) to better understand needs. By “listening” to people’s concerns, organizations like CHAI can “make time-sensitive resource allocation decisions,” says Paul Domanico, senior director of global health sciences at CHAI, “as well as monitor the mental health of nurses and doctors participating in the COVID-19 response.” To monitor mental health concerns, NC State researchers analyzed tweets from people in Kenya who self-identified as health care providers and used keywords such as “depression” or “depressed.”
Until the pandemic, Rand’s research focused on business applications for his models. But the collaboration with CHAI shows that analyzing big data could play a role in improving public health. “If we can apply these analyses to spreading public health knowledge and best practices,” he says, “it could be transformational.”