Tweets dish up how states eat, exercise

A map showing the caloric balance by state as reported on Twitter.

What does Twitter say about your diet? According to an analysis of 50 million tweets, Mississippi loves cake, Virginia can’t get enough bacon, and Colorado likes chocolate bars. And how do we burn off those calories? Judging by those same tweets, Colorado runs a lot, Virginia swims a bit and Mississippi likes to dance.

A new online, interactive instrument built by researchers at the University of Vermont is using Twitter to count how many calories Americans consume and expend. Their tool, dubbed the Lexicocalorimeter, looks through tweets for food and exercise-related words, like doughnut and treadmill, and runs them through a basic algorithm that ranks the words by their frequency and caloric implications.

The algorithm then uses a simple ratio of calories consumed to calories burned to calculate each state’s caloric balance. On the basis of tweets made from the continental states during 2011 and 2012, the researchers found that Mississippi expended the fewest calories, with Colorado burning the most.
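For readers curious how such a tally might work, here is a minimal sketch in Python. It is not the researchers' code: the word lists, the calorie values and the function name `caloric_balance` are all hypothetical, chosen only to illustrate counting food and exercise words and dividing calories consumed by calories burned.

```python
from collections import Counter

# Hypothetical calorie values per mention; illustrative only,
# not the figures used by the Lexicocalorimeter.
FOOD_CALORIES = {"doughnut": 250.0, "bacon": 45.0, "cake": 350.0}
ACTIVITY_CALORIES = {"treadmill": 300.0, "running": 400.0, "dancing": 200.0}


def caloric_balance(tweets):
    """Return calories consumed divided by calories burned, or None
    if no exercise words appear (the ratio would be undefined)."""
    # Count every word across all tweets, lowercased and with
    # surrounding punctuation stripped.
    counts = Counter(
        word.strip(".,!?#@\"'").lower()
        for tweet in tweets
        for word in tweet.split()
    )
    # Weight each matched word by its assumed calorie value.
    calories_in = sum(counts[w] * c for w, c in FOOD_CALORIES.items())
    calories_out = sum(counts[w] * c for w, c in ACTIVITY_CALORIES.items())
    if calories_out == 0:
        return None
    return calories_in / calories_out


sample = ["Bacon and a doughnut for breakfast", "Running before work!"]
print(caloric_balance(sample))
```

In this toy version, a ratio above 1 would suggest a state tweets about more calories going in than coming out. The real tool presumably handles phrases, ambiguous words and far larger word lists.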

(Arkansas fared little better than Mississippi, ranking just above that state and Louisiana overall, with chocolate candy, butter, sunflower seeds, doughnuts and cheese dip at the top of the Twitter food count and with running, eating, dancing and walking as the top ways Arkansans expend calories.)

“In many of the states where obesity rates are the highest, the calories being consumed is a lot higher than the calories being burned,” says Chris Danforth, an applied mathematician and assistant professor at the University of Vermont. But while the algorithm’s results are in line with public health data, Danforth acknowledges that Twitter represents a limited sample size — so the Lexicocalorimeter is no replacement for public health surveillance.

“We certainly don’t know how long they’re running or how many hot dogs they’re eating, but from a higher level looking down on Earth you can see what’s going on with people’s health,” Danforth says. He likens the Lexicocalorimeter to early versions of Google Flu Trends, a service from Google that estimates influenza activity based on Google searches of terms like “flu,” “cold” and “sick.” Google Flu Trends, while nowhere close to predicting influenza outbreaks, has drawn interest from public health authorities.

Like Google Flu Trends, the Lexicocalorimeter’s algorithm has been calibrated to eliminate false positives. The word “apple,” for example, can mean more than just the food. “If it’s a food usage, we assign a calorie for it. If it’s the company, we don’t,” explains Danforth, who together with Peter Sheridan Dodds and an interdisciplinary team published an early (and not yet peer reviewed) version of a study explaining the Lexicocalorimeter on the scientific preprint site arXiv.

“Twitter is really useful for learning what people are talking about and what people are doing,” says Mark Dredze, a researcher at Johns Hopkins University in Baltimore. He studies social media and health and was not involved in the study.

“Exploring that is the first stage,” Dredze says. “The second stage is developing better algorithms for the types of questions being asked in public health and determining who in public health will benefit from this information.”

Fine-tuning these algorithms is key to improving large-scale analysis of social media, whether the goal is to measure the caloric content of a tweet or to find the next developing news story. These technologies represent new ways of finding and understanding the conversations we’re having as a country — chatter that is increasingly moving online.

And developing tools like the Lexicocalorimeter is just plain fun. “We can make maps of the U.S. based on how often people talk about rock climbing or eating kale or bacon,” says Danforth. “It’s a way to explore our culture.”
