As the coronavirus pandemic nears its fourth week of transforming American society into something that would have been unrecognizable at the start of the year, there are many pressing questions on our minds. Chief among them: How long will this go on? How bad will it get? How widespread is the disease in my community? Are people doing enough to manage the spread of COVID-19?
Unfortunately, there aren’t a lot of answers to those questions right now, but we want them anyway. To that end, there are tons of data-based projections and visualizations that attempt to provide some insight. Some of them are useful, but we’re also seeing a tendency to read too much into them, and to make sweeping generalizations that lack the proper context and fail to account for inherent biases.
For example, the New York Times published a story Thursday morning that included a map based on cellphone tower data purporting to track Americans’ recent travel. Specifically, the map showed where in the country average travel per day had fallen below two miles—and where it had not. In a since-deleted tweet, Michael Barbaro, host of the paper’s mega-popular podcast The Daily, summarized the problem as: “In a word … The South.”
A lot of misinterpretation of my (not well-enough contextualized) tweet on this so taking it down.
Will let the map speak for itself, with link to article for greater context: https://t.co/Ak78wdp2um pic.twitter.com/ZsHLyPaJlh
— Michael Barbaro (@mikiebarb) April 2, 2020
People were quick to respond to Barbaro with additional context. For example, if you overlay that map with one of food deserts where supermarkets are scarce, it suggests that these aren’t all leisure trips. But just contrasting maps doesn’t tell the full story, either. Many of the states that show up heavily in red are states where stay-at-home orders were either issued late, or not at all. On the other hand, as Barbaro pointed out, there are food deserts in parts of the country that nonetheless experienced less travel. So what’s really going on?
The answer, mostly, is that we just don’t know. That’s not a satisfying answer, but it’s the most appropriate one when drawing conclusions from broad swaths of information like “where did people take more two-mile trips and where did they take fewer.” There are parts of Texas where it takes you two miles just to get off your own property and onto a road; there are others where pastors are suing to be exempted from local social distancing orders that would prevent them from holding large church services. There are also people whose travel can’t be limited to two miles: in some parts of the country, hospital staff may live a few blocks from their places of work, while in others, they commute back to the suburbs at the end of their shifts.
Yet this is the information people want right now, so there are constant attempts to provide it. Unacast, a Norwegian data insight company, created a map using similar cellphone tower data to offer a “social distancing scoreboard” that looks at the nation at the county level to see where people have reduced the number of trips they make, and by how much. The company’s maps were popular among reporters looking for data to bring to their readers under tantalizing headlines like “These states are nailing social distancing” and “Smartphone data reveal which Americans are social distancing (and not).” But most of these stories dodge the nuances. If someone drives from one county to another, whose score goes up? If an area has a large percentage of people in essential roles, are they doing social distancing poorly? Is the company confident that the F score it gives to a place like Hudspeth County is proof that the 4,408 people who live there are doing a poor job of cutting down their travel? Or does it factor in that it’s hard to get a cellphone signal in a part of Texas that’s bigger than Rhode Island and Delaware put together? In short, does a map based on cellphone data account for regional disparities?
When I asked a Unacast representative for details about how the map calculated trips that began in one county and ended in another, she said, “I don’t have any insights into that,” and didn’t respond to requests for a phone interview. To be fair, there are some general impressions one can glean from their data. Comparing similar counties might help us figure out who’s doing more social isolation. But a model is only as good as the data that goes into it. And if Unacast won’t let us look at the underlying data, their model is suspect.
That’s true even of the most crucial public data surrounding the outbreak. As of Thursday evening, Texas was one of just seven states to have done at least 50,000 coronavirus tests, with 4,669 confirmed cases. Of course, that doesn’t mean that there are fewer than 5,000 cases of COVID-19 in the state. Results are still coming in far too slowly, and even now, according to multiple sources who spoke to Texas Monthly, not everyone who is experiencing symptoms is able to get a test, to say nothing of the people who don’t have health insurance or are otherwise not calling their doctor to say they’re feeling sick. There are also questions about whether the tests themselves are returning accurate results, and even seemingly firm figures like the number of people who’ve died from the virus are probably estimates. In Italy, for example, the official count of coronavirus deaths was likely underestimated by as much as half, as officials didn’t have a way to count those who died at home or weren’t diagnosed with the disease.
We’re all trying to make sense of the world, but one of the challenges of living in a chaotic time is that the world doesn’t make much sense. So let’s focus on what we do know: We know that the rate of confirmed cases in Texas is lower than in most other states that have done similar amounts of testing (or more). We know that people in rural parts of Texas and other Southern states are traveling farther than folks in the North. We know that our fellow Americans in the Northeast are living through a widespread tragedy that is less widespread here. But if you want to know when this time in our lives will end, or whether you and your loved ones will make it through safely, looking to a map of cellphone data made by an analytics company isn’t going to provide the answers.