A New England Journal of Medicine paper implies that 4,645 Puerto Ricans may have died because of Hurricanes Irma and Maria. Using the actual number of deaths, my best guess is that the hurricanes may have caused 685 deaths. This article explains why Maria did not kill 4,645 Puerto Ricans in 2017.
The authors of the NEJM paper correctly note that the 95 percent confidence interval is 793 to 8,498. The estimate of 4,645 is simply the midpoint between the two numbers. (The actual midpoint is 4,645.5, but fractional deaths are not allowed by the laws of nature.) My mean estimate is below their lower bound, indicating serious statistical problems. In fact, my confidence interval runs from 264.32 to 1,106.07. That barely overlaps the lower end of theirs.
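The interval arithmetic above is easy to check. The short Python sketch below recomputes the midpoint of the NEJM interval and the region where the two intervals overlap. The function names are illustrative only and do not come from either analysis.

```python
def midpoint(low, high):
    """Midpoint of an interval."""
    return (low + high) / 2


def overlap(a, b):
    """Overlapping part of two (low, high) intervals, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


nejm_ci = (793, 8498)           # 95 percent interval reported in the NEJM paper
article_ci = (264.32, 1106.07)  # interval reported in this article

print(midpoint(*nejm_ci))            # 4645.5
print(overlap(nejm_ci, article_ci))  # (793, 1106.07)
```

The overlap covers only the bottom of the NEJM range, which is what "barely overlaps" means here.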
The NEJM Study
Hurricane Maria made landfall in Puerto Rico on September 20, 2017. The island had already been hit hard by Hurricane Irma about two weeks earlier. The authors gathered their data by surveying 3,299 households in Puerto Rico between January 17 and February 24, 2018. These dates are important because respondents were only asked about the period September 20 through December 31. This is the first problem with their study. Why did the authors assume Maria-related deaths ceased starting January 1, 2018? (It turns out they were probably correct, but they don’t give a reason for using the December 31, 2017 cutoff.)
The paper is “Mortality in Puerto Rico after Hurricane Maria” (published online at NEJM.org May 29, 2018). The estimate of 4,645 is in sharp contrast to the official estimate of 64 deaths. The Puerto Rico government has admitted this number is too low and has commissioned a study by the Milken Institute School of Public Health at George Washington University to come up with a better figure.
The First Statistical Issue
Despite having access to monthly death statistics from 2010 through 2017, the authors chose not to do some simple statistical work. In fact, after looking at actual death statistics, it appears they should have asked only about September and October 2017. And, despite having those eight years of monthly mortality data, they used only 2016 for comparison. Later I’ll show the results from using the average per month from 2010 – 2016.
The authors are careful to use the phrase “excess deaths” meaning deaths above what they would have expected. While the study implies the deaths were caused by the hurricane, that conclusion can’t be drawn from their research.
Data Collection Methodology
The authors used a team of doctoral students in clinical psychology from local universities who had been “part of earlier outreach operations, and were familiar with the terrain and the mental health issues communities may be facing. All enumerators received training, and group-wide debriefs were conducted at regular intervals.” They questioned a total of 3,299 randomly chosen households. For multi-person households, the survey administrator asked one member questions about all the others. While this practice is used, for example, in the U.S. Current Population Survey to calculate U.S. unemployment statistics, it smacks of reliance on anecdotal evidence here. The respondents were undoubtedly still suffering from post-traumatic stress. It’s easy to believe that the data gathered may not have been very accurate.
The authors report data on 9,522 individuals. But a closer look at the data reveals only 56 reported deaths among those 9,522 people, a sample too small to produce meaningful statistics. Even those 56 deaths are more than the 19 deaths respondents actually said were caused by Maria. In all fairness, the authors report the very large confidence interval mentioned earlier (793 to 8,498). Let’s see where those numbers come from. (The authors posted their dataset on GitHub. I relied on their data for much of this work.)
Causes of Death
The authors coded deaths with eleven possible causes. Twelve deaths are coded “cause of death unrelated to the hurricane.” Another seven are coded “other reason.” A further 18 are coded “– – –”, a code that is not listed in the authors’ data dictionary. Adding these up, we see 37 of the 56 deaths cannot be attributed to Maria. Only 19 deaths were “caused” by the hurricane.
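The tally can be reproduced in a few lines. The counts below are taken from the paragraph above. The dictionary keys are shorthand labels, not the authors' actual code values.

```python
# Deaths that cannot be attributed to Maria, by how they were coded.
not_attributed = {
    "unrelated to the hurricane": 12,
    "other reason": 7,
    "code missing from the data dictionary": 18,
}
total_deaths = 56  # all deaths reported in the survey data

unattributed = sum(not_attributed.values())
attributed = total_deaths - unattributed

print(unattributed)  # 37
print(attributed)    # 19
```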
It’s tempting to do some simple calculations using 19 deaths, the number of people surveyed, and the island’s population. That won’t work because the authors did not use a simple random sample of households. Their sample was stratified, oversampling rural areas. While that is statistically appropriate, it also means you can’t simply take the apparent mortality rate from their sample and apply it to the entire population.
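To see why a stratified sample cannot be treated as one pool, consider the toy calculation below. Every number in it is hypothetical and chosen only to show the mechanics. None of it comes from the NEJM dataset, and it says nothing about whether rural mortality really was higher.

```python
# HYPOTHETICAL numbers, for illustration only.
# Each stratum: (share of island population, people surveyed, deaths reported)
strata = {
    "urban": (0.80, 4000, 16),
    "rural": (0.20, 6000, 48),  # oversampled relative to its population share
}

# Naive rate: pool the whole sample and ignore how it was drawn.
total_people = sum(n for _, n, _ in strata.values())
total_deaths = sum(d for _, _, d in strata.values())
naive_rate = total_deaths / total_people

# Weighted rate: compute each stratum's rate, then weight by population share.
weighted_rate = sum(share * d / n for share, n, d in strata.values())

print(f"naive:    {naive_rate:.4f}")     # naive:    0.0064
print(f"weighted: {weighted_rate:.4f}")  # weighted: 0.0048
```

In this made-up case the pooled rate overstates the population rate by a third, simply because the oversampled stratum has the higher death rate.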
Having found 19 “excess deaths” (by implication caused by the hurricane), the authors next compared mortality during the post-hurricane months in 2017 with the mortality rate in the same months in 2016. (Why not average 2010 to 2016 as I did?) They applied the 2016 mortality rate to the 2017 population to produce the number of deaths that would have occurred if 2017 had been like 2016. They then looked at the number of deaths extrapolated from their survey and implied the entire difference was caused by the hurricane.
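The logic of that comparison can be sketched in Python. This is a simplified illustration of the general approach described above, not the authors' actual code. The two mortality rates are hypothetical placeholders, and the function name is invented for this example. Only the survey window and the rough population figure come from this article.

```python
from datetime import date


def excess_deaths(observed_rate, baseline_rate, population, fraction_of_year):
    """Excess deaths implied by two annual mortality rates (per 1,000 people)."""
    return (observed_rate - baseline_rate) / 1000 * population * fraction_of_year


# Survey window from the text: September 20 through December 31, 2017.
period_days = (date(2017, 12, 31) - date(2017, 9, 20)).days + 1
fraction_of_year = period_days / 365

population = 3_300_000  # roughly the island's population, per this article
baseline_rate = 8.0     # HYPOTHETICAL 2016 deaths per 1,000 per year
observed_rate = 10.0    # HYPOTHETICAL survey-based deaths per 1,000 per year

estimate = excess_deaths(observed_rate, baseline_rate, population, fraction_of_year)
print(f"{period_days} days, about {estimate:.0f} excess deaths under these made-up rates")
```

Because the result is a small rate difference multiplied by millions of people, modest errors in the survey-based rate translate into large swings in the final count.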
Puerto Rico Was a Mess Before the Hurricanes
But here’s another possibility. Puerto Rico’s infrastructure, especially the electricity grid and roads, was a mess before Maria. The hurricane damage was exacerbated by preexisting bad infrastructure. Unmaintained telephone poles break in high winds. Potholes get larger when it rains hard.
How many deaths can be attributed to the poor pre-hurricane conditions? That is an impossible question. But a more important question is, “What about 2018?” The idea that hurricane-caused deaths suddenly stopped on December 31, 2017 seems silly. That’s especially true because on that date about 38 percent of the electric company’s customers still had no power. The Puerto Rico Department of Health supplied me with death statistics for the first five months of 2018. Using this data I did a slightly different calculation.
It turns out I was wrong. There is no statistical difference between January – April, 2018 and the same months in 2010 – 2017.
Actual Data Tells the True Story
First things first. The data is by month. Hurricane Maria hit on September 20. I have no way of estimating deaths for the last ten days of September. Therefore, I just divided the number of deaths in September by three. This undoubtedly understates the actual number of deaths in late September. But it’s real data.
Next, I calculated the average number of deaths for each month from October through April 2010 through 2016. I divided the number of September deaths by three in each of those years to match the 2017 calculation. Then I performed a standard t-test to see if the number of deaths post-Maria was significantly different from the mean of the previous seven years. Here are the results:
For those who have forgotten that statistics course you took in college, here’s the story. The difference between the number of deaths post-Maria and the average in the preceding seven years is statistically significant only in September, October, and December. The total for September and October is 685 excess deaths. The authors included one month, November, when there was no statistical difference between the post-Maria environment and the previous data. And in December, remarkably, there were negative “excess deaths.” The data says there were 496 fewer deaths that month than the average for the previous seven years. And, interestingly, this figure is statistically significant. If we net that against the totals from September and October, the result is 685 − 496 = 189 “excess deaths.”
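For readers who want to see the mechanics, here is a minimal sketch of a one-sample t-test for a single month, using only the Python standard library. It assumes the test compares the seven baseline years against the 2017 figure, and the exact form used in the Excel workbook may differ. The monthly counts are hypothetical placeholders, not the Puerto Rico Department of Health data.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 5 percent critical value of Student's t with 6 degrees of freedom.
# It applies only when there are exactly seven baseline years.
T_CRIT_DF6 = 2.447


def one_sample_t(baseline, observed):
    """t statistic for H0: the baseline years have a mean equal to `observed`."""
    n = len(baseline)
    return (observed - mean(baseline)) / (stdev(baseline) / sqrt(n))


# HYPOTHETICAL October death counts for 2010-2016 and for 2017.
october_2010_2016 = [2400, 2450, 2380, 2500, 2420, 2470, 2440]
october_2017 = 3000

t_stat = one_sample_t(october_2010_2016, october_2017)
excess = october_2017 - mean(october_2010_2016)
significant = abs(t_stat) > T_CRIT_DF6

print(f"excess = {excess:.0f}, t = {t_stat:.1f}, significant = {significant}")
```

A month counts toward the excess-death total only when `significant` is true. The sign of `excess` shows whether deaths were above or below the baseline average, which is how a negative figure like December's can arise.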
If you want to fool with the data, click here for an Excel worksheet.
A few days before Irma and Maria hit Puerto Rico, the U.S. Virgin Islands were devastated. To estimate the effect of Puerto Rico’s weakened electricity grid, I gathered data on the percentage of electricity customers without power for various dates. (The sources are in my Excel workbook. Click here to download the file.)
On May 31, 2018, one percent of Puerto Rico’s electric power customers were still offline. The forecast was two more months to achieve 100 percent. That’s 314 days post-Maria to fully restore power.
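The 314-day figure can be checked with a quick date calculation. The sketch assumes "two more months" after May 31, 2018 means July 31, 2018, and counts from Maria's landfall on September 20, 2017.

```python
from datetime import date

maria_landfall = date(2017, 9, 20)
# Assumption: "two more months" after May 31, 2018 is read as July 31, 2018.
full_restoration = date(2018, 7, 31)

days_to_restore = (full_restoration - maria_landfall).days
print(days_to_restore)  # 314
```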
The USVI were devastated one day before Puerto Rico. In some respects, the USVI presented a worse situation than Puerto Rico. The USVI includes three separate islands. While there are a few minor islands under Puerto Rico’s purview, most of the work was on the main island.
A bigger difference is population. The USVI have 107,268 residents. Puerto Rico is home to 3.3 million people. Nevertheless, comparing the two yields a bit of information.
The USVI took 171 days to achieve 100 percent coverage. That’s about 54.5 percent of the 314 days Puerto Rico took. That figure is less important than it may sound. Recall that there was no statistically significant difference between the average deaths pre-hurricane and the actual deaths in January – April, 2018. That implies there were no “excess deaths” in those months. But let’s look at the USVI data anyway.
In some sense, the hurricanes were a lucky break for Puerto Rico. Both the island’s government and its electric power utility had defaulted on their respective debts. The hurricane did three things for the power grid.
- It is being rebuilt from the ground up at very little cost to the island.
- The power system has been privatized and will be run like a business from now on.
- The aid pouring into the island gives the government a bit of a bailout.
That’s really stretching for a silver lining given the extent of misery Puerto Ricans experienced. Take the good where you can find it.
(Thanks to James Taranto of the Wall Street Journal for helpful feedback.)