Surprise: Models Slip and Fall Yet Again

Near the beginning of the pandemic I wrote a feature about the problem with predictive models, which were then being continually cited to justify unprecedented restrictions on normal life. Time has shown the models in operation at the time to have been, in some respects, too gloomy, and in others, too optimistic. Which is really to say that they ended up being just as unreliable as I argued, and should not, therefore, have been the north star we used to navigate through the choppy WuFlu waters.

Well, for a more recent demonstration of this phenomenon, Nate Silver of FiveThirtyEight rightly points out that actual Covid case numbers have fallen well below what all twenty-two CDC-approved models predicted back in early May. Most (though not all) held that Covid cases would continue to decline in the U.S., pushed down by the combined forces of vaccine uptake and warm weather, with the average model suggesting that we'd end up at about 28,000 cases per day. In real life that decline did indeed occur, but we've ended up with roughly half the projected number of cases, and this with states opening much faster than the CDC was recommending at the time the models were released.

Wrong about Covid, wrong about climate change.

As I discussed in the initial post, the numerous failures of predictive models are well known to close watchers of the climate debate. We've watched a familiar cycle: a new model anticipating calamity is released, inspires ominous headlines and hand-wringing from professional activists and politicians, and is eventually revised when reality fails to conform to it. Of course, by then the damage is done, and the headlines are burnt into the brains of regular people who don't have the time or capacity to debunk every bit of misinformation thrown their way.

Dare we hope that the failures of the pandemic are enough to open people's eyes? Well, if this Gallup poll is correct that 71 percent of Democrats and more than 40 percent of independents believe the decline in case numbers is a mirage and that we should continue to stay home for the foreseeable future, the answer is probably 'No.'

There's Something About Models...

Not, uh, those kinds of models. We know what it is about them.

I'm talking about predictive models, whose object is to use whatever data is available to map the statistical likelihood of particular future events. These days such models are roughly as numerous as air molecules, since businesses and governments are obsessed with mitigating risk. As man's ability to travel through time has been unfortunately slow to develop and the traditional ways of obtaining knowledge of the future -- visiting fortune tellers or examining the entrails of animals sacrificed to the gods -- are currently out of fashion, predictive models are pretty much all we're left with.

I don't mean to suggest that these models are completely worthless, only to emphasize that they are by definition based on incomplete data and must always be taken with a grain of salt. Sometimes, depending on how much data is missing, with a whole salt mine.
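To make that concrete, here is a minimal sketch (in Python, with entirely invented numbers -- it stands in for no real CDC or IHME model) of how much a projection can swing when the data are thin: fit a simple growth curve to a short, noisy run of "case counts" and extrapolate two weeks out, varying only which slice of the data you fit to.

```python
# Purely illustrative: a toy log-linear growth fit to a short, noisy series of
# invented "case counts." Nothing here corresponds to any real CDC or IHME model.
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(21)                                   # three weeks of "observations"
true_cases = 100 * np.exp(0.08 * days)                 # hypothetical underlying trend
observed = true_cases * rng.lognormal(0.0, 0.15, size=days.size)  # noisy reporting

def project(window, horizon=14):
    """Fit exponential growth to the chosen slice of days and extrapolate ahead."""
    coeffs = np.polyfit(days[window], np.log(observed[window]), 1)
    return float(np.exp(np.polyval(coeffs, days[-1] + horizon)))

# Same data set, three defensible fitting windows, three different "forecasts."
print(round(project(slice(0, 21))))   # use all three weeks
print(round(project(slice(7, 21))))   # use the last two weeks
print(round(project(slice(14, 21))))  # use the last week only
```

Each of those fitting windows is defensible, and each spits out a different two-week number -- which is roughly the position every modeller was in during the early weeks of the pandemic.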

Even so, we are continually seeing them cited without qualification as if they were actual intel reports from the future. Just last week the Institute for Health Metrics and Evaluation (to whose model the White House's own projections are heavily indebted) caused a minor panic when it released its then-new projection of the progression of COVID-19. That projection was fairly dire, predicting a death toll between 100,000 and 240,000 in the U.S. by the end of this thing, even with our present social-distancing measures in place. Well, fast-forward just a single week: after it was widely noted that the number of deaths in New York City was leveling off and hospitalizations were declining even though IHME's model held that both would increase for five more days, the institute announced that it had significantly revised its projections, reducing the total death number by 12 percent and the total number of necessary hospital beds by 58 percent. Which is great, but it also makes you a little suspicious of the new numbers.

Benny Peiser and Andrew Montford of the indispensable Global Warming Policy Forum have a piece in the Wall Street Journal about this same issue. They begin with a discussion of the two principal British models of the pandemic's progression, which reach wildly different conclusions, and on which the British government is basing its decisions:

Several researchers have apparently asked to see Imperial [College]’s calculations, but Prof. Neil Ferguson, the man leading the team, has said that the computer code is 13 years old and thousands of lines of it “undocumented,” making it hard for anyone to work with, let alone take it apart to identify potential errors. He has promised that it will be published in a week or so, but in the meantime reasonable people might wonder whether something made with 13-year-old, undocumented computer code should be used to justify shutting down the economy.

I'll say!

Peiser and Montford's work at the GWPF makes them uniquely qualified to comment on the unreliability of predictive models, because long before that fateful bowl of bat stew changed the world, climate scientists were dramatically announcing the headline-grabbing conclusions of opaque processes and fuzzy math. For one extremely significant example of this, check out this interview with Ross McKitrick, a Canadian economist who began applying his expertise in statistical analysis to climate change studies and was surprised by what he uncovered. Rex Murphy is the host:

At the 40:40 mark, Dr. McKitrick tells the story of how he and a Toronto mining executive named Stephen McIntyre began looking into the data used by American climatologist Michael Mann in developing his Hockey Stick Graph. That graph had displaced the general climate consensus of the time, which held that climate had always moved in waves of alternating warm and cold periods, and purported to show, through the examination of tree rings, "that at least for the past thousand years it was really just a straight cooling line and then you get to the 20th century and the temperature begins to soar very rapidly. We're riding up the blade of the stick."

In the clip above, McKitrick discusses the origins of his skepticism concerning Mann's theories, which had revolutionized the field of climatology and given rise to mountains of carbon regulations. After finally accessing what appeared to be the underlying data set and trying to replicate Mann's conclusions, says McKitrick, the hockey stick graph "really wasn't robust, that you could get all kinds of different shapes with the same data set based on minor variations in processing. We also identified some specific technical errors in the statistical analysis. Really, our conclusion was, they can't conclude anything about how our current era compares to the medieval period. The data and their methods just aren't precise enough and they're overstating the certainty of their analysis... The methods were wrong, the data is unreliable for the purpose, and I would just say that that graph is really uninformative about historical climate."
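For readers who want a feel for what that kind of robustness check looks like, here is a generic toy example in Python -- synthetic data and a made-up processing pipeline, not Mann's proxies, his method, or McKitrick and McIntyre's actual analysis. The idea is simply to push the same series through a few equally defensible processing choices (how much smoothing, which baseline period) and see whether the headline number sits still.

```python
# A toy sensitivity check in the spirit McKitrick describes: one synthetic
# temperature-like series, summarized under several different processing choices.
# The data are invented; this is not a reconstruction of any real climate record.
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1000, 2001)
# A flat hypothetical "climate" with noise and a modest uptick after 1900.
signal = np.where(years < 1900, 0.0, 0.003 * (years - 1900))
series = signal + rng.normal(0.0, 0.25, size=years.size)

def late_anomaly(series, smooth_window, baseline):
    """Smooth with a moving average, re-center on a baseline period, then report
    the mean 1950-2000 anomaly. Both arguments are arbitrary processing choices."""
    kernel = np.ones(smooth_window) / smooth_window
    smoothed = np.convolve(series, kernel, mode="same")
    base = smoothed[(years >= baseline[0]) & (years < baseline[1])].mean()
    return (smoothed[years >= 1950] - base).mean()

# Same series in, different defensible-sounding choices, different numbers out.
for window, base in [(11, (1000, 1900)), (31, (1000, 1500)), (51, (1800, 1900))]:
    print(window, base, round(late_anomaly(series, window, base), 3))
```

If a conclusion only survives under one particular combination of such choices, that's a sign the processing, rather than the data, is doing the work -- which is essentially the complaint McKitrick is making about the hockey stick.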

[Not everyone agrees with McKitrick's analysis, as you can read here.]

Earlier in the interview, McKitrick gives an apt summation of what predictive models are good for, and what they're not good for:

I think there are going to be some reckonings, especially for the climate modelling industry. They've got some big failures to deal with. And that wouldn't be a problem if people understood that models of any kind, including climate models, are really study tools. They're ways of trying to understand the system, because the system is too complicated to figure out, so you build a simplified climate model of it and you try to figure out how that works and you hope that you learn something. But when it's set up as, this is a forecasting tool that we can make precise calculations with and base policy decisions on, then we're entitled to ask 'Well, how good of a forecasting tool is this?' And they don't work very well for that.

In the case of the Wuhan coronavirus, you would be quite right to argue that, with the data lacking (because this is a novel virus that spread very quickly) or unreliable (because the nation that has had the longest time to study it also has a vested interest in making it seem it managed things better than it did), our governments needed to act hastily on the ominous predictions of their faulty models. Whether they should refuse to revise the steps they have taken as more and better data reshape those models is another question entirely.

On the climate question, however, the models have been around for quite some time, and their weaknesses are apparent. As the late theoretical physicist Freeman Dyson famously put it, upon examining them closely, "They do not begin to describe the real world that we live in."

Predictive models are comforting, because they make us feel like we know what is going to happen, and we can act accordingly. But sometimes the real world, in all of its messy unpredictability, intrudes. Here's hoping that our adventure with the WuFlu teaches us to be a little more cautious about throwing everything away on an incomplete data set.