The Models Were Telling Us Trump Could Win

Nate Silver got the election right.

Modeling this election was never about win probabilities (i.e., saying that Clinton is 98% likely to win, or 71% likely to win, or whatever). It was about finding a way to convey meaningful information about uncertainty and about what could happen. And, despite the not-so-great headline, this article by Nate Silver does a pretty impressive job.

First, let’s have a look at what not to do. This article by Sam Wang (Princeton Election Consortium) explains how you end up with a win probability of 98-99% for Clinton. First, he aggregates the state polls, and figures that if they’re right on average, then Clinton wins easily (with over 300 electoral votes I believe). Then he looks for a way to model the uncertainty. He asks, reasonably: what happens if the polls are all off by a given amount? And he answers the question, again reasonably: if Trump overperforms his polls by 2.6%, the election becomes a toss-up. If he overperforms by more, he’s likely to win.

But then you have to ask: how much could the polls be off by? And this is where Wang goes horribly wrong.

The uncertainty here is virtually impossible to model statistically. US presidential elections don’t happen that often, so there’s not much direct history, plus the challenges of polling are changing dramatically as fewer and fewer people are reachable via listed phone numbers. Wang does say that in the last three elections, the polls have been off by 1.3% (Bush 2004), 1.2% (Obama 2008), and 2.3% (Obama 2012). So polls being off by 2.6% doesn’t seem crazy at all.

For some inexplicable reason, however, Wang ignores what is right in front of his nose, picks a tiny standard error parameter out of the air, plugs it into his model, and basically says: well, the polls are very unlikely to be off by very much, so Clinton is 98-99% likely to win.
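To make the sensitivity concrete, here’s a toy calculation in Python (my own sketch, not Wang’s actual model): treat the nationwide polling miss as a normally distributed shift in Trump’s favor and ask how often it stays below the 2.6 points that would make the race a toss-up. The standard deviations tried below are illustrative, not anyone’s published parameter.

    from math import erf, sqrt

    def clinton_win_prob(margin, polling_error_sd):
        # Toy model: Clinton wins as long as a normally distributed, uniform
        # polling error in Trump's favor stays below the toss-up margin.
        z = margin / polling_error_sd
        return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

    for sd in (1.1, 2.0, 3.0, 4.0):  # 4.0 is roughly a Brexit-sized miss
        print(f"assumed polling-error sd = {sd:.1f} pts -> "
              f"P(Clinton wins) ~ {clinton_win_prob(2.6, sd):.0%}")

Plug in a small standard deviation and you get the 98-99% answer; plug in something closer to recent polling misses and the certainty evaporates.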

Always be wary of models, especially models of human behavior, that give probabilities of 98-99%. Always ask yourself: am I anywhere near 98-99% sure that my model is complete and accurate? If not, STOP, cross out your probabilities because they are meaningless, and start again.

How do you come up with a meaningful forecast, though? Once you accept that there’s genuine uncertainty in the most important parameter in your model, and that trying to assign a probability is likely to range from meaningless to flat-out wrong, how do you proceed?

Well, let’s look at what Silver does in this article. Instead of trying to estimate the volatility as Wang does (and as Silver also does on the front page of his website; people just can’t help themselves), he gives a careful analysis of some specific possible scenarios. What are some good scenarios to pick? Well, maybe we should look at recent cases where nationwide polls have been off. OK, can you think of any good examples? Hmm, I don’t know, maybe…

[Image: brexit-headlines. British newspaper front pages, including The Sun’s cover, reporting the Brexit result.]

Aiiieeee!!!!

Look at the numbers in that Sun cover. Brexit (Leave) won by 4%, while the polls before the election were essentially tied, with Remain perhaps enjoying a slight lead. That’s a polling error of at least 4%. And the US poll numbers are very clear: if Trump overperforms his polls by 4%, he wins easily.

In financial modeling, where you often don’t have enough relevant history to build a good probabilistic model, this technique — pick some scenarios that seem important, play them through your model, and look at the outcomes — is called stress testing. Silver’s article does a really, really good job of it. He doesn’t pretend to know what’s going to happen (we can’t all be Michael Moore, you know), but he plays out the possibilities, makes the risks transparent, and puts you in a position to evaluate them. That is how you’re supposed to analyze situations with inherent uncertainty. And with the inherent uncertainty in our world increasing, to say the least, it’s a way of thinking we had all better get really familiar with.
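Here’s a minimal sketch of that stress-testing idea in Python. This is not Silver’s model: the swing-state margins and electoral-vote counts are made-up placeholders, purely to show the shape of the exercise.

    # Hypothetical inputs: Clinton-minus-Trump poll margins (in points) and
    # electoral votes for a few illustrative swing states, plus a block of
    # electoral votes assumed safe for Clinton. None of these are real data.
    SWING_STATES = {
        "State A": (2.5, 46),
        "State B": (3.8, 20),
        "State C": (5.5, 16),
    }
    SAFE_CLINTON_EV = 230

    def clinton_electoral_votes(trump_overperformance):
        # Shift every swing-state margin toward Trump by the same amount and
        # total up the electoral votes Clinton keeps (270 needed to win).
        ev = SAFE_CLINTON_EV
        for margin, votes in SWING_STATES.values():
            if margin - trump_overperformance > 0:
                ev += votes
        return ev

    # Play a few scenarios through the model and look at the outcomes.
    for shift in (0.0, 2.6, 4.0):  # 4.0 is roughly a Brexit-sized miss
        print(f"Trump overperforms polls by {shift:.1f} pts -> "
              f"Clinton electoral votes: {clinton_electoral_votes(shift)}")

No probabilities anywhere: just scenarios, outcomes, and a reader who can now judge the risk for themselves.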

What the models were telling us was plain as day: if the polls were right, Clinton would win easily, but if they were underestimating Trump’s support by anywhere near a Brexit-like margin, Trump would win easily. Shouldn’t that have been the headline? Wouldn’t you have liked to know that? Isn’t it way more informative than saying that Clinton is 98% or 71% likely to win based on some parameter someone plucked out of thin air?

We should have been going into this election terrified.


Two Kinds of Model Error

One winter night every year, New York City tries to count how many homeless people are out in its streets. (This doesn’t include people in shelters, because shelters already keep records.) It’s done in a pretty low-tech way: the Department of Homeless Services hires a bunch of volunteers, trains them, and sends them out to find and count people.

How do you account for the fact that you probably won’t find everyone? Plant decoys! The city sends out another set of volunteers to pretend to be homeless, to see if they actually get counted. (My social worker wife gets glamorous opportunities like this sent to her on a regular basis.) Once all the numbers are in, you can estimate the total number of homeless as follows:

  1. Actual homeless counted = Total people counted - Decoys counted
  2. Percent counted estimate = Decoys counted / Total decoys
  3. Homeless estimate = Actual homeless counted / Percent counted estimate

For example, say you counted 4080 people total out in the streets. And say you sent out 100 decoys and 80 of them got counted. Then the number of true homeless you counted is 4000 (= 4080 - 80), and since your count seems to capture 80% of the people out there, your estimate for the true number of homeless is 4000 / 80% = 5000 (in other words, 5000 is the number that 4000 is 80% of).
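Here is the same arithmetic as a small Python function (just a sketch of the three steps above, not anything the Department of Homeless Services actually runs):

    def homeless_estimate(total_counted, decoys_sent, decoys_counted):
        # Steps 1-3 from the list above.
        actual_homeless_counted = total_counted - decoys_counted
        percent_counted = decoys_counted / decoys_sent
        return actual_homeless_counted / percent_counted

    # The worked example from the text: 4080 people counted, 100 decoys sent,
    # 80 of them found.
    print(homeless_estimate(4080, 100, 80))  # 5000.0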

But it’s probably not exactly 5000, for two reasons:

  1. Random error. You happened to count 80% of your decoys, but on another day, you might have counted 78% of your decoys, or 82%, or some other number. In other words, there’s natural randomness in your model which leads to indeterminacy in your answer.
  2. Systematic error. When you count the homeless, you have some idea of where they’re likely to be. But you don’t really know. And your decoys are likely going to plant themselves in the same general places where you think the homeless are. Put another way, if there are a bunch of homeless in an old abandoned subway station that you have no idea exists, you’re not going to count them. And your decoys won’t know to plant themselves there, so you won’t have any idea that you’re not counting them.

The first kind of error is error inside your model. You can analyze it, and treat it statistically by estimating a confidence interval, e.g., I’m estimating that there are 5000 homeless out there, and there’s a 95% chance that the true number is somewhere between 4500 and 5500, say. The second kind of error is external; it comes from stuff that your model doesn’t capture. It’s more worrying because you don’t — and can’t — know how much of it you have. But at least be aware that almost any model has it, and that even confidence intervals don’t incorporate it.
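For the first, internal kind of error, a quick simulation shows where a confidence interval like that could come from. This is my own rough sketch, using the worked numbers from above and treating the decoy detection rate as the only source of randomness:

    import random

    def bootstrap_interval(actual_counted=4000, decoys_sent=100, decoys_counted=80,
                           trials=10_000, seed=0):
        # Rough 95% interval for the homeless estimate, treating each decoy as
        # independently found with probability 80/100. This captures only the
        # random error; the systematic error (the abandoned subway station
        # nobody knows about) is invisible to it.
        rng = random.Random(seed)
        p_hat = decoys_counted / decoys_sent
        estimates = []
        for _ in range(trials):
            found = sum(rng.random() < p_hat for _ in range(decoys_sent))
            if found == 0:
                continue  # avoid dividing by zero in a pathological resample
            estimates.append(actual_counted / (found / decoys_sent))
        estimates.sort()
        return estimates[int(0.025 * len(estimates))], estimates[int(0.975 * len(estimates))]

    print(bootstrap_interval())  # roughly (4500, 5600) with these inputs

No amount of simulation, though, will tell you about the second kind of error. That part you have to worry about the old-fashioned way.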