Election predictions: why the models differ

I see quite a bit of angst over the predictions of the upcoming general election. So I hope to explain the basic difference in philosophies of the competing models.

First, here is the obligatory map; this time I used Predictionwise, which uses a blend of betting markets, polls and other data to assign a “probability percentage” of winning each individual state. The map I present shows as blue the states where Hillary Clinton has a 62 percent probability (or higher) of winning (by this model); I then explain what happens if one wants a higher threshold (say 80 percent, then 90 percent).


Now there are other models out there: FiveThirtyEight gives Trump the highest probability of winning, and Princeton gives him the lowest.

Why the difference? If you want full details, read Nate Silver’s explanation of the difference in models and his explanation as to why, though Clinton and Obama were in similar positions with regard to the popular vote, Obama was in a better position with regard to the Electoral College.

First, look at this chart, taken from Upshot: (I cut out many of the “safely Democratic” and “safely Republican” states, and attached the header so you can see which model the estimates came from)


Note the 127 electoral votes in “close” states that Trump has to win.

Now consider two “extreme models” (both Nate Silver and Sam Wang are too competent to use either of these, but these extremes can explain the difference in confidence):

Extreme model 1: the vote percentage in the states is in lock step with the national averages. What that means: say Clinton’s average is 45 percent and in, say, Wisconsin, she is 3 points above that. Then Wisconsin is labeled as “D + 3”, meaning she’ll get 3 points more than the national average there. Now if there is a shift in the national polls, or if the national polls are just a bit off, that shift will be reflected in each state. For example, say the margin shifts 4 points in Trump’s direction (Clinton drops 2 points and Trump gains 2), so Clinton’s average is 43 percent nationally. Then in this model, “D + 3” now becomes 46, down from 48. And that happens IN EVERY STATE.

Therefore a 2 point lead in each swing state becomes a 2 point deficit in each swing state, which indicates that Trump has a reasonable chance to win all of those close states, given a national surge or, say, the polls being off by a bit. Hence the uncertainty.
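Here is a minimal sketch of that lock-step idea in Python. The state names and leads are made up for illustration (they are not real polling numbers), and `apply_uniform_swing` is just a name I picked; the only thing it does is subtract the same swing from every state's margin.

```python
# Hypothetical Clinton leads (in points) in a few close states; illustrative only.
clinton_leads = {"State A": 2.0, "State B": 3.0, "State C": 1.0}

def apply_uniform_swing(leads, swing_toward_trump):
    """Shift every state's margin by the same national swing (the lock-step assumption)."""
    return {state: lead - swing_toward_trump for state, lead in leads.items()}

# A 4 point swing in the margin toward Trump, and the same swing toward Clinton.
after_trump_surge = apply_uniform_swing(clinton_leads, 4.0)
after_clinton_surge = apply_uniform_swing(clinton_leads, -4.0)

for state in clinton_leads:
    print(state, clinton_leads[state], after_trump_surge[state], after_clinton_surge[state])
```

Under this assumption a single national swing flips every one of these states at once, which is exactly why the lock-step extreme produces so much uncertainty.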

Of course, this works in the other direction as well; if the polls shift toward Clinton, she could win by a landslide. That explains the relevance of this remark by Nate Silver.


Now one could use the other extreme model: that the swing states are independent. That is, say, an increase in Trump support in New Hampshire is not correlated with an increase in Trump support in, say, Nevada. Now by that model, Trump is cooked; his chances of winning ALL of those tightly contested 127 electoral votes is basically zero, hence Sam Wang’s statement:


Now Wang is way too competent to make the simplistic assumption that the state results are independent of one another. But one has to remember that Clinton is using a sophisticated voter targeting operation in key states (her “firewall states”) and Trump has contempt for such operations. So a small Trump surge nationally might not help him close the gap in those states. Obama’s campaign manager Jim Messina explains this point.
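Going back to the independence extreme for a moment: the arithmetic behind "basically zero" is just a product of probabilities. The sketch below uses eight made-up state probabilities (my assumption, not figures from any of the models), and `chance_of_sweep` is a hypothetical helper name.

```python
import math

# Hypothetical chances that Trump carries each of eight close states; illustrative only.
trump_state_chances = [0.5, 0.5, 0.4, 0.4, 0.3, 0.5, 0.4, 0.3]

def chance_of_sweep(state_chances):
    """Probability of winning every state, valid only if the states are independent."""
    return math.prod(state_chances)

sweep = chance_of_sweep(trump_state_chances)
print(f"Chance of winning all {len(trump_state_chances)} states: {sweep:.5f}")
```

Even with each state individually being something close to a coin flip, the chance of running the table comes out well under 1 percent once the states are treated as independent.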

Again, neither Silver nor Wang uses these extreme models; they are way too competent to do so. But their models weight uncertainty, polling error and the statistical independence of the states differently, hence the difference in probability.
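To make the weighting idea concrete, here is a toy simulation. It is emphatically not either Silver's or Wang's model. It assumes eight close states with a made-up 2 point Clinton lead each, and it splits the polling error into a national piece shared by all states and an independent piece per state. The function name, the standard deviations and the trial count are all my own choices for illustration.

```python
import random

# Hypothetical Clinton leads (in points) in eight close states; not real polling data.
CLINTON_LEADS = [2.0] * 8

def trump_sweep_probability(leads, shared_sd, state_sd, trials=20000, seed=0):
    """Estimate the chance that Trump wins every state in `leads`.

    Each trial draws one national error shared by all states plus an
    independent error per state; Trump carries a state when the combined
    error wipes out Clinton's lead.
    """
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(trials):
        national_error = rng.gauss(0.0, shared_sd)
        if all(lead + national_error + rng.gauss(0.0, state_sd) < 0 for lead in leads):
            sweeps += 1
    return sweeps / trials

# All of the error is national (lock step), all of it is state-level (independent),
# or it is split between the two with roughly the same total spread per state.
lockstep = trump_sweep_probability(CLINTON_LEADS, shared_sd=3.0, state_sd=0.0)
independent = trump_sweep_probability(CLINTON_LEADS, shared_sd=0.0, state_sd=3.0)
blended = trump_sweep_probability(CLINTON_LEADS, shared_sd=2.1, state_sd=2.1)

print("lock step:  ", lockstep)
print("blended:    ", blended)
print("independent:", independent)
```

The per-state uncertainty is about the same in all three runs; the only thing that changes is how much of it is shared across states. The more of the error is shared, the better Trump's chance of sweeping the close states, which is the gap between the two kinds of forecast in miniature.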

In a nutshell: Silver’s model has a wider “confidence interval” for the number of Electoral Votes (hence, higher probability of a Trump win or a Clinton landslide) and Wang’s confidence interval is smaller (centered around a modest but solid Clinton win in the Electoral College).

November 4, 2016