23 August 2016

The 538 Election Forecast Is Too Conservative

For the most part, the 538 Election Forecast is a great tool.  But in my opinion it is far too generous to the underdog, particularly in its "Now Cast", mostly because it uses a fat-tailed t-distribution with an inappropriately small number of degrees of freedom.

The standard normal distribution is the limit of the t-distribution as the degrees of freedom go to infinity, and it is appropriate for large sample sizes.  The t-distribution is used for small sample sizes.  But 538 inappropriately treats its sample size as the number of elections used to build the model, rather than the number of individuals surveyed in the polls that go into its analysis (although, due to weighting, that number has to be adjusted down somewhat).  Using the number of elections is appropriate when establishing the margin of error for the non-survey components of the model.  It is not appropriate when establishing the margin of error for the survey components, which is largely a function of the number of individuals surveyed in each state, discounted by the fact that some surveys are old.  The model removes almost all systematic error and bias from the surveys, leaving pure statistical error.
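To make the "adjusted down somewhat" point concrete, here is a minimal sketch of how weighting shrinks a poll's effective sample size. It uses the Kish approximation, which is my choice for illustration and not necessarily what 538 does; the weights and the poll itself are made up.

```python
import math

def effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p with n respondents."""
    return z * math.sqrt(p * (1.0 - p) / n)

# Hypothetical poll: 1,000 respondents, half weighted 0.5 and half weighted 1.5.
weights = [0.5] * 500 + [1.5] * 500
n_eff = effective_sample_size(weights)

print("effective sample size:", round(n_eff))
print("margin of error:", round(margin_of_error(0.5, n_eff), 4))
```

Even after this kind of discount, a single state poll still has hundreds of effective respondents, which is a very different sample size from a dozen or so past elections.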

In practice, so many surveys are considered, each with a reasonably large sample size, that the distribution appropriate to the survey component of the forecast is almost indistinguishable from a standard normal distribution.

If a normal distribution, or a t-distribution with a much larger number of degrees of freedom, were used instead of the current fat-tailed t-distribution, Trump's chance of winning would be significantly smaller, particularly in the "Now Cast", which has almost no non-survey input.
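A rough sketch of how much the degrees of freedom matter: the code below computes the chance that the trailing candidate overcomes a lead of a given size, measured in standard errors, under t-distributions with different degrees of freedom and under a normal distribution. The lead of 1.7 standard errors and the list of degrees of freedom are hypothetical values picked for illustration, not figures taken from 538's model.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def normal_upper_tail(x):
    """P(Z > x) for a standard normal Z."""
    return 1.0 - NormalDist().cdf(x)

lead_in_se = 1.7  # hypothetical lead, in standard errors
for df in (1, 5, 30, 1000):
    print("t, df =", df, "->", round(t_upper_tail(lead_in_se, df), 4))
print("normal ->", round(normal_upper_tail(lead_in_se), 4))
```

The underdog's tail probability shrinks steadily as the degrees of freedom grow and converges on the normal figure, which is the whole argument in miniature.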

In gambler's payout form, 538 has the odds against Trump winning if an election were held today at about 5-1.  The fair odds given the data going into the model should be much longer, something on the order of 20-1 or more if an election were held today.  That estimate follows from the size of Clinton's lead in the marginal state necessary to secure 270 electoral votes and the margin of error of the combined survey data for that state, adjusted for the possibility that non-marginal states could make a difference even when the marginal state does not have the expected result.
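For readers who think in probabilities instead of payout odds, here is a small conversion sketch. It assumes "5-1" means 5-to-1 against, and the last function assumes a plain normal model in order to show roughly how many standard errors of lead a given set of odds corresponds to; the function names are my own.

```python
from statistics import NormalDist

def odds_against_to_probability(odds):
    """Convert 'odds-to-1 against' into a win probability."""
    return 1.0 / (odds + 1.0)

def probability_to_odds_against(p):
    """Convert a win probability into 'odds-to-1 against'."""
    return (1.0 - p) / p

def implied_lead_in_se(p_underdog):
    """Favorite's lead, in standard errors, implied under a normal model."""
    return NormalDist().inv_cdf(1.0 - p_underdog)

for odds in (5, 20):
    p = odds_against_to_probability(odds)
    print(odds, "to 1 against ->", round(p, 3),
          "-> lead of", round(implied_lead_in_se(p), 2), "standard errors")
```

So 5-1 corresponds to about a one in six chance for the underdog, and 20-1 to a bit under one in twenty.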

It also bears noting that while it is 11 weeks until election day, a significant share of the vote will be cast via mail-in ballots, early voting, absentee ballots and other means as much as 4 weeks before election day.  The time that will elapse between today and when the average voter casts a ballot is probably something like 9-10 weeks, which means that the weight given to surveys relative to other data is systematically too low.  This effect is not nearly as significant as the degrees of freedom problem, though, because the difference in timing is modest and non-polling factors are almost a wash this year.

2 comments:

Dave Barnes said...

I think there is a strong "Eastern" bias in looking at elections. Most eastern states have the vast majority of ballots cast on "Election Day". These people just do not understand, at a gut level, how vote by mail works and is changing election patterns.

andrew said...

Very likely.