Harris Defeats Trump. Trump Defeats Harris.

August 5, 2024      Kevin Schulman, Founder, DonorVoice and DVCanvass

Conventional wisdom, and a veteran political analyst who had correctly called the winner in four of the five presidential races leading up to the 1948 election, were clear: not only would Dewey win, but Republicans would retain control of the House and Senate. Dewey, of course, lost, and the Democrats won control of both chambers.

What went wrong?  Some of the same things that went wrong in 2016 and 2020 polling.  Consider this a cautionary note as we head into the 2024 silly season.

The polls of 1940 through 1948 all underestimated the Republican share of the actual vote: sampling error, with too many Democrats included in the samples. In those prior years the pollsters weren’t more right, they were luckier, and things really came home to roost in 1948.

Fast forward to more modern times. We’re now overwhelmed with state and national polls every election season. And more isn’t always better. The number of state polls conducted between September and the end of October 2020 in Wisconsin, Pennsylvania, Arizona, and Michigan nearly doubled from 2016. But if the errors are correlated in the same direction (they were in all of those states but Arizona), more polls make matters worse: they create an illusion of greater precision, which gets mistaken for accuracy (i.e., lack of bias).
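To make the correlated-error point concrete, here is a minimal simulation sketch. All the numbers are hypothetical (a made-up true margin, a made-up shared bias), not real 2020 polls: averaging more polls shrinks the spread of the average but does nothing about a bias every poll shares.

```python
import random

random.seed(42)

TRUE_MARGIN = 0.5   # hypothetical true margin, in points
SHARED_BIAS = 3.0   # hypothetical bias common to every poll (e.g., sample skew)
POLL_NOISE = 3.0    # independent sampling noise per poll

def poll_average(n_polls):
    """Average n_polls, each with the same shared bias plus its own noise."""
    polls = [TRUE_MARGIN + SHARED_BIAS + random.gauss(0, POLL_NOISE)
             for _ in range(n_polls)]
    return sum(polls) / n_polls

for n in (5, 20, 80):
    avgs = [poll_average(n) for _ in range(10_000)]
    mean = sum(avgs) / len(avgs)
    sd = (sum((a - mean) ** 2 for a in avgs) / len(avgs)) ** 0.5
    # The spread (sd) shrinks as polls pile up; the ~3-point bias never does.
    print(f"{n:3d} polls: average error {mean - TRUE_MARGIN:+.2f}, spread {sd:.2f}")
```

Sixteen times as many polls cuts the spread by a factor of four, but the 3-point miss is untouched. That is the illusion of precision in one picture.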

The post-mortem coming out of 2016 and the large, important misses in the midwestern states (i.e., Wisconsin, Michigan, Pennsylvania) was, once again, sampling error; this time too many highly educated respondents (college grad or higher), which of course missed a big part of the Trump vote. All the pollsters heading into 2020 got this message and weighted their samples on education level (among other factors).
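For readers unfamiliar with what “weighted their sample on education” means in practice, here is a minimal post-stratification sketch with made-up numbers: each respondent is up- or down-weighted so the sample’s education mix matches an assumed electorate mix.

```python
# Hypothetical tiny sample, skewed toward college grads (3 of 5)
# versus an assumed 40/60 college/non-college electorate.
sample = [
    {"college": True,  "dem": True},
    {"college": True,  "dem": True},
    {"college": True,  "dem": False},
    {"college": False, "dem": False},
    {"college": False, "dem": True},
]

TARGET = {True: 0.40, False: 0.60}  # assumed electorate education mix

n = len(sample)
counts = {True: sum(r["college"] for r in sample),
          False: sum(not r["college"] for r in sample)}

# Weight = population share / sample share for the respondent's group.
for r in sample:
    g = r["college"]
    r["weight"] = TARGET[g] / (counts[g] / n)

raw = sum(r["dem"] for r in sample) / n
weighted = (sum(r["weight"] for r in sample if r["dem"])
            / sum(r["weight"] for r in sample))
print(f"raw Dem share {raw:.0%} -> weighted {weighted:.0%}")
```

Note the catch: the correction is only as good as the weighting variable. If education is merely a proxy, and the real divide runs through something the poll never asked, the weighted number is still wrong.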

So, what happened in 2020? In every swing state but Arizona, Trump wildly outperformed the polling average from one of the more popular polling aggregators, FiveThirtyEight (538): Trump was +7 (over the 538 prediction) in Ohio, +8 in Wisconsin, +6 in Florida.

So what went wrong in 2020? The survey data got worse over those four years, canceling out the (superficial) changes pollsters had made to address what went wrong in 2016.

Who Cares? Know this: there is some evidence that all these polls don’t just predict behavior, they influence it. Namely, they decrease the likelihood, even a bit, of people bothering to vote, and this effect can hit supporters of the predicted winner.

What to Do? The same thing we counsel for charities trying to understand their donors: dig deeper.

College education is merely a proxy for vote preference. Demographics are easy to collect or buy, easy to report on, and easy to mentally process, but they are rarely causal or illuminating. Reducing voter intent and preference to a horse-race question and weighting by a few demographics leaves lots of room for error.

Start getting more refined measures of voters. For example, there are:

  • Measures of trust that may correlate with ideology and intention
  • Identity and the salience of that identity: how strongly do I identify as a Libertarian or Democrat or Conservative?
  • Personality: Agreeable people skew toward Liberal and Conscientious people toward Conservative
  • Moral frame: some people view moral decisions, of which voting is certainly one, through a fairness/equity lens (Liberals disproportionately), while others use a loyalty/authority lens (Conservatives disproportionately)

The point isn’t for any of these measures to replace the horse-race question but rather to augment it via modeling. It may well be that the best way to measure candidate preference is to stop treating the ubiquitous horse-race survey question as the “answer” in polls.
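One hedged sketch of what “augment it via modeling” could look like: fit a model that predicts eventual vote from the horse-race answer plus the psychological measures above. Every variable name and data point below is hypothetical, invented purely to illustrate the shape of the approach.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1_000

# Hypothetical survey features (all made up for illustration):
horse_race = rng.integers(0, 2, n)         # 1 = says they'll vote candidate A
identity_salience = rng.uniform(0, 1, n)   # how strongly the partisan identity is held
trust_institutions = rng.uniform(0, 1, n)  # trust scale
moral_fairness = rng.uniform(0, 1, n)      # fairness/equity vs. loyalty/authority lens

X = np.column_stack([horse_race, identity_salience,
                     trust_institutions, moral_fairness])

# Simulated "actual vote": mostly the horse-race answer, but weakly held
# identities defect more often -- the pattern the model should pick up.
p_stick = 0.5 + 0.4 * identity_salience
actual_vote = np.where(rng.uniform(0, 1, n) < p_stick,
                       horse_race, 1 - horse_race)

model = LogisticRegression().fit(X, actual_vote)
print("coefficients (horse race, identity, trust, moral):",
      model.coef_.round(2))
```

The horse-race answer stays in the model; the other measures tell you how much to believe it for each respondent, rather than treating every stated preference as equally firm.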

And what about the forecast prognosticators who build fancy models incorporating all these state and national polls? Well, firms like 538 add in other ‘fundamentals’ (e.g., economic data and candidate favorability data) to make their probabilistic forecasts (e.g., Biden has an 84% chance of winning the Electoral College) more accurate.

However, these firms also build big, hairy error terms into their models. The error terms are there, in part, to account for the “known unknown” problems with state polling. But those added error terms are, at best, guesses. More accurately, they are subjective whims that inject uncertainty about election outcomes, lest the modelers live with a forecast that seems too good to be true (e.g., Biden has a 95% chance of winning). Replacing these error terms with better data is always a good idea, and these deeper, more causal psychological factors might help.
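To see why that guessed error term does so much work, here is a minimal Monte Carlo sketch. The polling lead and error sizes are hypothetical, not 538’s actual model: the bigger the assumed correlated error, the more the win probability gets pulled back from certainty toward a coin flip.

```python
import random

random.seed(7)

POLL_MARGIN = 4.0   # hypothetical polling-average lead, in points
SIMS = 100_000

def win_probability(correlated_error_sd):
    """Share of simulated elections the leader wins, drawing one shared
    polling-error term per simulated election."""
    wins = 0
    for _ in range(SIMS):
        outcome = POLL_MARGIN + random.gauss(0, correlated_error_sd)
        wins += outcome > 0
    return wins / SIMS

for sd in (1.0, 3.0, 6.0):
    print(f"assumed error sd {sd}: win probability {win_probability(sd):.0%}")
```

Same 4-point lead, and the forecast swings from near-certainty to roughly 3-in-4 depending entirely on a dial the modeler sets by judgment. That dial is exactly where better, more causal measurement could substitute for guesswork.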

Kevin