Every few days the NY Times publishes extremely detailed presidential polling data (Read here), collected and analyzed by Nate Cohn, the NY Times polling guru.
If you believe the polls, it looks pretty good for Biden, with caveats. Since I’m a nerdy type of guy, I looked back at my blogs from 2016, in the weeks leading up to the election.
Every day the Times has a graphic: the percentages predicting the presidential election outcome. In July Hillary was in the mid-eighties, and by mid-September she had dipped to the mid-seventies. This morning Hillary hit 89% – the highest Times election prediction.
The Nate Silver fivethirtyeight blog predicts Clinton 86.4% and 341 electoral votes (270 needed for victory), and, in the popular vote, Clinton leads 49% to 42%. (October 15, 2016)
Why was the NY Times so wrong?
A few excerpts from my August through November 2016 blogs:
I had my doubts about the seemingly insurmountable Clinton lead in the polls; read a blog titled “Is there a Brexit election parallel, are millions of Trump voters hiding in the weeds?” (September 5, 2016).
As I wrote a few days ago, the pollsters were wrong on the Brexit vote, and the reasons can be just as applicable in the presidential election.
The pollsters, the talking heads, give Clinton a lead, whether expressed as a win probability (currently 86-14 Clinton) or the more common candidate-to-candidate percentage (Clinton leads in the high single digits).
Are there voters who are still undecided?
Will the Bernie voters come to the polls for Clinton?
Will the younger voters flock to the polls as they did in ’08 and ’12?
Will women and minority voters vote for Clinton in unparalleled numbers?
Will white males and older voters come to the polls in large numbers for Trump?
In the Brexit election “leave” voters were under the radar: the pollsters simply missed them, or perhaps the leave voters avoided participating in the polling process.
Is the same phenomenon possible in our presidential election?
Voters may very well decide on who they dislike least.
A little tutorial: What is a poll?
A poll is a photograph of a sub-set of likely voters at a particular time and place.
By likely voters we mean “prime voters,” folks who have voted in four of the last five elections. “How do they know if I voted? Isn’t my vote secret?” Yes, with one big “but”: whether or not you vote is a public record. Disaggregating and selling voting data is a lucrative business; Prime NY is the leading purveyor of election data in New York State.

A number of community organizations in my neighborhood opposed a city plan, and we got together to discuss how to mobilize the neighborhood residents. Someone suggested a mailing to residents; however, that would be expensive: how do you raise the dollars? I smiled; for a modest sum, extremely modest, I purchased e-mail addresses for registered voters in the neighboring election districts – a few e-blasts got the attention of the local electeds.
Pollsters identify a pool, a subset that reflects the larger population to be polled. We used to call the subset a stratified random sample, a microcosm of the total population to be polled.
The subset should reflect prime voters by gender, age, race, education, and income; the more variables, the more accurate the poll – and the more expensive the data becomes.
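A minimal sketch of what stratified sampling means in practice – the voter file, strata variables, and counts below are hypothetical, invented purely for illustration, not any pollster’s actual data or method:

```python
import random
from collections import defaultdict

# Hypothetical voter file: each record carries the strata variables
# the pollster cares about (here just gender and an age bracket).
voter_file = (
    [{"gender": "F", "age": "18-34"}] * 300
    + [{"gender": "F", "age": "35+"}] * 250
    + [{"gender": "M", "age": "18-34"}] * 200
    + [{"gender": "M", "age": "35+"}] * 250
)

def stratified_sample(population, strata_keys, n, seed=0):
    """Draw n respondents so each stratum appears in proportion
    to its share of the full population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[tuple(person[k] for k in strata_keys)].append(person)
    sample = []
    for group in strata.values():
        quota = round(n * len(group) / len(population))
        sample.extend(rng.sample(group, quota))
    return sample

sample = stratified_sample(voter_file, ["gender", "age"], n=100)
# Each stratum's share of the 100-person sample mirrors its share of
# the 1,000-person population: 30 F/18-34, 25 F/35+, 20 M/18-34, 25 M/35+.
```

Each demographic variable you add multiplies the number of strata, which is why richer samples cost more: every cell still needs enough respondents to mean anything.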
How do we contact our representative sampling of prime voters?
The issue is the non-response rate, which is gigantic. In a world of cell phones, caller ID, and spam blockers, potential respondents can easily choose whether or not to answer a call. The non-response rate erodes the accuracy of the poll.
How accurate are polls?
Polls should include a “margin of error,” a technical term. Read an excellent description of the margin of error from Pew here.
A margin of error of + or – 3% (a six percent range) means that if one candidate leads 52-48, the result is within the margin of error – the poll is actually a statistical tie.
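The familiar ± 3 points comes straight out of the standard formula for a proportion at a 95% confidence level; a quick sketch (sample size and poll numbers are illustrative, not from any specific poll):

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a random sample of n."""
    return z * math.sqrt(p * (1 - p) / n)

# A poll of roughly 1,000 respondents yields the familiar ~3 points:
moe = margin_of_error(0.52, 1000)   # ~0.031, i.e. about +/- 3%

# A 52-48 lead: the two intervals (49-55 and 45-51) overlap,
# which is what "statistical tie" means.
```

Note the formula assumes a true random sample; heavy non-response and weighting (below in the original sense, not this sketch) make the real-world uncertainty larger than the published number.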
A few years before the 2016 election I met Howard Wainer, the author of scores of articles and books dealing with statistics and the longtime editor of the leading journal of statistics, at an NYU-sponsored panel on Value-Added Measures (VAM), the statistical model touted to enable school district leaders to “measure” and “rank” teachers by student grades on standardized tests. Howard was magnificent; he basically told the education professor, a VAM acolyte, that if he were a student in Howard’s class and made the same arguments he would fail him.
I interviewed Wainer on the cusp of the 2016 election: he had many doubts about the efficacy of modern polling, especially the efforts to account for low response rates by “weighting” responses. Read here.
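A toy illustration of what “weighting” responses means – my own sketch, with invented numbers, not any pollster’s actual procedure: if a group answers the phone at a lower rate than its share of the electorate, each of its respondents is counted for more.

```python
# Known population shares (e.g. from the census) vs. who actually responded.
population_share = {"college": 0.35, "no_college": 0.65}
sample_share     = {"college": 0.50, "no_college": 0.50}

# Weight = population share / sample share for each group.
weights = {g: population_share[g] / sample_share[g] for g in population_share}
# college respondents are over-represented -> weight 0.7; no_college -> 1.3

# Respondents as (group, supports candidate A?); 100 invented answers.
responses = ([("college", True)] * 30 + [("college", False)] * 20
             + [("no_college", True)] * 20 + [("no_college", False)] * 30)

raw = sum(v for _, v in responses) / len(responses)
weighted = (sum(weights[g] * v for g, v in responses)
            / sum(weights[g] for g, _ in responses))
# raw support: 50%; weighted support: 47% - the adjustment moves the number,
# and it is only as good as the assumed turnout shares behind the weights.
```

This is exactly where Wainer’s doubts bite: when non-response is huge, the weights get large, and a wrong guess about who will actually turn out quietly becomes a wrong poll.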
Although it is well known that being a statistician means never having to say you’re certain (nothing in life is ever better than 3 to 1), I feel safe in betting the farm on Hillary (regardless of the release of emails). And also a Democratic Senate.
No matter your expertise, voters are fickle, easily influenced by the world of social media, and, of course, we never know what is simmering below the surface. Likeability influences voters more than policy differences, and we never know whether that bit of white cloth showing is a Klan robe.
In 2016, Comey’s re-opening of the Hillary e-mail investigation, Trump voters “hiding in the weeds,” and Bernie voters staying home were all beyond the ability of polls to predict.
Are the current polls missing disenchanted Trump voters, Biden voters “hiding in the weeds,” and highly motivated younger voters, women, and voters of color?
Politico is exuberant over current polling data.
The only poll that counts is the one on November 3rd.