OK, Let’s Talk About Those Polls

Survey research ain’t what it used to be.

Back in 2020, the Harvard Business Review summarized the changes that have diminished polling accuracy. The article described the industry as “living on borrowed time” and predicted that its increasing errors would not be corrected soon, or easily.

The basic problem is low response rates. Thanks to caller ID, fewer Americans pick up the phone when a pollster calls, so it takes more calls to reach enough respondents to make a valid sample. It also means that Americans are effectively screening themselves out of samples before a pollster ever speaks to them.

So even as our ability to analyze data has gotten better and better, thanks to advanced computing and an increase in the amount of data available to analysts, our ability to collect data has gotten worse. And if the inputs are bad, the analysis won’t be any good either.

It now takes more than 40 calls to reach just one respondent. And there really is no reliable way to assess how those who do respond differ from those who don’t. (I know my own children do not answer calls if they don’t recognize the phone number. Are they representative of an age group? An educational or partisan cohort? I have no idea, and neither do the pollsters.) There are also concerns that those who do respond are disproportionately rural.

These things matter.

A sample is only valid to the extent that the individuals reached are a random sample of the overall population of interest. It’s not at all problematic for some people to refuse to pick up the phone, as long as their refusal is driven by a random process. If it’s random, the people who do pick up the phone will still be a representative sample of the overall population, and the pollster will just have to make more calls.

Similarly, it’s not a serious problem for pollsters if people refuse to answer the phone according to known characteristics. For instance, pollsters know that African-Americans are less likely to answer a survey than white Americans and that men are less likely to pick up the phone than women. Thanks to the U.S. Census, we know what proportion of these groups are supposed to be in our sample, so when the proportion of men, or African-Americans, falls short in the sample, pollsters can make use of weighting techniques to correct for the shortfall.
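A rough sketch of how that weighting works, using entirely hypothetical numbers (not from any actual poll or Census table): if men should be 49% of the sample but make up only 40% of respondents, each man’s answer is counted up by a factor of 49/40, and each woman’s down correspondingly.

```python
# Sketch of simple post-stratification weighting (hypothetical numbers).
# Each respondent's weight is population share divided by sample share.

population_share = {"men": 0.49, "women": 0.51}   # target shares (illustrative)
sample_share     = {"men": 0.40, "women": 0.60}   # who actually answered

weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Suppose 55% of the men and 45% of the women in the sample back Candidate A.
support = {"men": 0.55, "women": 0.45}

# Unweighted estimate just averages over whoever happened to respond;
# the weighted estimate restores each group to its population share.
raw_estimate = sum(sample_share[g] * support[g] for g in support)
weighted_estimate = sum(sample_share[g] * weights[g] * support[g] for g in support)

print(round(raw_estimate, 3))       # 0.49
print(round(weighted_estimate, 3))  # 0.499
```

Note that this repair is only possible because the pollster knows both the respondents’ group memberships and the true population shares; that is exactly what is missing in the harder cases described next.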

The real problem comes when potential respondents to a poll are systematically refusing to pick up the phone according to characteristics that pollsters aren’t measuring. If a group like evangelicals or conservatives systematically excludes itself from polls at higher rates than other groups, there’s no easy way to fix the problem.

As the article notes, with response rates to modern polls below 15%, it becomes extremely difficult to determine whether systematic nonresponse problems are even happening.

These problems go from nagging to consequential when the characteristics that are leading people to exclude themselves from polls are correlated with the major outcome that the poll is trying to measure. For instance, if Donald Trump voters were more likely to decide not to participate in polls because they’re rigged, and did so in a way that wasn’t correlated with known characteristics like race and gender, pollsters would have no way of knowing.
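A toy simulation, with entirely made-up numbers, shows why this kind of nonresponse is invisible: if one candidate’s supporters answer polls at half the rate of the other’s, and that reluctance is unrelated to race, gender, or anything else the pollster measures, the poll is biased and demographic weighting cannot repair it.

```python
import random

random.seed(0)

# Toy model (made-up numbers): the electorate is split 50/50 between
# candidates A and B, but B's supporters answer polls at half A's rate.
N = 100_000
RESPONSE_RATE = {"A": 0.10, "B": 0.05}

responses = []
for _ in range(N):
    vote = random.choice(["A", "B"])              # true 50/50 split
    if random.random() < RESPONSE_RATE[vote]:     # self-selection into the poll
        responses.append(vote)

# Among those who answered, A's supporters are overrepresented.
poll_share_A = responses.count("A") / len(responses)
print(round(poll_share_A, 2))  # roughly 2/3, overstating A by ~17 points
```

Because the reluctance here is uncorrelated with any measured demographic, every weighting cell contains the same skew, and the pollster has no signal that anything is wrong.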

Then there’s the failure of likely voter models.

People tend to say they’re going to vote even when they won’t. Every major pollster has its own approach to a “likely voter” screen, but they all include a respondent’s previous voting behavior. As long as that behavior stays stable, these models work. But when something generates turnout among voters who have previously been absent, all bets are off. That happened when the Obama campaign energized previously apathetic voters, and since the Dobbs decision overturning Roe v. Wade, we’ve seen evidence of significantly increased registration and turnout among women who hadn’t previously voted.

As the Harvard article noted,

It may be the case that standard sampling and weighting techniques are able to correct for sampling problems in a normal election — one in which voter turnout patterns remain predictable — but fail when the polls are missing portions of the electorate who are likely to turn out in one election but not in previous ones. Imagine that there’s a group of voters who don’t generally vote and are systematically less likely to respond to a survey. So long as they continue to not vote, there isn’t a problem. But if a candidate activates these voters, the polls will systematically underestimate support for the candidate.

Polling is broken, and we need to stop hyperventilating about poll results. Remember, Trump has consistently underperformed his polling percentages in every primary thus far this year.
As the saying goes, the only poll that counts is the one on election day.

About Those Polls…

A recent polling “primer” intended for journalists has some useful cautions for all of us being inundated with reports about the “latest polling results” in this weird campaign season.

We are always (usefully) reminded that even the best polls are but a snapshot of public opinion at the time the poll is fielded, so results depend upon what voters have heard and seen at that particular time. Subsequent campaigning can, and more often than not does, change those perceptions.

It also should not be news that some polls are more equal than others: good polls are expensive, and a lot of what’s out there is at best unreliable and at worst, garbage. Flaws in the composition and size of the respondent pool (the sample), in the design of the questions, and in a number of other respects can make some surveys worse than useless.

But in addition to those standard cautions, recent changes in communications and the willingness of the public to answer questions cast further doubt on the accuracy of even the better-designed polls.

It should go without saying that “click on our link and tell us what you think” internet polls are worthless.

The increased use of mobile phones, especially, has challenged polling operations. That’s particularly true because there are significant differences between the populations that use cell phones and those who continue to keep their landlines, posing a huge challenge for the algorithms pollsters use to compensate for the difficulty of reaching mobile devices.

Further compounding the problem, the number of people willing to talk to a pollster when they are contacted has steadily declined; some estimates are that a mere 5% of those who answer their phones are willing to answer survey questions. Even if the number in the sample is increased in an effort to compensate, it is highly likely that the people who are willing to talk differ in some relevant ways from those who aren’t.

We saw the consequences of all this recently in the Michigan Democratic primary. The best polling has come a long way since “Dewey Defeats Truman,” but most of what earns headlines isn’t the best polling.

The troubling aspect of this is that even garbage polls have the ability to shape people’s perceptions and, ultimately, election results.