Polling

RichardGarfinkle · Sep 28, 2012

Polls are an irritating fact of political life, much maligned and badly misunderstood. They also seem to be a source of recurrent P&CE questions and disputes.

This thread is meant to address most of those questions, hopefully answering the basics and concentrating the more sophisticated inquiries into one useful resource.

Many of these issues have been addressed in a number of threads over the years by our various knowledgeable members, but those answers are scattered rather than stickied in one place. Feel free to add to this thread any insights or knowledge you wish.

Sampling

The underlying statistical principle behind polling is the idea of sampling.

Suppose you have a lot of objects that fall into a few categories. By way of example, suppose you have a bin of ping pong balls, some red, some blue, and some libertarian -- I mean orange.

You wish to know what proportion of the balls are which color. Why? Because you're a statistician and this is what your life has come to -- I mean, because in the land where math problems are invented that's what people do for fun.

There is a perfect method by which to determine this.

Take out the balls one by one and mark down what color each belongs to. When you are done you will have an exact determination of which color got the most votes -- I mean, what the proportion of each color is.

For example, if there were 1000 balls, 300 red, 650 blue and 50 orange, then the results would be:

Red 30%
Blue 65%
Orange 5%.

Counting a thousand balls is an annoying task. Imagine if you had 350 million ping-pong balls. It would be ridiculous to try to count them all.

Suppose however you know the balls are mixed together reasonably well, so that when you pull a ball out your odds of pulling a ball of a particular color are about the same as the proportion of that color in the whole mix.

One of the fundamental theorems of probability and statistics is called the "law of large numbers" (often called the law of averages). One way to phrase what this law says for this particular situation is that the bigger the sample you take the more likely the proportions in the sample are to be very close to the proportions in the entire set of objects you are sampling from.

So, in short, if we can take a sample that is sufficiently large our measurements of the sample are likely to be an accurate reflection of what measuring the whole would be like.
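A quick simulation makes the idea concrete. This is only a sketch: it assumes the bin is perfectly mixed (Python's random.sample stands in for the scooping), and the function name and the seed are made up for the example. It uses the 1000-ball bin from above.

```python
import random

def sample_proportions(bin_counts, sample_size, rng):
    """Draw sample_size balls without replacement from a well-mixed bin
    and return the share of each color in the sample."""
    balls = [color for color, count in bin_counts.items() for _ in range(count)]
    drawn = rng.sample(balls, sample_size)
    return {color: drawn.count(color) / sample_size for color in bin_counts}

BIN = {"red": 300, "blue": 650, "orange": 50}  # the 1000-ball example
rng = random.Random(2012)                      # arbitrary seed, for repeatability

for n in (10, 100, 1000):
    print(n, sample_proportions(BIN, n, rng))
```

Small samples tend to bounce around, larger ones tend to settle near 30/65/5, and a "sample" of all 1000 balls is just the full count.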

Well, maybe.

First, there's the problem of mixing. If all the orange balls are on top and we scoop from the top we'll end up with a disproportionate number of oranges.

Second, there's the problem that you need a large enough sample. If it's too small your results can easily be skewed by sheer bad luck.

There are statistical methods to discern how good your sample size is and these are used in determining margin of error.

Third, there's the fact that even if you do get a decent sample size the law of large numbers says it's unlikely, but not impossible, that you'll get a weird result. That's why sampling once is not enough.

Margin of error and the law of large numbers are both critical in understanding any sampling result. If a result says 50% +/- 3% it means that it is most likely that the answer lies somewhere between 47% and 53%.
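For a rough idea of where a figure like +/- 3% comes from, here is a sketch using the usual textbook formula for a proportion. It assumes a simple random sample and a 95% confidence level (z = 1.96); neither is stated for the example above, and real polls often adjust for their design. The function names are made up.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for_margin(margin, p=0.5, z=1.96):
    """Smallest n whose margin of error is at most `margin` (worst case p = 0.5)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

print(round(margin_of_error(0.5, 1067), 3))   # 0.03
print(sample_size_for_margin(0.03))           # 1068
```

So under these assumptions a 50% +/- 3% result corresponds to roughly a thousand respondents.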

Suppose you got the following results in five samplings with a margin of error of 5%:

45%, 39%, 60%, 44%, 42%.

The best guess is that the 60% is an aberration and the others are all within the same range, so probably the true value is somewhere in the low 40s.

But suppose you saw this as a graph. It would have a big dip between the first and second values, then a jump up to 60 then a fall back down. The temptation would be to tell a story of massive volatility in results when in fact they all fit neatly into a single statistical result.
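The eyeball reasoning above can be written down as a crude check. This is a heuristic sketch, not a standard statistical test: it takes the median as a stand-in for the true value and flags anything further away than the stated margin of error.

```python
from statistics import median

results = [45, 39, 60, 44, 42]   # the five samplings above, in percent
MARGIN = 5                       # margin of error, in percentage points

center = median(results)
consistent = [r for r in results if abs(r - center) <= MARGIN]
outliers = [r for r in results if abs(r - center) > MARGIN]

print("median:", center)                                    # 44
print("consistent with the median:", consistent)            # [45, 39, 44, 42]
print("likely aberrations:", outliers)                      # [60]
print("average of the consistent results:", sum(consistent) / len(consistent))  # 42.5
```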

RichardGarfinkle · Sep 28, 2012

People Aren't Ping Pong Balls.

The above description of sampling relies on a single simple externally discernible characteristic of the objects sampled: ping-pong ball color. But polling is after less simple concepts: human attitudes and decisions. These cannot be directly observed. To try to find them out we use that delightful and imperfect tool of writers, language. In particular, pollsters ask questions.

But humans don't respond to even simple questions in a simple fashion.

For example, you sit down to lunch at a restaurant with a friend and ask the innocuous question "What are you going to order?"

You might get "I'll get the steak,"

but you also might hear

"I'd like to get the steak, but my doctor told me to cut down on red meat, but my friend Kimmy who talks to that nutritionist what's-his-name said that I could have red meat as long as I didn't use too much salt, so I always carry a shaker of salt substitute. But my brother-in-law ate here last week and he said the steak was tough and tasteless, but you know he and my sister are going through a rough patch and that can make anything taste bad..."

Eventually when the waiter comes around this person will make a definite decision because food has to be ordered so that lunch can be eaten. But except at that single moment of ordering the person's mind can be a whirl of indecision.

Humans do not necessarily answer even direct simple questions with direct simple answers. Pollsters try to solve this by putting in multiple choice possibilities, restricting the answerer's options. This often means that the question being answered by the answerer is not the question being asked.

Pollsters try to refine their questions in one of two directions:

1. Honest pollsters try to phrase the questions in such a way as to elicit a reasonably useful approximation of the person's opinion as it currently stands.

2. Dishonest pollsters, sometimes called push-pollsters or lying treacherous evil propagandists, try to phrase their questions in order to elicit the answers they want to receive.

The final concern is that people take in information and use it to determine their decisions. Polls can be part of this information. Therefore, people can change their minds about things like who they are going to vote for, whether they are going to contribute to a candidate, and whether they are going to vote at all based on polling results. It may also affect how they answer the next pollster who asks them.

RichardGarfinkle · Sep 28, 2012

Bias and Correction

In order for sampling to be accurate it must accurately represent the population being sampled from.

In a sense there's a paradox here. How can you know beforehand that your sampling method is accurate if you don't already know what the outcome is supposed to be?

The answer is you measure your sampling against properties of the population that you can determine by other means than sampling.

Let's take one of the current challenges facing telephone pollsters: cellphone-only households.

Demographic data show that more young people than older people have no landlines, doing all their communication with cells and computers. Therefore a polling method that contacts only landline telephones will sample a lower proportion of younger voters than a polling method that contacts both.

This kind of problem is a methodological flaw in the polling itself. The only really proper way to fix it is to expand the sample pool to include the people it was excluding.

An alternative but weaker and riskier solution (the risk being a loss of accuracy) is to weight the answers of some sampled people over others. For example, if you think your sample underrepresents younger people, you might count the answers of younger people as worth more in your total than those of older people. This is called weighting and the result produced is called a weighted value.
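Here is a minimal sketch of what weighting does to a result. Every number in it is invented for illustration: a sample of 1000 in which only 100 respondents are young, a population that is assumed to be 30% young, and made-up support levels. Real pollsters weight on several variables at once and the details vary by firm.

```python
def unweighted_support(groups):
    """Share of all respondents who support the candidate, ignoring group sizes."""
    return sum(g["supporters"] for g in groups) / sum(g["respondents"] for g in groups)

def weighted_support(groups):
    """Each group's support rate, counted in proportion to its assumed population share."""
    return sum(g["population_share"] * g["supporters"] / g["respondents"] for g in groups)

# Hypothetical sample: young people are 30% of the population but 10% of the sample.
sample = [
    {"name": "young", "population_share": 0.30, "respondents": 100, "supporters": 60},
    {"name": "old",   "population_share": 0.70, "respondents": 900, "supporters": 360},
]

print(round(unweighted_support(sample), 2))  # 0.42
print(round(weighted_support(sample), 2))    # 0.46
```

The risk mentioned above shows up in the young group: 100 respondents are being asked to speak for 30% of the population, so any bad luck in that small group gets magnified.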

The other complication in polling has to do with the basic act of decision-making. A person with an opinion on an election who does not vote is less relevant to the sample than a person who is actually going to vote. Because of this pollsters consider different sample spaces:

Eligible voters (or the general population): This is everyone who can legally vote. Not commonly polled because of low turnout numbers.

Registered voters: This is a determinable category. A person who is registered to vote has done whatever is necessary to be able to actually vote if they choose to.

Voted in the last election: This is also a determinable category. The presumption here is that a person who did vote is more likely to vote than a person who only registered.

Likely voters: The holy grail of polling and a fuzzy set if ever there was one. This semi-mythic population consists of those people who are likely to do the work to go vote. Each polling firm has its own definitions of likely voters. Who is likely to vote is an incredibly complex matter, depending on how easy or hard it is to vote in the person's district, what kinds of laws are in place to challenge voters, etc.

The biggest, most invisible bias that shows up in polling is the way the polls are reported: as single-number results, as if the number alone had meaning.

This is not how proper statistics are reported and not how they need to be interpreted.

The magical power of a number as answer is deceptive in the case of polls. Context matters. Methodology matters. And because elections matter it's worth the time and effort to understand the meaning of these predictors.

Don · Sep 28, 2012

RichardGarfinkle said:
2. Dishonest pollsters, sometimes called push-pollsters or lying treacherous evil propagandists, try to phrase their questions in order to elicit the answers they want to receive.

See also: Luntz, Frank.

Monkey · Sep 28, 2012

Thanks for taking the time to explain all that, Richard.

Magdalen · Sep 28, 2012

RichardGarfinkle said:
People Aren't Ping Pong Balls.

. . .
The final concern is that people take in information and use it to determine their decisions. Polls can be part of this information. Therefore, people can change their minds about things like who they are going to vote for, whether they are going to contribute to a candidate, and whether they are going to vote at all based on polling results. It may also affect how they answer the next pollster who asks them.

Yes, thanks for the info! Do you have any stats on the % of respondents who just flat out lie? I've never responded truthfully to a poll question in my life, aberrant bitch that I am.

RichardGarfinkle · Sep 28, 2012

Magdalen said:
Yes, thanks for the info! Do you have any stats on the % of respondents who just flat out lie? I've never responded truthfully to a poll question in my life, aberrant bitch that I am.

No, there can't be stats on how many people lie to pollsters.

In order to find out if someone lied you would either have to track their actions (which is illegal in the case of voting) or ask them if they've ever lied to a pollster and then trust the results of your second questioning.

There have also been suggestions that people should lie to pollsters. The late Chicago Tribune columnist (and godlike writing genius) Mike Royko suggested it during the 1980s. I can't find the original column but here's a link to one he wrote talking about reactions to that column:
http://articles.chicagotribune.com/...0637_1_pollsters-public-radio-show-cheap-shot

kuwisdelu · Sep 28, 2012

I'll be back later to sink my fangs into this :evil

RichardGarfinkle · Sep 28, 2012

kuwisdelu said:
I'll be back later to sink my fangs into this

I was hoping you would.

Shadow_Ferret · Sep 28, 2012

Is there a question or a discussion point? Or is this just a post about general concepts of polling?

As far as the people changing their minds, I assumed everyone understood polls are just a snapshot in time. A poll taken last week has no bearing on attitudes this week.

RichardGarfinkle · Sep 28, 2012

Shadow_Ferret said:
Is there a question or a discussion point? Or is this just a post about general concepts of polling?

As far as the people changing their minds, I assumed everyone understood polls are just a snapshot in time. A poll taken last week has no bearing on attitudes this week.

It's meant to be a general explanation of polling because it shows up so often in political stories.

Williebee · Sep 28, 2012

Richard, thanks for this.

There have been a number of insights and observations from members in the past on polling -- discussions over the relative merits of polls that just pull from landline phones, for example. It's worthwhile to try and pull the parties out and look at just the principles and process behind polling itself.

Polling, particularly political polling, is fascinating to some, infuriating to others, and a mortgage payment to at least some of both groups. Polling is a multi-billion dollar industry.

Whether they are trying to spin us or honestly inform us, it is worth our time to be able to tell the difference, and understand both messages.

NOTE PLEASE: This thread is not the place to argue the merits of any one particular poll, although "pulls" from past or present polls to use as an example would be appropriate. In other words, don't slight the results of a poll just because it came from, or was funded by "that group." It is of more value to the community to cite examples of a polling question or target audience and show where or how it is faulty. Thanks!

I think we'll sticky this, at least until after the election.

Maxinquaye · Sep 28, 2012

Isn't the law of averages a particularly dangerous thing to base a sample on, given the gambler's fallacy inherent in say a simple binary roulette wheel?

I mean, if you spin a roulette wheel ten times, there is nothing that says that a particular colour will come up based on previous spins. So, if you have bet red five times, there is no reason to believe that the next one will come up black because of a law of averages. The wheel has no memory, and the probability that the ball will land on black again is about 48% (not fifty because there are one or two green slots on the wheel, depending on which side of the Atlantic you're on).

Translate that into fuzzy alternatives and the probabilities will be even worse. I'm no mathematician though, so I'm just sort of asking those that are about this.

RichardGarfinkle · Sep 28, 2012

Maxinquaye said:
Isn't the law of averages a particularly dangerous thing to base a sample on, given the gambler's fallacy inherent in say a simple binary roulette wheel?

I mean, if you spin a roulette wheel ten times, there is nothing that says that a particular colour will come up based on previous spins. So, if you have bet red five times, there is no reason to believe that the next one will come up black because of a law of averages. The wheel has no memory, and the probability that the ball will land on black again is about 48% (not fifty because there are one or two green slots on the wheel, depending on which side of the Atlantic you're on).

Translate that into fuzzy alternatives and the probabilities will be even worse. I'm no mathematician though, so I'm just sort of asking those that are about this.

The law of large numbers says what is likely in a large enough sample. It doesn't change the probabilities of any individual outcome. The gambler's mindset is the delusion that things have to even out in the long run for that gambler. That's untrue. What is true is that if you average the outcomes over all the bets taken over all gambles ever taken in a particular kind of fair game, the result of the average is likely to be close to the probability of winning in that game.

The larger the number of events sampled, the higher the likelihood. But there can be a lot of individual variations within some parts of that sample. One person could go for years winning pretty consistently and be no more than an outlier in the space of individual gamblers. That does not change the fact that that person's next bet has the same probability of winning as a person who's had a life long losing streak.
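Both halves of this can be seen in a simulation. The sketch below assumes a fair European wheel (18 red, 18 black, 1 green) and an arbitrary seed; the function name is made up. It reports how often black comes up overall, and how often black comes up immediately after five reds in a row.

```python
import random

def simulate_wheel(spins, rng, streak=5):
    """Return (overall black frequency, black frequency right after `streak` reds)."""
    pockets = ["red"] * 18 + ["black"] * 18 + ["green"]  # single-zero wheel
    black_total = 0
    after_streak = 0
    black_after_streak = 0
    run_of_red = 0
    for _ in range(spins):
        result = rng.choice(pockets)
        if run_of_red >= streak:          # the previous `streak` spins were all red
            after_streak += 1
            black_after_streak += result == "black"
        black_total += result == "black"
        run_of_red = run_of_red + 1 if result == "red" else 0
    return black_total / spins, black_after_streak / after_streak

overall, after_reds = simulate_wheel(200_000, random.Random(5))
print(f"black overall:          {overall:.3f}")
print(f"black after five reds:  {after_reds:.3f}")
print(f"exact probability:      {18 / 37:.3f}")
```

Both frequencies should land near 18/37. The long-run average behaves, and a streak of reds does nothing to the next spin.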

Priene · Sep 28, 2012

Maxinquaye said:
Isn't the law of averages a particularly dangerous thing to base a sample on, given the gambler's fallacy inherent in say a simple binary roulette wheel?

It's the opposite. The law of averages says that if your sample size is large enough then individual aberrations will be ironed out. For instance, if you decided to take an opinion poll by stopping the first person you saw, you could get unlucky and pick the leader of the local fascist party. The probability wouldn't be high, because there aren't many fascists in most countries, but it wouldn't be that low, because there are a few. But if you picked a thousand, say, the chances of any more than a small proportion being fascists would be incredibly low. As long as your sampling technique is good and your sample size high enough, your poll will provide a reasonable estimation of the views of the population as a whole.
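To put numbers on that, here is a small sketch using the exact binomial distribution. The 1% figure is purely an assumption for illustration, as is the 3% cutoff, and it assumes each of the thousand people is picked independently at random.

```python
from math import comb

def prob_more_than(k, n, p):
    """Probability that more than k of n independently sampled people
    belong to a group that makes up a share p of the population."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))

FASCIST_SHARE = 0.01   # hypothetical: 1% of the population

# A "poll" of one person: the chance that the whole sample is fascist.
print(prob_more_than(0, 1, FASCIST_SHARE))

# A poll of 1000: the chance that more than 3% of the sample is fascist.
print(prob_more_than(30, 1000, FASCIST_SHARE))
```

The first number is the 1% you started with. The second is far below one in a million.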

kuwisdelu · Sep 28, 2012

I'm going to talk in general here, because lots of times we'll discuss scientific studies, too, and the same theory applies.

RichardGarfinkle said:
Because you're a statistician and this is what your life has come to -- I mean, because in the land where math problems are invented that's what people do for fun.

Ouch.

One of the fundamental theorems of probability and statistics is called the "law of large numbers" (often called the law of averages). One way to phrase what this law says for this particular situation is that the bigger the sample you take the more likely the proportions in the sample are to be very close to the proportions in the entire set of objects you are sampling from.

I wouldn't say it's often called the law of averages. That leads to confusion like Max's. The law of averages is a fallacy. It doesn't actually exist. We only need to worry about the Law of Large Numbers. To be more specific, as the sample size goes to infinity, the sample mean approaches the expected value.

There are statistical methods to discern how good your sample size is and these are used in determining margin of error.

Not quite.

Sample size is usually evaluated in tandem with statistical power. The power of a test is the probability of finding a significant difference when that difference actually exists.

With polls, you can look at power this way. You think Obama is leading Romney by at least x% and want to find out if that's true or not, using a poll of sample size n. A power of 80% means that if Obama really is leading Romney by at least x%, then at least 80% of the time you poll n people, you will detect that lead with statistical significance. (I'll talk more about statistical significance in a bit.)

With a scientific study, say you want to find out if boys are at least x inches taller than girls. Then power tells you the probability of detecting a difference of x inches in their height when that difference actually exists, any time you do a study with sample size n.

Typically, you would decide what level of power you want (80% is a typical amount) and then calculate how large of a study you need. As you might have noticed, this also requires the specification of that x amount difference that you want to detect. You typically choose that by whatever amount is a "meaningful difference" to you, and that can vary a lot.

If you're one of Obama's advisors and you want to do a poll to see how far ahead or behind he is in a state (maybe to decide whether you want to spend more campaign money there or not), then a lead of 1% probably isn't very meaningful to you, because 1% isn't a very secure lead and could change tomorrow. Maybe you want to detect a lead of 10%, because maybe that's solid enough that you feel comfortable spending a little less campaign money in that state. Then you would decide what level of statistical power you're comfortable with, and calculate the number of people you need to poll to be confident in your results.
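As a rough sketch of that calculation, the function below uses one common normal-approximation formula for a one-sided test of a single proportion. The simplifications are mine: a two-candidate race where everyone picks one or the other (so a lead of x means the leader has 50% + x/2), a simple random sample, and a 5% significance level. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def poll_size_for_power(lead, power=0.80, alpha=0.05):
    """Approximate respondents needed to detect `lead` (0.10 = ten points)
    when testing 'the race is tied' against 'the leader has 50% + lead/2'."""
    p0 = 0.5
    p1 = 0.5 + lead / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # threshold for significance
    z_power = NormalDist().inv_cdf(power)       # extra room needed for power
    numerator = z_alpha * math.sqrt(p0 * (1 - p0)) + z_power * math.sqrt(p1 * (1 - p1))
    return math.ceil((numerator / (p1 - p0)) ** 2)

print("10-point lead:", poll_size_for_power(0.10))
print(" 1-point lead:", poll_size_for_power(0.01))
```

Under these assumptions a 10-point lead needs a poll in the hundreds, while a 1-point lead needs tens of thousands of respondents, which is why the choice of "meaningful difference" matters so much.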

Now if you're doing a scientific study where you're measuring a quantitative amount, then you also need to get an estimate of the variability in the population. If most boys are pretty much the same height, and most girls are pretty much the same height, then you can get away with a much smaller sample size than if their heights vary a lot. You often get this estimate with a pilot study.

The sample variation is an estimate of the population variation, and it gets better with a large sample size because of the Law of Large Numbers that Richard talked about. You combine this estimate with the sample size to calculate your standard error, which often gets turned into...

Third, there's the fact that even if you do get a decent sample size the law of large numbers says it's unlikely, but not impossible, that you'll get a weird result. That's why sampling once is not enough.

Margin of error and the law of large numbers are both critical in understanding any sampling result. If a result says 50% +/- 3% it means that it is most likely that the answer lies somewhere between 47% and 53%.

Suppose you got the following results in five samplings with a margin of error of 5%:

45%, 39%, 60%, 44%, 42%.

The best guess is that the 60% is an aberration and the others are all within the same range, so probably the true value is somewhere in the low 40s.

But suppose you saw this as a graph. It would have a big dip between the first and second values, then a jump up to 60 then a fall back down. The temptation would be to tell a story of massive volatility in results when in fact they all fit neatly into a single statistical result.

...unfortunately, rather than actually reporting the standard error, we get told the margin of error. The problem is, we don't know what confidence level they're talking about, but it's almost always 95%.

So the standard error (in the simplest cases) is sample standard deviation divided by the square root of the sample size. The standard error is an estimate of the standard deviation in the distribution of your sample means. So basically, if you did this study of sample size n many, many times and reported the mean, then took the standard deviation of those means, it should be close to the standard error.
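You can check that claim by brute force. The sketch below invents a population (normally distributed heights with mean 64 and standard deviation 3 inches, all hypothetical), computes the standard error from a single sample, then repeats the study 2000 times and takes the standard deviation of the means.

```python
import math
import random
import statistics

rng = random.Random(42)          # arbitrary seed
TRUE_MEAN, TRUE_SD = 64.0, 3.0   # hypothetical heights in inches
N = 100                          # sample size of each study
REPEATS = 2000                   # number of repeated studies

def draw_sample():
    return [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]

# Standard error estimated from one sample: s / sqrt(n).
one_sample = draw_sample()
standard_error = statistics.stdev(one_sample) / math.sqrt(N)

# Standard deviation of the means across many repeated studies.
means = [sum(sample) / N for sample in (draw_sample() for _ in range(REPEATS))]
sd_of_means = statistics.stdev(means)

print(f"standard error from one sample: {standard_error:.3f}")
print(f"sd of {REPEATS} sample means:      {sd_of_means:.3f}")
```

The two numbers should come out close to each other, and close to 3 / sqrt(100) = 0.3.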

You use the standard error when doing hypothesis tests or calculating confidence intervals. Polls usually report 95% confidence intervals (although they don't usually say that). You are 95% "confident" that the "true value" lies within the limits (the given value plus or minus the "margin of error") reported. This doesn't actually mean there is a 95% probability that you are "right." What it actually means is that if you repeated this study of sample size n many, many times and built an interval each time, about 95% of those intervals would contain the true value.
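A simulation can make the 95% figure concrete. This sketch runs 2000 imaginary polls of 1000 people each against a made-up true value of 45%, builds the usual plus-or-minus 1.96 standard errors interval for each poll, and counts how often the interval contains the true value. The true value, seed and function name are all assumptions for the example.

```python
import math
import random

def interval_covers_truth(true_p, n, rng, z=1.96):
    """Run one simulated poll and report whether its interval contains true_p."""
    hits = sum(rng.random() < true_p for _ in range(n))
    p_hat = hits / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= true_p <= p_hat + margin

rng = random.Random(7)              # arbitrary seed
TRUE_P, N, POLLS = 0.45, 1000, 2000

coverage = sum(interval_covers_truth(TRUE_P, N, rng) for _ in range(POLLS)) / POLLS
print(f"{coverage:.1%} of the intervals contained the true value")
```

The answer should come out near 95%. Any single poll either caught the true value or it didn't, and you can't tell which from that poll alone.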

All of this is possible thanks to a cool thing called the Central Limit Theorem. That says that pretty much no matter what the underlying population looks like (technically, as long as its variance is finite), the distribution of means approaches a normal distribution as the sample size grows (that's that nice "bell curve" you always hear about). No matter how skewed the girls' heights are, if you sample a bunch of girls and take the mean of their heights, and repeat that many, many times, the distribution of those means will be approximately normal. (That's just a little insight into how we build those confidence intervals and why it works.)
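Here's a sketch of that in action, using a deliberately lopsided population. The exponential distribution is my choice for the example (it has a skewness of 2, where a normal distribution has 0), and the skewness helper is a plain moment-based formula written for this post.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution."""
    m = sum(values) / len(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

rng = random.Random(11)   # arbitrary seed
N, REPEATS = 100, 5000

# Individual draws from a heavily skewed population.
raw = [rng.expovariate(1.0) for _ in range(20000)]

# Means of many samples of size N from the same population.
means = [sum(rng.expovariate(1.0) for _ in range(N)) / N for _ in range(REPEATS)]

print(f"skewness of individual values: {skewness(raw):.2f}")
print(f"skewness of sample means:      {skewness(means):.2f}")
```

The individual values stay badly skewed, while the sample means come out much closer to symmetric.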

One thing worth noting is that polling is a bit different from most studies, because you're looking at proportions rather than measuring a quantitative thing like heights. Because of the Central Limit Theorem, we can still use all of the exact same tools to build confidence intervals, but standard errors are calculated a little differently. With proportions, the standard error depends on the sample size and what the proportions actually are. If a race is closer to 50-50, then your error is going to be higher than if it's 40-60, for the same sample size. That's a bit unique to polls, and because we know what this relationship looks like, it's easier to do sample size and power calculations.
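In the simplest case (a simple random sample, which is an assumption here) the standard error of a proportion p from n respondents is sqrt(p(1 - p) / n). A tiny sketch, with a made-up function name, shows how it changes with the split:

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

for p in (0.5, 0.4, 0.1):
    print(f"p = {p:.1f}, n = 1000: standard error = {proportion_standard_error(p, 1000):.4f}")
```

The error peaks at 50-50 and shrinks as the split gets more lopsided, though the difference between 50-50 and 40-60 is small.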

ETA: I forgot to talk about statistical significance.

So you'll hear a lot about statistical significance in scientific studies. Less so with polls, because you usually get confidence intervals there. So let's talk about hypothesis tests for a moment.

Say you want to test whether boys' and girls' heights are the same or not. Then "boys and girls are the same height, on average" is the null hypothesis, and "boys' and girls' heights are different, on average" is the alternative hypothesis. So you do your study and you calculate the difference between the boys' and girls' average heights, and you calculate your standard error. Now if there is no true difference in the mean height of boys and girls, then we know due to the Central Limit Theorem that as the sample size approaches infinity, the distribution of the mean difference of their heights should approach a normal distribution centered at zero with a standard deviation approximately equal to the standard error you calculated.

You can use this knowledge to set a threshold where, if the mean difference is larger than that threshold, then you can say that boys' and girls' heights really are different. Because if there weren't a difference, then you would only observe a difference that large or larger, say, 5% of the time you did a study with the same sample size. That 5% is the significance level. It means that if there's really no difference, you would observe a difference at least as large as the one you did only 5% of the time, so that's sufficient evidence to say that there probably is a difference. You can set the % at whatever level is comfortable for your study, and the level of statistical significance you want determines the threshold you use.
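Here is a minimal sketch of that kind of test, working from summary numbers. It uses the large-sample normal approximation (a z-test) with a two-sided alternative. For small samples you'd normally use a t-test instead. The heights, standard deviations and sample sizes are all invented.

```python
import math
from statistics import NormalDist

def two_sample_z_test(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (z, two-sided p-value) for the difference between two sample means."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

ALPHA = 0.05  # significance level

# Hypothetical study 1: a half-inch difference, 50 children per group.
z, p = two_sample_z_test(64.5, 3.0, 50, 64.0, 3.0, 50)
print(f"z = {z:.2f}, p = {p:.3f}, significant: {p < ALPHA}")

# Hypothetical study 2: a five-inch difference, same group sizes.
z, p = two_sample_z_test(69.0, 3.0, 50, 64.0, 3.0, 50)
print(f"z = {z:.2f}, p = {p:.3g}, significant: {p < ALPHA}")
```

With these made-up numbers the half-inch difference is not significant at the 5% level, while the five-inch difference easily is.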

kuwisdelu · Sep 28, 2012

RichardGarfinkle said:
In order for sampling to be accurate it must accurately represent the population being sampled from.

This is the really important thing. With polls, sampling technique tends to be far, far more important than sample size. If you have a representative random sample, you really don't need that large of a sample to get good results. If you have a biased sample, you're already screwed and sample size doesn't matter. Too often on the internet, people will criticize sample size without really understanding what they're talking about enough to know whether the sample size was sufficient or not. Knowing what the target population is and how the sample was drawn from it is key. But even with a biased sample, you can get meaningful results... you just can't extrapolate them to a different population.
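A toy simulation shows how lopsided the tradeoff is. Everything here is invented: a population that is 30% young, with 60% support among the young and 40% among the old, so the true overall support is 46%. One poll reaches 100,000 people but can't reach anyone young (think landline-only, taken to the extreme). The other reaches just 2,000 people chosen at random.

```python
import random

rng = random.Random(2012)   # arbitrary seed
YOUNG_SHARE = 0.30
SUPPORT = {"young": 0.60, "old": 0.40}
TRUE_SUPPORT = YOUNG_SHARE * SUPPORT["young"] + (1 - YOUNG_SHARE) * SUPPORT["old"]

def poll(n, reach_young):
    """Keep contacting random people until n have answered; return the share saying yes.
    If reach_young is False, young people are never reached."""
    yes = 0
    answered = 0
    while answered < n:
        group = "young" if rng.random() < YOUNG_SHARE else "old"
        if group == "young" and not reach_young:
            continue
        answered += 1
        yes += rng.random() < SUPPORT[group]
    return yes / n

biased_big = poll(100_000, reach_young=False)
random_small = poll(2_000, reach_young=True)

print(f"true support:              {TRUE_SUPPORT:.3f}")
print(f"biased poll, n = 100,000:  {biased_big:.3f}")
print(f"random poll, n = 2,000:    {random_small:.3f}")
```

The huge biased poll settles very precisely on the wrong answer (about 40%), while the small random one lands near the true 46%.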

And more generally, with studies that are concerned with quantitative results, you need to consider the variability in the population before you can evaluate sample size. As I mentioned earlier, if boys are pretty much all the same height and girls are pretty much all the same height, then you don't need a big sample to detect a difference between the two. Likewise, if the "meaningful difference" is significantly large, then you won't need as big of a sample. If you only care about a 10% or more lead, you don't need as large of a poll. It's important to understand what you're trying to measure.

RichardGarfinkle · Sep 28, 2012

Kuwisdelu, thanks for clarifying all that, and sorry about the statistician joke.

kuwisdelu · Sep 28, 2012

Feel free to ask questions, class.

Do you want to learn about surveys and Likert scale data?

Or we could talk regression and ANOVA.

I can make graphs!

rugcat · Sep 28, 2012

RichardGarfinkle said:
Kuwisdelu, thanks for clarifying all that, and sorry about the statistician joke.

Sadly, clarifying sometimes is a synonym for "eyes glazing over and brain hurting."

Also sadly, an example that very few things in science (or life) are as simple as we like to think they are.

Haggis · Sep 28, 2012

rugcat said:
Sadly, clarifying sometimes is a synonym for "eyes glazing over and brain hurting."

Also sadly, an example that very few things in science (or life) are as simple as we like to think they are.

I protected my brain. I simply pretended to read all that stuff above, and nodded my head occasionally while wiping away the drool.

kuwisdelu · Sep 28, 2012

rugcat said:
Sadly, clarifying sometimes is a synonym for "eyes glazing over and brain hurting."

Also sadly, an example that very few things in science (or life) are as simple as we like to think they are.

I can explain it more simply, but it would be more verbose, and I'm not sure where to start. If there's a particular part you'd like to see simplified and explained further, I'd be happy to try.

This is easier to explain in person, where I have direct feedback.

Romantic Heretic · Sep 28, 2012

Thanks Richard and kuwisdelu. A fascinating subject. I'll be coming back to this thread again to see if I understand this. I'm not sure at the moment.

kuwisdelu · Apr 10, 2013

You know you're a statistician when you wish you could correct every science journalist who uses the words "probability," "odds," and "likelihood" interchangeably.

They're different things, people!

(FYI, 9 times out of 10, the word you want is "probability".)

Williebee · Apr 10, 2013

Really? What are the odds of that?
