We've all seen media polls. What we don't often see are the details of the polling apparatus; most significantly, "Who and how many are polled?" A second issue is hidden from view. . .polling reliability (or confidence). Some of these issues are discussed below.
We take polls for one big reason: collecting data from everyone is too time consuming and (most importantly) too expensive. In a perfect world, poll-taking would be simple. In the real world it is not. Let's discuss the perfect world first.
In the perfect world we'd put the names of all voters in a great big hat, shake the hat up, and draw names that would constitute the sample. These people would be contacted and surveyed; their responses would lead to the "results" (something like: Clint Billon has 54% of the vote with a polling error of ± 3%). If you've followed election polling, you'll have noticed that each poll seems to give a different result. This is due to the randomness of the selection process.
There are n sampled people. Suppose X of them have some property (such as "will vote for Clint Billon"). We estimate the fraction of all people having this property with
ESTIMATE = X ÷ n
Multiply by 100% to obtain a percentage.
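As a rough illustration of the big-hat draw, here is a minimal Python sketch. The population of 100,000 voters, the 54% level of support, and the function name draw_poll are all made up for illustration. It draws several independent samples of 1000 and computes ESTIMATE = X ÷ n for each one.

```python
import random

def draw_poll(voters, n, rng):
    """Draw n voters at random (the 'big hat') and return ESTIMATE = X / n."""
    sample = rng.sample(voters, n)   # draw names without replacement
    x = sum(sample)                  # X = number sampled who have the property
    return x / n

# Hypothetical population: 1 = will vote for Billon, 0 = will not.
voters = [1] * 54_000 + [0] * 46_000
rng = random.Random(2024)            # fixed seed so the run is repeatable

estimates = [draw_poll(voters, 1000, rng) for _ in range(5)]
for e in estimates:
    print(f"{100 * e:.1f}%")         # multiply by 100% to get a percentage
```

Each call to draw_poll plays the role of one poll. The printed percentages should cluster around 54% without agreeing exactly, which is the poll-to-poll variation described above.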
Using this method, and operating in this perfect world, the polling error is related to the sample size n through (approximately)
POLLING ERROR = 1 ÷ SQRT(n)
where SQRT stands for "square root of." Again, multiply by 100% to obtain the value in % form. (This "recipe" or "formula" should only be used when the fraction is somewhere near half -- 50%. There are adjustments for other cases.) So if the sample size is n = 100 then the polling error is approximately ± 10%. If n = 1000 we get a polling error of approximately ± 3.2%. In fact, it takes about 1000 people sampled in this fashion (drawn from the big hat) to achieve a margin of error of approximately ± 3%.
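The arithmetic can be checked with a few lines of Python. This is only a sketch of the rule of thumb POLLING ERROR = 1 ÷ SQRT(n); the function names polling_error and sample_size_for are invented here, and the second one simply turns the rule around to ask how large n must be for a given error.

```python
import math

def polling_error(n):
    """Approximate polling error (as a fraction) for a sample of size n.

    Uses the rule of thumb 1 / sqrt(n), which is only meant for
    fractions somewhere near one half.
    """
    return 1 / math.sqrt(n)

def sample_size_for(error):
    """Smallest n whose rule-of-thumb polling error is at most `error`."""
    return math.ceil(1 / error ** 2)

for n in (100, 1000):
    print(n, f"±{100 * polling_error(n):.1f}%")   # ±10.0% and ±3.2%

print(sample_size_for(0.03))                       # 1112
```

By this rule a polling error of exactly ± 3% needs a little over 1100 people, which is in line with "about 1000."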
So, suppose we sample (randomly, of course) 1000 voters; 541 favor candidate Clint Billon. That's 54.1% with a polling error of ± 3.2%. It's always good to look at the endpoints of the implied interval: 54.1 - 3.2 = 50.9, 54.1 + 3.2 = 57.3. So, we might also state our interval as the range from 50.9% to 57.3%. Of course, if we are the media, we round these values off to avoid scaring those readers who don't like decimals: 51% to 57%. (Actually, the results should be stated to the nearest 0.1; however, the media errs in the proper direction. That's another discussion for another page.)
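The same example can be worked in Python. This is a sketch only; the function name poll_interval is invented for illustration, and it assumes the rule of thumb POLLING ERROR = 1 ÷ SQRT(n).

```python
import math

def poll_interval(x, n):
    """Return (estimate, low, high) in percent.

    Uses ESTIMATE = X / n and the rule-of-thumb
    POLLING ERROR = 1 / sqrt(n), both multiplied by 100%.
    """
    estimate = 100 * x / n
    error = 100 / math.sqrt(n)
    return estimate, estimate - error, estimate + error

estimate, low, high = poll_interval(541, 1000)
print(f"{estimate:.1f}% ({low:.1f}% to {high:.1f}%)")   # 54.1% (50.9% to 57.3%)
```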
Then, what does this 51% to 57% mean? There's no uncertainty about the polled people; the polling error must reflect something about those people left unpolled (almost everyone it turns out). Most people understand that the range of values given by the poll estimates a percentage of all voters. That is, the pollsters are telling you that between 51% and 57% of all voters will vote for Billon.
The best question you might ask yourself is: How can they be so sure? After all, if 60% of all voters intend to vote for Billon, it certainly seems possible (if improbable) that 54% of the sampled voters would be for Billon. If this has happened, then the poll is in error. How can they be so sure?
They're only 95% sure! This (95%) is the number that nobody ever tells you about. There's no way of knowing (before the fact) whether the poll result (including margin of error) is correct or not. In our example, we've got a successful poll IF the percentage of ALL voters favoring Billon is between 51% and 57% (within 3% of 54%). Otherwise we have a failed poll. It's impossible to determine which type of poll we have. However. . .and this is the point. . .it is the case that 95% of all polls conducted in this fashion are successful ones. The 95% refers to the reliability -- or statistical confidence -- of the method that produces the final results (54% ± 3%). The choice of 95% reliability is due mainly to historical factors.
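One way to see where the 95% comes from is to pretend we know the truth and run many polls. The sketch below makes several assumptions for illustration: a true fraction of 54%, 1000 simulated polls of 1000 voters each, voters drawn independently rather than from a finite hat, and invented function names. It counts how often the interval ESTIMATE ± POLLING ERROR contains the true fraction.

```python
import math
import random

def poll_succeeds(true_fraction, n, rng):
    """Simulate one poll of n voters; True if its interval covers the truth."""
    # Each sampled voter independently favors the candidate
    # with probability true_fraction.
    x = sum(rng.random() < true_fraction for _ in range(n))
    estimate = x / n
    error = 1 / math.sqrt(n)          # rule-of-thumb polling error
    return abs(estimate - true_fraction) <= error

def success_rate(true_fraction, n, polls, seed=1):
    """Fraction of simulated polls whose interval contains true_fraction."""
    rng = random.Random(seed)         # fixed seed so the run is repeatable
    hits = sum(poll_succeeds(true_fraction, n, rng) for _ in range(polls))
    return hits / polls

rate = success_rate(true_fraction=0.54, n=1000, polls=1000)
print(f"{100 * rate:.1f}% of simulated polls were successful")
```

The printed figure should land in the neighborhood of 95%. In a real poll, of course, the true fraction is unknown, so there is no way to tell which kind of poll we have.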
For instance, I have 20 pennies in my pocket; 19 (95%) are U.S., the other is Canadian. I mix up the coins and pull one out of the pocket. I look at it; you do not. You feel the same way about that penny being U.S. as you do about the results of any (media) election poll. That is: the poll either is or is not correct (the penny either is or is not U.S.), but in 95% of all cases it is correct (in 95% of all cases I will have a U.S. coin in my hand).
To test your understanding, here's a little assignment!