Estimation and Margins of Error

One clear lesson of recent history is that the US presidential candidate who wins the national popular vote does not necessarily win the election. Owing to the US Constitutional provision of the Electoral College, it is possible in a very close election that a candidate might narrowly win the popular vote yet lose the election in the Electoral College. (See Wikipedia: Electoral College.) The observations that follow pertain only to the ability of candidate preference polls to forecast the national popular vote in an election.

The main point of these observations is that the results of such polls, especially in a close election, must be taken with a grain of salt. The following table shows the results of polls conducted by three major polling organizations during the week just prior to the US presidential election of 2000. In each case, the percentage of the national popular vote predicted by the poll for each candidate is displayed next to the percentage that was actually observed in the election. The final column shows the difference between the two, calculated as Predicted minus Observed.

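| Polling Organization | Candidate | Percent Predicted by Poll | Percent Observed in Election | Difference |
|---|---|---|---|---|
| Zogby | Gore | 48% | 48.4% | -0.4% |
| Zogby | Bush | 46% | 47.9% | -1.9% |
| Zogby | Other | 6% | 3.7% | +2.3% |
| Harris | Gore | 47% | 48.4% | -1.4% |
| Harris | Bush | 47% | 47.9% | -0.9% |
| Harris | Other | 6% | 3.7% | +2.3% |
| Gallup | Gore | 45% | 48.4% | -3.4% |
| Gallup | Bush | 47% | 47.9% | -0.9% |
| Gallup | Other | 8% | 3.7% | +4.3% |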

The Zogby poll correctly predicted that Mr. Gore would win the popular vote, though its projected 2% margin of victory was much greater than the 0.5% margin that actually occurred. At the other extreme, the Gallup poll predicted that Mr. Bush would win the popular vote by an equally comfortable 2% margin, which would have amounted to a margin of about two million votes, whereas he actually drew about half a million votes fewer than Mr. Gore. In the middle was the Harris poll, which correctly projected that candidates Gore and Bush would each receive about the same percentages of the popular vote, though in both cases it underestimated what these percentages would be. Notice that all three polls substantially overestimated the percentage of the vote that would go to "Other."

So even when these polls are conducted within just a few days of the election, they must be taken with a grain of salt; and the name of that grain is margin of error.

To illustrate this concept against a simple, uncluttered backdrop, let me take you back twenty years to the presidential election of 1988. The candidates of the two major parties were Mr. Bush (père), the Republican, and Mr. Dukakis, the Democrat. A few days prior to the November election, a certain poll of N=1100 likely voters found that 55% of the persons sampled expressed a preference for Mr. Bush, while only 45% leaned toward Mr. Dukakis.

[Figure: Gallup Poll 1988. Bush 55%, Dukakis 45%. Margin of error: ±3 percentage points.]

Given that the election was at this point only a few days away, the pollsters confidently projected that Mr. Bush would win by a popular vote of approximately 55%, with a "margin of error" of plus or minus 3 percentage points.

Unless otherwise indicated, the phrase "margin of error" in this context refers to a 95% level of confidence. Thus, with x% of the respondents in a poll favoring Candidate X and a margin of error of ±3%, the pollsters are 95% confident that the percentage favoring Candidate X within the population falls somewhere within plus or minus 3 percentage points of x%. By the same logic, they are acknowledging a 5% chance that the population percentage might actually be more than 3 percentage points distant from x%, in either direction. Also playing into the process are considerations of timing and of the adequacy of the sample on which the poll is based.

So strictly speaking, the pollsters' projection in the 1988 election was this: If nothing happens between now and then to incline voters differently; and if this sample of N=1100 is a fair representation of the general population of persons who are registered to vote and will actually bother to do so; and if the conventional canons of statistical inference are to be trusted—then we can be 95% confident that the popular vote for Mr. Bush (père) will fall somewhere between
55% + 3% = 58%
and
55% - 3% = 52%
As it turned out, 1988's candidate Bush ended up with 53.4% of the popular vote, which fell comfortably enough within this range of 55% ± 3%. Do note, however, that while a 1.6% difference between the projected and observed percentages was of no great significance in this particular election, it could easily spell the difference between winning and losing in a close election.

The first of these ifs, "if nothing happens between now and then," is a very big one indeed, and its iffy-ness of course increases in proportion to the time remaining between the poll and the actual election. The second if is not unique to political polling. As in all other cases where one is seeking to estimate the properties of a population on the basis of a relatively small sample, the bedrock assumption is that the sample is an unbiased representation of the population. For political polling, this entails that the composition of the sample must faithfully reflect that of the population with respect to gender, age, race, ethnicity, socio-economic level, geographical region, party loyalty, and all other variables that might play a substantial role in determining voter preference. Given the relatively small sample sizes on which political polls are typically based, vis-à-vis the size and diversity of the population, I suspect this assumption is not always as fully satisfied as one might wish. However, I do suppose that professional pollsters undertake to satisfy it as fully as might be practicable, under the circumstances.

At any rate, once you get past this bedrock sampling requirement, all the rest of it comes down to some fairly simple principles that any student of statistics could easily enough grasp after the first few weeks of his or her study of the subject. Suppose there is a population of likely voters that includes x% who favor Candidate X at a particular moment in time. Within any particular sample randomly drawn from that population, the percentage of respondents favoring Candidate X will tend to approximate x%; and, the larger the size of the sample, the closer that approximation will tend to be. The percentage of respondents favoring Candidate X within the sample can therefore be taken as an estimate of the corresponding percentage within the population, with a margin of error inversely related to the size of the sample.

At the intuitive level, the clearest indication of a need for the concept of "margin of error" is found in the fact that different polls, conducted at approximately the same time and presumably drawing from the same population, can end up with rather different results. There are chiefly two kinds of factors that might account for these differences. The first involves a whole nest of questions concerning the polling process itself: Who conducted the poll? Who paid for it? How were the respondents selected? What kinds of questions were asked? And so forth. [In this connection, see Twenty Questions a Journalist Should Ask about Polls.] The second factor is one that pertains to any situation at all, polling or otherwise, in which one is seeking to assess the properties of a vast population on the basis of a relatively small sample. Even if all the polls in question had followed the very same sampling procedures and asked the very same questions in precisely the same way, there would still be discrepancies among their results by virtue of sheer, cusséd random variability.

The statistical concept of "margin of error" has no bearing at all on the first of these factors. A small margin of error is no guarantee that the sample was a fair representation of the population or that the poll was conducted even-handedly. The sole reference of "margin of error" is to the amount of sheer random variability that is intrinsic to the polling process in any particular instance.

To illustrate the concept, I have created inside your computer a vast population of virtual voters. In the first cell of the following table you can set the percentage of the population that favors Candidate X to any value you wish (e.g., 45, 53, 62); and in the second cell you can set the size of the samples to any value you wish (e.g., 10, 600, 1100). Each time you click the button labeled "Samples" your computer will draw 20 random samples from the population, each of size N, and display the percentages of respondents within the samples who favor Candidate X. For a hands-on acquaintance with the concepts of "sampling" and "margin of error," you might find it useful to spend a few minutes playing around with different values for population percentage and sample size. The essential observation is simply this: The larger the sample size (N), the more tightly the percentages within the samples will tend to cluster around the stipulated population percentage; hence, the larger the size of any particular sample, the smaller the "margin of error" when the percentage within that sample is taken as an estimate of the percentage within the population.

[Interactive demonstration: enter the percentage within the population who favor Candidate X and the sample size N; the "Samples" button displays the percentages within each of the 20 samples who favor Candidate X.]
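
For readers without access to the interactive version, the sampling demonstration can be approximated with a short script. This is only a sketch under stated assumptions: the function names `draw_sample_percentages` and `spread` are invented here for illustration, each virtual voter is modeled as an independent random draw, and the population is treated as effectively infinite.

```python
import random


def draw_sample_percentages(pct, n, num_samples=20, seed=None):
    """Draw `num_samples` random samples of size `n` from a virtual
    population in which `pct` percent favor Candidate X, and return the
    percentage favoring X within each sample.

    Assumption: each respondent is an independent draw, so the
    population is treated as effectively infinite.
    """
    rng = random.Random(seed)
    p = pct / 100
    results = []
    for _ in range(num_samples):
        # Each simulated respondent favors X with probability p.
        favoring = sum(rng.random() < p for _ in range(n))
        results.append(100 * favoring / n)
    return results


def spread(values):
    """Distance between the largest and smallest sample percentage."""
    return max(values) - min(values)


if __name__ == "__main__":
    # Same population percentage, three different sample sizes.
    for n in (10, 600, 1100):
        samples = draw_sample_percentages(pct=53, n=n, seed=1)
        listing = " ".join(f"{s:.1f}" for s in samples)
        print(f"N={n:5d}  spread={spread(samples):5.1f}  {listing}")
```

Running it with several sample sizes should show the same pattern described above: the sample percentages bunch more tightly around the population percentage as N grows.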







If you were to draw a very large number of such samples and keep track of all the individual sample percentages, you would find them assembling into a very regular and highly predictable pattern. In most real-life professional polling situations, where the population percentage is likely to fall somewhere between 30 and 70, and where the sample size is likely to be at least several hundred, this pattern becomes a very close approximation of the normal distribution, as depicted below, with a mean equal to pct, the percentage of voters within the population favorable to Candidate X, and a standard deviation equal to


$$\text{SD} = 100 \times \sqrt{\frac{p(1-p)}{N}}$$

where p = pct/100 and N = sample size.

As in any normal distribution, 95% of all cases will cluster within 1.96 standard deviations on either side of the mean. Thus, with a population percentage of pct=50 and samples of size N=1000, the standard deviation of sample percentages would be ±1.58, entailing that 95% of all sample percentages would fall within the range bounded by
50 - (1.58 × 1.96) = 46.9 at the lower end
and
50 + (1.58 × 1.96) = 53.1 at the upper.

The following mini-calculator will work out these numerical details for any value of pct between 30 and 70, and for any value of N>100. Enter the values of pct and N into the bottom two cells, then click the "Calculate" button. [In the graph, "SD" is an abbreviation for "standard deviation."]







[Graph: normal distribution of sample percentages, marking the lower limit, the mean of the distribution, and the upper limit. Inputs: percentage in population; sample size.]
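
The arithmetic performed by the mini-calculator can be sketched in a few lines. The sketch assumes the usual standard deviation of a sample percentage, 100 × sqrt(p(1−p)/N) with p = pct/100, which reproduces the 1.58 figure worked out in the text; the function name `sampling_limits` is made up for this illustration.

```python
import math


def sampling_limits(pct, n, z=1.96):
    """Return (sd, lower, upper) for the distribution of sample
    percentages, given a population percentage `pct` and sample size `n`.

    Assumption: the normal approximation holds (roughly, pct between
    30 and 70 and n above 100, as the text stipulates).
    """
    p = pct / 100
    sd = 100 * math.sqrt(p * (1 - p) / n)  # standard deviation, in percentage points
    return sd, pct - z * sd, pct + z * sd


if __name__ == "__main__":
    sd, lower, upper = sampling_limits(pct=50, n=1000)
    print(f"SD = {sd:.2f}")        # SD = 1.58
    print(f"lower = {lower:.1f}")  # lower = 46.9
    print(f"upper = {upper:.1f}")  # upper = 53.1
```

The default `z=1.96` corresponds to the 95% level used throughout; passing a different value would give limits for a different level of confidence.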



The same logic, approached from the opposite direction, applies to those real-life situations where we do not know the population percentage in advance. If the percentage favoring Candidate X within a sample has a 95% chance of falling within a certain distance of the population percentage, then the population percentage, even when it is unknown, also has a 95% chance of falling within that same distance of the percentage found within the sample. Hence

estimated population percentage = observed percentage in sample ± margin of error
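
As a rough check on this relationship, the sketch below works out the margin of error for the 1988 poll described earlier (55% for Mr. Bush in a sample of N=1100). It assumes the observed sample percentage can stand in for the unknown population percentage when computing the standard deviation, which is a common approximation and not necessarily the exact procedure the pollsters used; the names `margin_of_error` and `estimate_population` are hypothetical.

```python
import math


def margin_of_error(sample_pct, n, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a
    percentage observed in a sample of size `n`.

    Assumption: the sample percentage is substituted for the unknown
    population percentage in the standard deviation formula.
    """
    p = sample_pct / 100
    return z * 100 * math.sqrt(p * (1 - p) / n)


def estimate_population(sample_pct, n):
    """Return the (lower, upper) range: observed percentage ± margin of error."""
    moe = margin_of_error(sample_pct, n)
    return sample_pct - moe, sample_pct + moe


if __name__ == "__main__":
    moe = margin_of_error(55, 1100)
    lower, upper = estimate_population(55, 1100)
    print(f"margin of error = ±{moe:.2f}")  # ±2.94, i.e. about 3 points
    print(f"estimated range = {lower:.1f}% to {upper:.1f}%")
```

Under these assumptions the result rounds to the ±3 percentage points quoted for that poll, and the 53.4% actually observed in the election falls inside the estimated range.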


The Difference Between x% and y%

The obvious consequence of this construction is that any estimated population percentage is actually a range whose width is twice the size of the margin of error. Somewhat less obvious is the implication this has when comparing the estimated percentages for two candidates.

Suppose, for example, that a poll of size N=1000 shows 49% of the respondents favoring Candidate X and 46% favoring Candidate Y, with 5% going to "other" or "undecided." Notwithstanding the delight or despair that these percentages might evoke in the camps of X and Y, they do not constitute unequivocal evidence that Candidate X is "running ahead" of Candidate Y. Using one of the calculators given on the «Calculators» page, you can determine that each of these sample percentages, when taken as an estimate of the corresponding percentage within the population, has a margin of error of ±3.1%. The following graph illustrates the respective 95% confidence intervals for the X and Y estimates, along with the substantial degree to which they overlap.



Given this overlap between the estimates, it is entirely possible that X and Y are actually running "neck and neck" within the general population, or even that Y is actually "running ahead" of X!
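
The overlap just described can be checked numerically. The following is a minimal sketch, again assuming the approximation in which each sample percentage is plugged into 100 × sqrt(p(1−p)/N); `confidence_interval` and `interval_overlap` are names chosen here for illustration and do not refer to the calculators on the «Calculators» page.

```python
import math


def confidence_interval(sample_pct, n, z=1.96):
    """Approximate 95% confidence interval (lower, upper) for a sample
    percentage, using the normal approximation (an assumption)."""
    p = sample_pct / 100
    moe = z * 100 * math.sqrt(p * (1 - p) / n)
    return sample_pct - moe, sample_pct + moe


def interval_overlap(a, b):
    """Return the (lower, upper) region shared by intervals `a` and `b`,
    or None if they do not overlap."""
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    return (lower, upper) if lower < upper else None


if __name__ == "__main__":
    n = 1000
    x_interval = confidence_interval(49, n)  # Candidate X
    y_interval = confidence_interval(46, n)  # Candidate Y
    shared = interval_overlap(x_interval, y_interval)
    print(f"X: {x_interval[0]:.1f}% to {x_interval[1]:.1f}%")
    print(f"Y: {y_interval[0]:.1f}% to {y_interval[1]:.1f}%")
    if shared:
        print(f"overlap: {shared[0]:.1f}% to {shared[1]:.1f}%")
```

With N=1000, both margins come out at roughly ±3.1 points, so the two ranges share a band of about three percentage points.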

Here again I will illustrate the point with a hands-on demonstration. That vast population of virtual voters inside your computer has now been redesigned so that the preferences for Candidates X and Y are exactly 50% each. Each time you click the button labeled "Samples," your computer will draw 10 random samples from the population, each of size N, and display the percentages of respondents within the samples favoring X and Y, respectively. You can set the sample size to any value you wish (e.g., 10, 650, 1100). As you click out your samples with various values of N, note that the percentages for X and Y within individual samples will rarely come out at exactly 50% each. In about half the cases, X will be greater than Y; in the other half, Y will be greater than X; and sometimes the difference between the two, by the merest chance, will be fairly large. In general, the smaller the sample size, the greater the mere-chance differences between X and Y are likely to be.

[Interactive demonstration: preferences within the population are fixed at 50% for X and 50% for Y; for a chosen sample size N, the "Samples" button displays the percentages within each of the 10 samples who favor X (left) and Y (right).]
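
A rough stand-in for this second demonstration is sketched below. It assumes every virtual voter prefers either X or Y with equal probability, so the Y percentage is simply 100 minus the X percentage; the function names `draw_xy_samples` and `largest_gap` are invented for this example.

```python
import random


def draw_xy_samples(n, num_samples=10, seed=None):
    """Draw `num_samples` samples of size `n` from a population split
    exactly 50/50 between X and Y; return a list of (x_pct, y_pct).

    Assumption: every respondent prefers one of the two candidates.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_samples):
        x_count = sum(rng.random() < 0.5 for _ in range(n))
        x_pct = 100 * x_count / n
        pairs.append((x_pct, 100 - x_pct))
    return pairs


def largest_gap(pairs):
    """Largest mere-chance difference between X and Y across the samples."""
    return max(abs(x - y) for x, y in pairs)


if __name__ == "__main__":
    for n in (10, 650, 1100):
        pairs = draw_xy_samples(n, seed=3)
        print(f"N={n}: largest X-Y gap = {largest_gap(pairs):.1f} points")
        for x, y in pairs:
            print(f"   X {x:5.1f}%   Y {y:5.1f}%")
```

Even though the population is split exactly evenly, individual samples will usually show one candidate ahead, and the gaps tend to be larger when N is small.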





Based on these and other relevant statistical principles, the calculators provided on the «Calculators» page will perform various assessments of the results of such political polls as you might encounter during the current election season.



©Richard Lowry 2008
All rights reserved.





Standard Deviation
For most purposes of statistical inference, the two main properties of a distribution are its central tendency and variability. Central tendency refers to the tendency of the individual measures in a distribution to cluster together toward some point of aggregation, while variability describes the contrary tendency for the individual measures to disperse or spread out away from each other. The most generally useful measure of central tendency is the arithmétic mean. For variability it is either the variance or the standard deviation, depending on the context. (Variance and standard deviation are related to one another as square and square root.) If you have only just begun the study of statistics, you can think of the standard deviation as a measure of the average degree to which the individual measures within a distribution differ from their collective mean. This is not precisely what it is, though it will do for the moment. A fuller description of these matters can be found in Chapters 1 and 2 of Concepts and Applications . . ..
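
To make the relation between these measures concrete, here is a small sketch using Python's standard `statistics` module. The eight values are made up purely for illustration and do not come from any poll. The sketch also computes the plain average distance from the mean, to show why that description is only an approximation of the standard deviation.

```python
import math
import statistics

# Illustrative, made-up measures (not data from any poll).
measures = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(measures)           # central tendency
variance = statistics.pvariance(measures)  # population variance
sd = statistics.pstdev(measures)           # population standard deviation

# Average absolute distance from the mean: the informal description
# of the standard deviation, which is close to it but not identical.
avg_abs_dev = sum(abs(m - mean) for m in measures) / len(measures)

if __name__ == "__main__":
    print(f"mean = {mean}")
    print(f"variance = {variance}")
    print(f"standard deviation = {sd}")
    print(f"average absolute deviation = {avg_abs_dev}")
    # Variance and standard deviation are related as square and square root.
    print(math.isclose(sd ** 2, variance))
```

For these values the standard deviation is 2 while the average absolute deviation is 1.5, so the two are near each other but not the same quantity.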

Normal Distribution
The normal distribution is an abstract mathematical structure that first arose in the eighteenth century in connection with the attempt to specify the probabilities, or odds, that are involved in certain games of chance. At first it was purely theoretical and of no particular interest to anyone apart from gamblers and mathematicians. But with the passage of time it became increasingly clear that the general shape of this theoretical abstraction is closely approximated by the distributions of a very large number of real-world empirical variables. The utility of it is that, once you know a distribution to be normal, or at least a close approximation of the normal, you are then in a position to specify the mere-chance probability associated with any particular point in the distribution.