The statistical distinction between populations and samples is basically the distinction between all and some. If you were to observe all instances of some particular category of items or events, your observations would be of an entire population. If you were to observe only a limited subset of the instances of that category, and take those observations as representative of the entire set of instances, your observations would constitute a sample.
Here, for example, is a list of the scores achieved by 12 students on the first exam in a statistics course, arranged in order from lowest to highest.
61, 69, 72, 76, 78, 83, 85, 85, 86, 88, 93, 97
If this list came from a class that had exactly 12 students in it, and if we were interested in nothing other than how this particular set of 12 students performed on this particular exam, then what we would be dealing with is the entire population of the instances in which we happen to be interested.
Suppose, however, that the class actually has 60 students in it, and that the instructor, after grading the first 12 exams, takes a moment to examine the distribution of this limited subset of scores in order to get a sense of how the 60 students in general might have performed on the exam. In this case, the exam scores of all 60 students (even though most of the exams have not yet been graded) would constitute the population, and the 12 exam scores of the subset would constitute a sample that the instructor takes to be representative of this population.
Alternatively, suppose that the instructor, who plans to give slightly modified versions of this particular exam to other classes over the next several years, examines the distribution of all 60 exam scores in the current class to get a sense of how students in general will do on the exam in subsequent years. In this case the 60 exam scores are not a population, but rather a sample taken to be representative of a considerably larger population, namely, the as-yet nonexistent scores of an as-yet undetermined number of students who will be taking the exam in subsequent editions of the course.
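To make the point concrete, here is a minimal Python sketch that summarizes the 12 scores listed above. The function name summarize and the variable names are illustrative choices, not part of the original text; the class size of 60 is the one assumed in the second scenario. Note that the arithmetic is the same in every scenario. What changes is only whether the figures are read as describing the whole population of interest or as an estimate of a larger one.

```python
from statistics import mean, median

# The 12 graded exam scores listed in the text.
scores = [61, 69, 72, 76, 78, 83, 85, 85, 86, 88, 93, 97]


def summarize(values):
    """Return basic descriptive figures for a list of scores (illustrative helper)."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
        "median": median(values),
    }


summary = summarize(scores)

# Second scenario from the text: a class of 60, of which 12 exams are graded.
class_size = 60
fraction_observed = len(scores) / class_size

print(summary)
print(f"Scores observed: {fraction_observed:.0%} of the class")
```

If the class has exactly 12 students, the summary describes the population itself. If the class has 60, the same summary describes a sample covering one fifth of the population, and it is only as trustworthy as the assumption that those 12 exams are representative of the rest.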
In brief: a population includes all instances of the particular category of items or events in which you happen to be interested, irrespective of whether all these instances have actually been observed; while a sample includes only a limited subset of the population, selected in such a way as to ensure, as far as possible, that it is representative of the population as a whole. And while we are at it, please note that this technical statistical sense of 'population' is not limited to populations of people. You can also have populations of cats, rats, mice, lice, jonquils, junipers, paramecia, erythrocytes, galaxies, water molecules, hydrogen atoms, and electrons. Indeed, any category of empirical fact at all can be thought of as constituting a population, provided that the reference is to all instances of it. Thus, you could speak not only of the population of tree squirrels in a locality, but also of the population of tree squirrel nests, or of the populations of heights or sizes of tree squirrel nests.