degrees of freedom
sampling distributions of chi-square
logic and computational procedures
Degrees of Freedom
In working to understand the meaning
of this concept, your first step should be to disentangle it entirely
from most of the meanings that you normally associate with the
word "freedom," since in this context it has nothing
at all to do with freedom of the will, freedom of conscience,
or anything of the sort. Degrees of freedom, df, is simply
an index of the amount of random variability, mere chance coincidence,
that can be present in a particular situation. Its closest literal
translation would be something along the lines of "degrees
of arbitrariness."
Here is a working definition of the concept
as it occurs within the context of chi-square procedures. Suppose
you have two cells such as the ones shown below, and you are free
to plug any integer numbers that you want into them, subject only
to the stipulation that the sum of the two numbers must be equal
to a certain specified quantity. For purposes of illustration
we will set the sum at 20, though it could actually be any positive
integer value.
It will be obvious at a glance that your "freedom"
in this case is limited by the fixed sum of 20. If you start by
plugging the integer 8 into cell a, the number that goes
into b is then inescapably fixed as 20 - 8 = 12. Start
by plugging 16 into cell b and the number that goes into
a is then rigidly fixed as 20 - 16 = 4.
So in this two-cell situation, only one of the cells is "free"
to vary arbitrarily--which is to say, there is only one degree
of freedom. If you have three cells subject to a fixed sum of
20, you can plug numbers arbitrarily into any
two of the cells, but once those cells are plugged the value of
the third is rigidly fixed. Thus, plug 6 into any one cell and
8 into either of the remaining two, and the value of the third
is then fixed as 20 - (6+8) = 6. So here,
with three cells, your degrees of freedom would be equal to two.
The same logic then extends to cases where the number of cells
is four, five, six, and so on. When applying chi-square procedures
to situations in which there is only one dimension of categorization,
the general principle for determining degrees of freedom is
df = (number of cells) - 1
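The fixed-sum constraint can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not part of any chi-square procedure; the function name complete_cells is hypothetical.

```python
def complete_cells(free_values, total):
    """Return all cell values, given freely chosen values for all but one cell.

    The last cell is not free: it must hold whatever remains of the fixed sum.
    """
    last = total - sum(free_values)
    return list(free_values) + [last]


# Two cells, fixed sum of 20: choosing 8 for one cell forces 12 into the other.
two_cells = complete_cells([8], 20)
print(two_cells)          # [8, 12]

# Three cells, fixed sum of 20: choosing 6 and 8 forces 6 into the third.
three_cells = complete_cells([6, 8], 20)
print(three_cells)        # [6, 8, 6]

# Degrees of freedom = number of cells that could be chosen freely.
print(len(three_cells) - 1)   # 2
```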
With two dimensions of classification, it
is a different formula but the same basic logic. Suppose you have
two rows and two columns of cells, as shown below, and you are
free to plug any integer numbers that you want into them, subject
only to the stipulation that the cell values summed across rows,
down columns, and overall, must add up to the fixed row, column,
and overall sums shown in bold-face type.
| a | b | 50 |
| c | d | 40 |
| 48 | 42 | 90 |
Here again your "freedom" is limited
by the fixed sums. Arbitrarily plug 10 into cell a and
the remaining three cells are instantly fixed at b=40,
c=38, and d=2. Plugging 20 into cell d fixes
the other cells at b=22, c=20, and a=28;
and so on. For chi-square situations involving two rows and two
columns of cells, degrees of freedom will in every instance be
equal to one.
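Here is a minimal sketch of why a single cell settles the whole two-by-two table. It assumes row totals of 50 and 40 and column totals of 48 and 42; the column totals are inferred from the cell values quoted above (10 + 38 and 40 + 2). The function name complete_2x2 is hypothetical.

```python
def complete_2x2(a, row_totals, col_totals):
    """Fill in a 2x2 table from cell a and the fixed marginal totals."""
    b = row_totals[0] - a   # first row must add up to its total
    c = col_totals[0] - a   # first column must add up to its total
    d = row_totals[1] - c   # second row must add up to its total
    return a, b, c, d


rows, cols = (50, 40), (48, 42)   # column totals inferred from the text

print(complete_2x2(10, rows, cols))   # (10, 40, 38, 2)
print(complete_2x2(28, rows, cols))   # (28, 22, 20, 20)
```

Starting from d = 20 instead of a works the same way in reverse: c = 40 - 20 = 20, and then a = 48 - 20 = 28.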
If you have two rows and three columns, your "freedom"
is increased a bit, though still limited by the fixed sums.
| 50 | ||||
| 40 | ||||
| 90 |
If you plug 10 arbitrarily into cell a,
cell b becomes fixed at 15, but the other four cells remain
"free" to vary. Plug some other number into any of these
remaining four cells, however, and everything else becomes instantly
fixed. For chi-square situations involving two rows and three
columns (or three rows and two columns), degrees of freedom is
in every case equal to two.
More generally, when applying chi-square procedures to situations
in which there are two dimensions of categorization, the general
principle for determining degrees of freedom is
df = (number of rows - 1) x (number of columns - 1)
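The two rules can be written as a pair of small helper functions. The names df_one_way and df_two_way are hypothetical, chosen only for this sketch.

```python
def df_one_way(n_cells):
    """Degrees of freedom with one dimension of categorization."""
    return n_cells - 1


def df_two_way(n_rows, n_cols):
    """Degrees of freedom with two dimensions of categorization."""
    return (n_rows - 1) * (n_cols - 1)


print(df_one_way(3))      # 2  (three cells, fixed sum)
print(df_two_way(2, 2))   # 1
print(df_two_way(2, 3))   # 2
print(df_two_way(3, 2))   # 2
```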
For more on one-dimensional and two-dimensional chi-square tests,
go to logic and computational procedures
Sampling Distributions of Chi-Square
The first of the following graphs
shows the general outlines of the sampling distributions of chi-square
for df = 2, 3, and 4, while the second shows the outlines
of the df = 2 distribution in closer detail. You will find
it useful to compare each of these graphs with the entries that
appear in the table of critical values of chi-square.


For more on sampling distributions of chi-square go to logic and computational procedures
Logic and Computational Procedures
As indicated in class, the logic
of chi-square (pronounced kai to rhyme with sky)
flows directly from the logic of binomial probabilities. Binomial
procedures apply to situations where there are exactly two mutually
exclusive categories into which observations might fall--female/male,
head/tail, recovery/non-recovery, and so on. Chi-square extends
the logic of binomial procedures to cover situations where there
are more than two categories of possible outcome; for example,
students categorized according to academic class as freshman/sophomore/junior/senior
or patients with a certain disease categorized according to whether
their condition, following an experimental treatment, is improved,
worsened, or unchanged. Chi-square procedures also extend this
logic to cover situations where there is more than one dimension
of classification; for example, students categorized according
to whether they describe themselves as "conservative"
or "liberal," as well as by their academic class as
freshman, sophomore, junior, or senior. When the observed items--students
or whatever else they might be--are categorized in this fashion
according to two or more separate dimensions of classification
concurrently, they are said to be cross-categorized. In
both kinds of cases, the chi-square test is used to determine
whether an observed pattern of frequencies significantly
differs from the pattern of frequencies that would be expected
if nothing other than random variability (a.k.a. mere chance coincidence,
sampling error) were operating in the situation.
[As my HTML formatter does not support mathematical notation mixed
in with text, I will be representing a value of chi-square here
as chi2.]
Chi-Square with One Dimension of Classification
Suppose that a questionnaire administered
to a large national sample of college students included an item
aimed at measuring conservatism versus liberalism on a certain
issue of social/political relevance. The item took the form of
a statement, and the response categories were "strongly disagree,"
"moderately disagree," "undecided," "moderately
agree," and "strongly agree." I will leave it to
your imagination to fill in the blanks concerning what the statement
was and which end of the response scale was taken to reflect "conservative"
or "liberal" attitudes. Suffice it to say that in the
national sample 9.4 percent of the respondents expressed strong
disagreement, 15.6 percent moderately disagreed, 34.3 percent
were undecided, 27.5 percent expressed moderate agreement, and
13.2 percent strongly agreed. Professor H, upon reading the
results of this survey, suspects that the students at her particular
college are considerably more polarized into "conservative"
and "liberal" camps, in comparison with the more general
population of students studied in the national sample. To test
this hypothesis, she administers the same question to a random
sample of 204 students at her college and then compares the pattern
of responses to the pattern of the national sample.
The null hypothesis in this situation is that
the responses of Professor H's 204 respondents should not
differ significantly from the 9.4%/15.6%/34.3%/27.5%/13.2% pattern
of the national survey. Thus, the MCE expected frequency
of response in the "strongly disagree" category would
be .094 x 204 = 19.18; for the "moderately disagree"
category, .156 x 204 = 31.82; and so on. What Professor H
actually found, however, was something that seemed to fit rather
well with her suspicion concerning polarization. All that remained
was to determine whether the difference between the two patterns
was significant. Here is an overview of the observed counts and
percentages for the five response categories, each in comparison
with the corresponding MCE expected values, along with
the steps required for calculating chi2.
[Table: observed versus MCE expected frequencies for the five response categories (total = 204 in each row), with the steps for calculating chi2]
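The MCE expected frequencies follow directly from the national percentages quoted above. The sketch below computes all five; the variable names are my own, and only the expected values are computed here.

```python
# Proportions from the national sample, in the order given in the text.
national = {
    "strongly disagree": 0.094,
    "moderately disagree": 0.156,
    "undecided": 0.343,
    "moderately agree": 0.275,
    "strongly agree": 0.132,
}
n = 204  # size of Professor H's sample

# Expected frequency = national proportion x sample size.
expected = {category: p * n for category, p in national.items()}

for category, e in expected.items():
    print(f"{category:20s} {e:6.2f}")
# strongly disagree     19.18
# moderately disagree   31.82
# undecided             69.97
# moderately agree      56.10
# strongly agree        26.93
```

Note that the expected frequencies add up to 204, just as the observed frequencies must.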

The following graph shows the sampling distribution
of chi-square for the case of df=4. As indicated in the
graph, our calculated chi2 value of 12.34 is
significant not only at and beyond the minimal .05 level, but
also beyond the level for .02. Hence P < .02. In brief,
we can be a shade more than 98 percent confident that the difference
between the observed and MCE expected patterns of frequencies
does not result from mere random variability.
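If you would rather compute the probability than read it off a graph or table, here is a minimal sketch that uses only the Python standard library. It relies on the standard recurrence for the upper tail of the chi-square distribution; the function name chi2_sf is hypothetical, and this is an illustration rather than a replacement for a tested statistics library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square >= x) for integer df >= 1.

    Starts from the closed forms for df = 1 (erfc) and df = 2 (exp),
    then steps up two degrees of freedom at a time.
    """
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # df = 1
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


p = chi2_sf(12.34, 4)
print(round(p, 3))        # about 0.015
print(0.01 < p < 0.02)    # True: beyond the .02 level, short of .01
```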

Do keep in mind, however, that the chi-square test we have just
performed is intrinsically non-directional. In and of itself,
the significant chi-square value says nothing at all about the
particular texture or direction of the difference. Examine the
above array of data in detail, however, and you will see that
the texture of the difference is consistent with Professor H's
hypothesis concerning greater conservative-liberal polarization
among the students at her college. In particular, there were fewer
respondents in the "undecided" category than would have
been expected on the null hypothesis, and more in the "strongly
agree" and "strongly disagree" categories. These
three cells, in fact, accounted for all but a small fraction of
the calculated chi-square value of 12.34.
Chi-Square with Two Dimensions of Classification
This is one of the examples given
in class, based on data reported in the 17 August 1996 issue of
the British medical journal Lancet. Researchers at Columbia-Presbyterian
Medical Center (NYC) sorted the subjects in a sample of 1,124
elderly women according to two dimensions of classification: (1)
Was the subject receiving estrogen-replacement therapy (ERT) at
any time during the preceding ten years? [yes/no] and (2) Did
the subject develop clinically diagnosable indications of Alzheimer's
disease at any time during the preceding five years? [yes/no]
[Note: none of the subjects showed signs of Alzheimer's at the
beginning of the five-year period.] Their research hypothesis
(H1) was that ERT might be of some benefit in preventing
or postponing the onset of Alzheimer's symptoms; hence, that the
subjects receiving ERT should show a smaller percentage of Alzheimer's
onset during the five-year period in comparison with the subjects
who did not receive ERT. The null hypothesis (H0) was
that the percentages of Alzheimer's onset within the two groups
should be the same, within the limits of random variability. And
here are the data as reported, cross-categorized according to
the two dimensions of classification. You will see that the observed
results are consistent with H1. All that remains is
to determine whether the observed difference--5.77% among the
ERT subjects versus 16.01% among the non-ERT subjects--is significant.
[Table: observed frequencies cross-categorized by ERT (yes/no) and Alzheimer's onset (yes/no), with row and column totals]
The procedures for applying
chi-square to a two-dimensional situation of this sort are the
same as we saw for the one-dimensional situation, except for two
small modifications. The first has to do with what we take to
be the MCE expected cell frequencies, and the second pertains
to how we determine
the appropriate value for degrees of freedom. In both of these
modifications the underlying logic is the same; it is only the
details that differ. In this section we will consider only the
determination of the expected cell frequencies; for a separate
discussion of degrees of freedom for a chi-square test, see the
section on degrees of freedom.
In the one-dimensional chi-square test the values of E
are simply stipulated in advance. In the student survey example
they were set to match the proportions of responses in the various
categories that had been found in the large national survey. In
the two-dimensional situation, on the other hand, the expected
cell frequencies are not given in advance, nor are they intuitively
obvious. I will illustrate the logic of the point with a simple
if somewhat fanciful example. Two friends, A and B, believe their
friendship to be so deep as to produce a remarkable pattern of
correspondences. Often they find that they are thinking the same
thing at the same time. Often, even when separated, they find
that they are doing the same thing at the same time. To put their
faith to the test, they each toss a coin 100 times in succession,
recording on each occasion the head/tail outcome of A's toss and
the corresponding head/tail outcome of B's toss. Their hypothesis
is that when A gets a head, B will also tend to get a head; and
that when A gets a tail, B will also tend to get a tail. They
of course do not suppose that even their relationship is so deep
as to ensure these correspondences in 100 percent of the tosses,
though they do believe that the pattern of such correspondences
will significantly exceed what would be expected on the basis
of mere chance coincidence.
For the sake of discussion, suppose that A
and B each end up with exactly 50 heads and 50 tails. In that
case the contingency table would have the following marginal totals,
irrespective of how much or little the heads and tails outcomes
of A and B might correspond.
| a | b | 50 (total heads for A) |
| c | d | 50 (total tails for A) |
| 50 | 50 | 100 (total number of paired tosses) |
Actually, this is one of the rare scenarios
for which the values of E would be fairly intuitively
obvious. If you were asked to guess the values of the MCE
expected frequencies for cells a, b, c, and
d, I expect you would answer 25/25/25/25--and this would
be quite correct. Now, if only you can make explicit the hidden
logic that leads you to this answer, you will have the procedure
for figuring out the values of E for two-dimensional chi-square
situations in general. I suspect the core of your implicit logic
runs something like this: If A gets 50 percent heads and
B gets 50 percent tails; and if nothing other than mere
chance coincidence is operating in the situation; then
the (conjunctive) probability that any particular one of the 100
paired tosses will include a head for A and a tail for
B is .5 x .5 = .25. Thus, the expected frequency for cell a
is 25 percent of the total number of paired tosses: Ea
= .25 x 100 = 25. The same logic would also lead you to Eb
= 25, Ec = 25, and Ed = 25.
Now see how the same logic can be extended
to situations that are not intuitively obvious. When A and B perform
their series of 100 paired tosses, it is actually not very likely
that they would both end up with exactly 50 percent heads
and 50 percent tails. It is much more likely that they each would
end up with something slightly different from an exact 50/50 split.
Suppose that A comes out with 46 heads and 54 tails, while B ends
up with 48 heads and 52 tails. In this case the marginal totals
would be distributed as follows
| a | b | 46 |
| c | d | 54 |
| 52 | 48 | 100 |
and the logic would run like this. It is the
same logic, word for word, as outlined above, except for the different
numerical values that get plugged into it: If A gets 46
percent heads and B gets 52 percent tails; and if nothing
other than mere chance coincidence is operating in the situation;
then the (conjunctive) probability that any particular
one of the 100 paired tosses will include a head for A and
a tail for B is .46 x .52 = .2392. Thus, the expected frequency
for cell a is 23.92 percent of the total number of paired
tosses: Ea = .2392 x 100 = 23.92. By the same
reasoning you would also arrive at Eb = 22.08,
Ec = 28.08, and Ed = 25.92.
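The same reasoning can be sketched in Python. The pairing of cells b, c, and d with particular head/tail combinations is inferred from the expected values quoted above; the function and variable names are hypothetical.

```python
n = 100                      # total number of paired tosses
a_heads, a_tails = 46, 54    # A's marginal totals
b_heads, b_tails = 48, 52    # B's marginal totals


def expected(count_a, count_b, n):
    """Expected frequency if nothing but chance links A's and B's outcomes."""
    p_joint = (count_a / n) * (count_b / n)   # conjunctive probability
    return p_joint * n


E_a = expected(a_heads, b_tails, n)   # head for A, tail for B
E_b = expected(a_heads, b_heads, n)   # head for A, head for B
E_c = expected(a_tails, b_tails, n)   # tail for A, tail for B
E_d = expected(a_tails, b_heads, n)   # tail for A, head for B

print([round(e, 2) for e in (E_a, E_b, E_c, E_d)])
# [23.92, 22.08, 28.08, 25.92]
```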
A more streamlined formulaic way of arriving
at the expected cell frequencies is simply this: For each cell,
multiply the marginal total for the row to which the cell belongs
by the marginal total for the column to which the cell belongs,
and then divide the result by the total number of cross-categorized
observations. That is, for any particular cell
Ecell = (R x C) / N
Where
R = the marginal total for the row to which the cell belongs
C = the marginal total for the column to which the cell belongs
N = the total number of cross-categorized observations
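To see that this formula is nothing more than the conjunctive-probability logic in compact form, it may help to write the step out. The following is a short derivation in LaTeX notation (it assumes the amsmath package for \text), using the same R, C, and N as defined above and the values for cell a of the coin-toss example.

```latex
E_{\text{cell}}
  = \underbrace{\frac{R}{N}}_{\text{row proportion}}
    \times
    \underbrace{\frac{C}{N}}_{\text{column proportion}}
    \times N
  = \frac{R \times C}{N},
\qquad \text{e.g.} \quad
E_a = \frac{46}{100} \times \frac{52}{100} \times 100
    = \frac{46 \times 52}{100}
    = 23.92 .
```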
The following illustration shows how this simple calculation works
out for each of the cells of the present example.
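| a: (46 x 52) / 100 = 23.92 | b: (46 x 48) / 100 = 22.08 | 46 |
| c: (54 x 52) / 100 = 28.08 | d: (54 x 48) / 100 = 25.92 | 54 |
| 52 | 48 | 100 |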
And here is the procedure applied to our ERT/Alzheimer's
example. The marginal totals and observed cell frequencies (O)
are the same as shown when we introduced the example; within each
cell we also now include the values of E (in red)
that would be obtained by using the formula
Ecell = (R x C) / N
along with the appropriate values of R, C, and
N.
[Table: the ERT by Alzheimer's-onset table, showing the expected frequency E alongside each observed frequency O, with row and column totals]
The calculation for chi2 is then
as follows:

The following graph shows the sampling distribution of chi-square
for the case of df=1. As indicated in the graph, our calculated
chi2 value of 11.31 is significant not only
at and beyond the minimal .05 level, but also beyond the levels
for .02, .01, and .001. Hence P < .001.
In brief, we can be very confident indeed
that the difference between the observed and MCE expected
patterns of frequencies does not result from mere random variability.
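For readers who want to retrace the arithmetic, here is a minimal Python sketch of the whole two-dimensional procedure: expected frequencies from E = (R x C) / N, then the standard Pearson sum of (O - E)^2 / E over all cells. Please note that the cell counts used below are an assumption on my part. They are chosen to agree with the figures stated in the text (1,124 subjects, 5.77% versus 16.01% onset, chi2 = 11.31) and should be checked against the published data before being relied upon. The function names are hypothetical.

```python
import math


def expected_frequencies(observed):
    """E = (R x C) / N for every cell of a cross-categorized table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Pearson chi-square: sum over all cells of (O - E)^2 / E."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# ASSUMED counts, consistent with the stated totals and percentages.
# Rows: received ERT yes / no. Columns: Alzheimer's onset yes / no.
observed = [[9, 147],
            [155, 813]]

chi2 = chi_square(observed)
p = math.erfc(math.sqrt(chi2 / 2))   # upper-tail probability for df = 1

print(round(chi2, 2))   # 11.31
print(p < 0.001)        # True
```

The erfc shortcut for the probability applies only when df = 1, which is the case for any table with two rows and two columns.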
