it were larger, the fit would be worse. We still don’t know whether 3.312 is
small enough to consider the fit to be good.
Before proceeding, there is one other issue we must understand about
χ²-statistics. Because χ² involves adding up a number of positive terms, we
would expect its value to be larger whenever there are more terms. This is
captured in the idea of a parameter called the degrees of freedom. Counting
the degrees of freedom can be quite difficult, but a rule of thumb is that there
is one degree of freedom for each class whose size can vary freely. In this
example, if we imagine the size of the first class (the yellow seed phenotype)
varies freely (it could be any number from 0 to 327), then the size of the second
class (the green seed phenotype) is obtained by subtracting the first from the
total 327. This means we have one degree of freedom. More generally, if we
had n classes in a test, then the first n − 1 of them could range freely, but the
last is constrained. This corresponds to n − 1 degrees of freedom. The more
degrees of freedom in a test, the larger you might find the χ²-statistic to be,
because it requires summing more positive numbers. To judge the size of a
particular χ²-statistic, we must take this into account.
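The effect of the number of classes can be seen in a small simulation. The sketch below is illustrative only: the equally likely classes, the sample size of 200, and the function names are assumptions, not part of the example in the text. It simulates many experiments in which the hypothesis is true and averages the resulting χ²-statistics for tests with different numbers of classes.

```python
import random

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def mean_chi_square(n_classes, total=200, trials=1000, seed=0):
    """Average chi-square statistic over simulated experiments in which
    the hypothesis (hypothetical equally likely classes) is true."""
    rng = random.Random(seed)
    expected = [total / n_classes] * n_classes
    running_sum = 0.0
    for _trial in range(trials):
        observed = [0] * n_classes
        for _item in range(total):
            # Each item falls into one of the classes with equal probability.
            observed[rng.randrange(n_classes)] += 1
        running_sum += chi_square(observed, expected)
    return running_sum / trials

for n in (2, 3, 5):
    print(n - 1, "degrees of freedom: mean chi-square =",
          round(mean_chi_square(n), 2))
```

In runs of this kind the average statistic comes out close to n − 1, which is why a value that is large for one degree of freedom may be unremarkable for four.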
For each number of degrees of freedom, statisticians have worked out the
corresponding χ² distribution. Although a formula for the distribution is too
complicated to
give here, information from it is incorporated in tables and in software. This
makes it possible to compute, for a specified number of degrees of freedom,
the probability that the χ²-value lies in any specified range, assuming the
hypothesis holds.
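For the special case of one degree of freedom, the probability can be computed without a table, because a χ² variable with one degree of freedom is the square of a standard normal variable. The sketch below relies on that fact and on Python's standard math.erf; the function names are hypothetical, and it does not cover larger numbers of degrees of freedom, for which a statistics library or a table would be needed.

```python
import math

def chi2_cdf_df1(x):
    """P(chi-square <= x) for ONE degree of freedom, for x >= 0.

    Uses chi-square = Z^2 with Z standard normal, so that
    P(Z^2 <= x) = P(|Z| <= sqrt(x)) = erf(sqrt(x / 2)).
    """
    return math.erf(math.sqrt(x / 2.0))

def prob_in_range(low, high):
    """P(low <= chi-square <= high) for one degree of freedom."""
    return chi2_cdf_df1(high) - chi2_cdf_df1(low)

# Probability that the statistic falls between 1 and 4,
# assuming the hypothesis holds (one degree of freedom).
print(prob_in_range(1.0, 4.0))
```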
Keep in mind that, even when the hypothesis is true, every time we do an
experiment, we will get different data and a different χ²-statistic describing
the fit. Most of these will be small, but some will be large because of chance.
We would like our goodness-of-fit test to be flexible enough to accommodate
this variation. So, to decide whether we consider our value of χ² to be too
large for the data to fit the hypothesis, we pick a significance level, for instance
α = .05. This means we decide to view χ² as too large if, when the hypothesis
is true, the probability of getting a lower value is at least 1 − α = 95%.
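The chance variation described here can be explored by simulation. The following sketch repeats a two-class experiment with 327 seeds many times under a hypothesis that is true, and then finds the value below which about 95% of the simulated χ²-statistics fall. The 3:1 yellow-to-green ratio, the number of trials, and the function names are assumptions made for illustration; they are not stated in this passage.

```python
import random

def simulated_statistics(total=327, p_yellow=0.75, trials=4000, seed=1):
    """Sorted chi-square statistics from repeated two-class experiments
    in which the hypothesis (assumed probability p_yellow) is true."""
    rng = random.Random(seed)
    expected_yellow = total * p_yellow
    expected_green = total * (1 - p_yellow)
    stats = []
    for _trial in range(trials):
        # Count how many of the seeds come out yellow in this experiment.
        yellow = sum(1 for _seed in range(total) if rng.random() < p_yellow)
        green = total - yellow
        stats.append((yellow - expected_yellow) ** 2 / expected_yellow
                     + (green - expected_green) ** 2 / expected_green)
    return sorted(stats)

def empirical_cutoff(sorted_stats, alpha=0.05):
    """Approximate (1 - alpha) quantile of the simulated statistics."""
    index = min(len(sorted_stats) - 1, int((1 - alpha) * len(sorted_stats)))
    return sorted_stats[index]

stats = simulated_statistics()
print("about 95% of simulated statistics fall at or below",
      round(empirical_cutoff(stats), 3))
```

Because the counts are whole numbers, the simulated statistics take only certain values, so the cutoff found this way only approximates the value a table would give.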
If we consult a table, such as the abbreviated Table 6.7 at the end of this
section, we find that the critical value for a χ²-statistic with one degree of
freedom at the .05 level of significance is χ²_critical = 3.841. This means that,
assuming the hypothesis is correct, only 5% of the time would we calculate a
value of χ² that was 3.841 or larger. Thus, if our statistic is larger than 3.841,
we say the data do not support our hypothesis at the .05 level of significance.
However, if our statistic is less than 3.841, we say the data are consistent with
the hypothesis at the .05 level of significance.
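The comparison with the critical value can be written as a short check. This is a minimal sketch for one degree of freedom only: the critical value 3.841 and the statistic 3.312 are taken from the text, while the function names and the use of math.erfc to compute the tail probability are assumptions added for illustration.

```python
import math

CRITICAL_VALUE = 3.841  # one degree of freedom, .05 level (from the table)

def tail_probability_df1(chi2_stat):
    """P(chi-square >= chi2_stat) for ONE degree of freedom,
    assuming the hypothesis is correct."""
    return math.erfc(math.sqrt(chi2_stat / 2.0))

def exceeds_critical_value(chi2_stat, critical_value=CRITICAL_VALUE):
    """True if the statistic is larger than the critical value, that is,
    if the data do not support the hypothesis at the chosen level.
    A statistic exactly equal to the critical value is not treated as
    exceeding it; the text does not address that boundary case."""
    return chi2_stat > critical_value

statistic = 3.312  # the value computed earlier in the text
print("exceeds critical value:", exceeds_critical_value(statistic))
print("tail probability:", round(tail_probability_df1(statistic), 4))
```

The tail probability at 3.841 itself comes out very close to .05, which matches the statement that only 5% of the time would a value that large or larger be calculated when the hypothesis is correct.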