214 GEOMETRIC MORPHOMETRICS FOR BIOLOGISTS
Computation of the F-ratio is complicated by the fact that the variance explained by
the categorical variable cannot be calculated directly from the data but must be computed
indirectly as the difference between the total variance and the variance that is not explained.
If this seems strangely convoluted, look again at Figure 9.2. We can compute the deviations
of each individual from the grand mean and use them to compute the total variance. We
can also compute the deviations of each individual from its respective group mean and
use them to compute the unexplained variance (the variance within the groups cannot be
attributed to the factor responsible for the difference between the groups). Subtracting the
unexplained variance from the total variance leaves the explained variance. For the natural
logs of jaw size, the sum of squared deviations from the grand mean is 0.0207. The sums
of squared deviations from the group means are 0.0055 for males and 0.0128 for females
for a total within-groups sum of squares of 0.0183. The difference between the total and
within-groups sums of squares is 0.0024; this is the between-groups sum of squares that
can be used to compute the variance explained by the categorical variable, sex.
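The partition described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function name `partition_sum_of_squares` and the toy measurements are hypothetical, and the toy values are not the jaw-size data analyzed in the text.

```python
from statistics import fmean


def partition_sum_of_squares(groups):
    """Split the total sum of squares into within- and between-group parts.

    `groups` maps a group label to a list of measurements.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = fmean(all_values)

    # Total: squared deviations of every individual from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    # Within (unexplained): squared deviations of each individual
    # from the mean of its own group.
    ss_within = 0.0
    for values in groups.values():
        group_mean = fmean(values)
        ss_within += sum((x - group_mean) ** 2 for x in values)

    # Between (explained): obtained indirectly, by subtraction.
    ss_between = ss_total - ss_within
    return ss_total, ss_within, ss_between


# Hypothetical toy data, NOT the jaw-size measurements from the text.
toy = {"males": [1.0, 2.0, 3.0], "females": [4.0, 5.0, 6.0]}
ss_total, ss_within, ss_between = partition_sum_of_squares(toy)
print(ss_total, ss_within, ss_between)  # 17.5 4.0 13.5
```

Applied to the log jaw sizes, the same three quantities would correspond to the 0.0207, 0.0183 and 0.0024 reported above.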
Variance is the sum of the squared deviations divided by the number of degrees of freedom.
The total degrees of freedom are N − 1. The number of degrees of freedom attributed
to the categorical variable is G − 1, where G is the number of groups or categories. (There
are fewer degrees of freedom than classes, because an individual that does not belong to the
first G − 1 groups necessarily belongs to the last one.) Subtracting the degrees of freedom
allotted to the explained variance leaves N − G degrees of freedom for the unexplained
variance. Returning to our example, N = 58 and G = 2. The explained variance (due to sexual
dimorphism) is $2.4 \times 10^{-3}/1$, and the unexplained variance is
$0.0183/56 = 0.33 \times 10^{-3}$.
The explained variance divided by the unexplained is 7.3. This F-ratio, with 1 and 56
degrees of freedom, has a p-value of 0.0091, which is identical to the p-value that was
obtained from the t-test. Thus, despite taking different approaches, the F-test and t-test
lead to the same result – the same conclusion regarding the significance of the difference
between the two groups.
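The arithmetic of this example can be checked with a short script. The sketch below takes the sums of squares, N and G from the text and uses G − 1 = 1 and N − G = 56 degrees of freedom. To stay within the standard library it relies on the identity that an F-ratio with 1 numerator degree of freedom is the square of a t statistic, and integrates the t density numerically with Simpson's rule. The helper names `t_density` and `f_test_p_value_one_df` are assumptions made for illustration. Because the inputs are the rounded values reported above, the p-value it prints may differ slightly from 0.0091.

```python
import math


def t_density(t, df):
    """Probability density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def f_test_p_value_one_df(f_ratio, df_within, steps=2000):
    """Upper-tail p-value of an F-ratio with 1 and `df_within` degrees of freedom.

    Uses F(1, df) = t(df)^2, so the p-value equals the two-tailed t-test
    p-value. The area under the t density from 0 to t is found with
    Simpson's rule; `steps` must be even.
    """
    t = math.sqrt(f_ratio)
    h = t / steps
    area = t_density(0.0, df_within) + t_density(t, df_within)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        area += weight * t_density(i * h, df_within)
    area *= h / 3
    return 1.0 - 2.0 * area


# Values reported in the text for the natural logs of jaw size.
ss_between = 0.0024   # explained (between-groups) sum of squares
ss_within = 0.0183    # unexplained (within-groups) sum of squares
n_individuals = 58    # N
n_groups = 2          # G

df_between = n_groups - 1             # G - 1
df_within = n_individuals - n_groups  # N - G

ms_between = ss_between / df_between  # explained variance
ms_within = ss_within / df_within     # unexplained variance
f_ratio = ms_between / ms_within
p_value = f_test_p_value_one_df(f_ratio, df_within)

print(f"F = {f_ratio:.2f} with {df_between} and {df_within} df, p = {p_value:.4f}")
```

The shortcut through the t distribution applies only to two groups; with more groups the numerator has more than one degree of freedom and the F distribution itself is needed.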
It is important to remember that both the t-test and the F-test assume that the variances
within the groups are the same. Furthermore, both tests are asking whether samples as
different as yours could have been drawn from a single population with a specific known
variance. Fortunately, both tests are fairly robust to violation of the assumption of equal
variances.
One simple trait, more than two groups
Because ANOVA compares variation within groups to variation between groups, it can
also be applied to analyses that examine more than two groups. For example, Table 9.2
illustrates an analysis of geographic variation in jaw size (the rows in the lower half of this
table are not in the order usually reported, but in an order that corresponds more closely
to the sequence of calculations). The categorical variable is geographic location, which
has three classes referring to three collecting areas (eastern Michigan, western Michigan,
and southern states). To test whether there is significant geographic variation in jaw size
(to test whether there are significant differences among the three populations), we follow
exactly the same procedure as for two groups. The sum of squared distances of individuals
from the grand mean is the total sum of squares (SSQ), and the sum of squared distances
of the individuals from their class means is the unexplained sum of squares. The difference