                        Policy Favored
Highest                 No Restrictions  Smoking Allowed in     No Smoking  No
Education Level         on Smoking       Designated Areas Only  at All      Opinion  Total
College graduate        5                44                     23          3        75
High-school graduate    15               100                    30          5        150
Grade-school graduate   15               40                     10          10       75
Total                   35               184                    63          18       300
Can one conclude from these data that, in the sampled population, there is a relationship between
level of education and attitude toward smoking in public places? Let α = .05.
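As a rough check on the arithmetic this exercise calls for, the following Python sketch computes the chi-square statistic for the table above. The names `observed` and `chi_square_statistic` are illustrative and do not come from the text. The hard-coded critical value is the tabulated chi-square value for α = .05 with 6 degrees of freedom, so it applies only to a 3 × 4 table.

```python
# Observed frequencies from the exercise table.
# Rows: college, high-school, grade-school graduates.
# Columns: no restrictions, designated areas only, no smoking at all, no opinion.
observed = [
    [5, 44, 23, 3],
    [15, 100, 30, 5],
    [15, 40, 10, 10],
]


def chi_square_statistic(table):
    """Return (X2, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # Expected frequency under the null hypothesis.
            e = row_totals[i] * col_totals[j] / n
            x2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df


x2, df = chi_square_statistic(observed)
CRITICAL_VALUE = 12.592  # assumption: tabulated value for alpha = .05, 6 df only
print(f"X2 = {x2:.3f}, df = {df}, exceeds critical value: {x2 > CRITICAL_VALUE}")
```

The function takes only the cell counts, so the same sketch could be reused on any r × c table, provided the critical value is looked up for the corresponding degrees of freedom.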
12.5 TESTS OF HOMOGENEITY
A characteristic of the examples and exercises presented in the last section is that, in each
case, the total sample was assumed to have been drawn before the entities were classified
according to the two criteria of classification. That is, the observed number of entities
falling into each cell was determined after the sample was drawn. As a result, the row and
column totals are chance quantities not under the control of the investigator. We think of
the sample drawn under these conditions as a single sample drawn from a single population.
On occasion, however, either row or column totals may be under the control of the
investigator; that is, the investigator may specify that independent samples be drawn from
each of several populations. In this case, one set of marginal totals is said to be fixed, while
the other set, corresponding to the criterion of classification applied to the samples, is random.
The former procedure, as we have seen, leads to a chi-square test of independence.
The latter situation leads to a chi-square test of homogeneity. The two situations not only
involve different sampling procedures; they lead to different questions and null hypotheses.
The test of independence is concerned with the question: Are the two criteria of classification
independent? The homogeneity test is concerned with the question: Are the samples
drawn from populations that are homogeneous with respect to some criterion of
classification? In the latter case the null hypothesis states that the samples are drawn from
the same population. Despite these differences in concept and sampling procedure, the two
tests are mathematically identical, as we see when we consider the following example.
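One way to see why the two tests coincide is to write out the quantities that both compute. The notation below is an assumption made for illustration: n_ij is the observed frequency in row i and column j, n_i. and n_.j are the row and column totals, and n is the grand total, for a table with r rows and c columns.

```latex
E_{ij} = \frac{n_{i.}\, n_{.j}}{n},
\qquad
X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - E_{ij})^2}{E_{ij}},
\qquad
\text{d.f.} = (r - 1)(c - 1)
```

Nothing in these expressions records which set of marginal totals was fixed in advance, so the arithmetic is the same under either sampling procedure. Only the null hypothesis and the interpretation of the result differ.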
Calculating Expected Frequencies
Either the row categories or the column
categories may represent the different populations from which the samples are drawn. If,
for example, three populations are sampled, they may be designated as populations 1, 2, and
3, in which case these labels may serve as either row or column headings. If the variable
of interest has three categories, say, A, B, and C, these labels may serve as headings for
rows or columns, whichever is not used for the populations. If we use notation similar to
that adopted for Table 12.4.2, the contingency table for this situation, with columns used to
represent the populations, is shown as Table 12.5.1. Before computing our test statistic we