To test the null hypothesis, we perform the $\chi^2$ goodness-of-fit test. The hypothesized population has a
population proportion of successes equal to 0.85. We want to see if our sample is drawn from a population
that has the same distribution of successes and failures as the hypothesized distribution. We first calculate
the expected sample frequency in each category. If the null hypothesis is true, then, out of 9200 tubes,
7820 are expected to make it to market and 1380 are expected to be defective. We can compare the
observed and expected frequencies using Equation (4.33) to generate the $\chi^2$ test statistic:
\[
\chi^2 = \frac{(8420 - 7820)^2}{7820} + \frac{(780 - 1380)^2}{1380} = 306.9.
\]
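The expected frequencies and the test statistic above can be reproduced with a few lines of MATLAB. This is a minimal sketch; the variable names (n, p0, obs, expd, chi2stat) are illustrative choices and do not come from the text.

```matlab
% Sketch: chi-square goodness-of-fit statistic for the tube example.
% Variable names are illustrative assumptions.
n    = 9200;              % total number of tubes
p0   = 0.85;              % hypothesized proportion of successes
obs  = [8420, 780];       % observed frequencies: [success, failure]
expd = n * [p0, 1 - p0];  % expected frequencies under H0: [7820, 1380]

% Sum of (observed - expected)^2 / expected over both categories
chi2stat = sum((obs - expd).^2 ./ expd);
fprintf('chi-square statistic = %.1f\n', chi2stat);
```

The printed value should agree with the hand calculation of 306.9.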
There are two categories – success and failure, i.e. k = 2. However, the observed and expected
frequencies of scrapped (failed) material are dependent on the observed and expected frequencies in the
success category. This is because the total number of tubes is constrained to 9200. In other words, since
$p + q = 1$, then $q = 1 - p$. Once we know $p$, we will always know $q$. So, if we know the number of
successes and the total number of outcomes, then we are bound to know the number of failures. In other
words, the frequency of only one category can be varied freely, while the frequency in the other category is
fixed by the first. Thus, the frequency count in the failure category, i.e. the last category, provides no new
information. The degrees of freedom associated with this problem are $f = k - m - 1$, where $m$ is the
number of population parameters such as mean and variance calculated from the data. Since no
parameters are calculated for this problem, $m = 0$, and $f = k - 0 - 1 = 1$.
To clarify the point made above regarding the number of degrees of freedom, we recalculate the $\chi^2$ test
statistic using the definition of $\chi^2$ provided by Equation (4.32) as follows:
\[
\chi^2 = \frac{(x - np)^2}{npq},
\]
where x is the frequency of successes observed, np is the mean, npq is the variance of the binomial
distribution, and $z = (x - np)/\sqrt{npq}$. This statistic is associated with one degree of freedom since it has
only one $\chi^2$ term. Alternatively, we can use Equation (4.33) to calculate the test statistic. This equation has
two terms, which we write in terms of binomial variables as follows:
\[
\chi^2 = \frac{(o_1 - np)^2}{np} + \frac{(o_2 - nq)^2}{nq}.
\]
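Since $o_1 + o_2 = np + nq = n$,
\[
\begin{aligned}
\chi^2 &= \frac{(o_1 - np)^2}{np} + \frac{\bigl(n - o_1 - (n - np)\bigr)^2}{nq} \\
       &= \frac{(o_1 - np)^2}{np} + \frac{(o_1 - np)^2}{nq} \\
       &= \frac{(p + q)(o_1 - np)^2}{npq} \\
       &= \frac{(x - np)^2}{npq}.
\end{aligned}
\]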
Here, we have shown that Equations (4.32) and (4.33) are equivalent and that the test statistic is
associated with only one degree of freedom.
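The equivalence of the two forms can also be checked numerically for the tube example. The sketch below is illustrative only; the names chi2_one and chi2_two are assumptions introduced here, and the observed count of successes plays the role of both x and o_1.

```matlab
% Sketch: numerical check that the one-term and two-term forms agree.
n  = 9200;          % total number of tubes
p  = 0.85;          % hypothesized proportion of successes
q  = 1 - p;         % hypothesized proportion of failures
o1 = 8420;          % observed successes (x in Equation (4.32))
o2 = n - o1;        % observed failures, fixed once o1 is known

chi2_one = (o1 - n*p)^2 / (n*p*q);                       % form of Equation (4.32)
chi2_two = (o1 - n*p)^2 / (n*p) + (o2 - n*q)^2 / (n*q);  % form of Equation (4.33)

fprintf('one term: %.4f, two terms: %.4f\n', chi2_one, chi2_two);
```

Both values should match up to floating-point round-off, which illustrates why only one degree of freedom is available even though Equation (4.33) contains two terms.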
Now let’s calculate the p value associated with the calculated test statistic. We set the significance level
at 0.01, since we want to be certain whether a change in scrap rate has occurred:
>> p = 1 - chi2cdf(306.9, 1)
p =
     0
Since p < 0.01, we are very confident that the scrap rate has been lowered from 15%. In other words, the
success rate has increased. Note that, even though our hypothesis is non-directional, the direction of
the change is known because we can compare the observed and expected failure frequencies. Because there
are only two categories, it is obvious that if the frequencies in one category significantly increase, then
the frequencies in the second category have concomitantly decreased significantly. If there were more than
two categories, then the $\chi^2$ test would not tell us which category is significantly different from the others.
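The p value displayed as 0 above is a consequence of floating-point subtraction: the cumulative probability is so close to 1 that the difference underflows. The sketch below shows two ways to obtain the upper-tail probability directly. It assumes a MATLAB release in which chi2cdf accepts the 'upper' option, and it uses the identity that, for one degree of freedom only, the upper-tail probability equals erfc(sqrt(x/2)). The variable names are illustrative.

```matlab
% Sketch: upper-tail p value for a chi-square statistic with 1 degree of freedom.
% chi2cdf and chi2inv require the Statistics and Machine Learning Toolbox.
chi2stat = 306.9;   % test statistic from the tube example
f        = 1;       % degrees of freedom
alpha    = 0.01;    % significance level

p_sub   = 1 - chi2cdf(chi2stat, f);       % subtraction form, rounds to 0
p_upper = chi2cdf(chi2stat, f, 'upper');  % upper tail computed directly (assumes 'upper' is supported)
p_erfc  = erfc(sqrt(chi2stat / 2));       % identity valid only when f = 1

crit = chi2inv(1 - alpha, f);             % critical value at the 0.01 level, about 6.63

fprintf('p (1 - cdf) = %g, p (upper) = %g, p (erfc) = %g\n', p_sub, p_upper, p_erfc);
fprintf('statistic = %.1f, critical value = %.2f\n', chi2stat, crit);
```

The last two p values should be tiny but nonzero, and the statistic far exceeds the critical value, so the conclusion drawn in the text is unchanged.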
4.9 Chi-square tests for nominal scale data