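10.5.5 Refer to Exercise 10.3.5 and let $x_{1j} = 90$ and $x_{2j} = 80$.
10.5.6 Refer to Exercise 10.3.6 and let $x_{1j} = 50$, $x_{2j} = 95.0$, $x_{3j} = 2.00$, $x_{4j} = 6.00$, $x_{5j} = 75$, and $x_{6j} = 70$.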
10.6 THE MULTIPLE CORRELATION MODEL
We pointed out in the preceding chapter that while regression analysis is concerned with
the form of the relationship between variables, the objective of correlation analysis is to
gain insight into the strength of the relationship. This is also true in the multivariable
case, and in this section we investigate methods for measuring the strength of the
relationship among several variables. First, however, let us define the model and
assumptions on which our analysis rests.
The Model Equation We may write the correlation model as

$$y_j = \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \cdots + \beta_k x_{kj} + \epsilon_j \qquad (10.6.1)$$

where $y_j$ is a typical value from the population of values of the variable Y, the $\beta$'s are
the regression coefficients defined in Section 10.2, and the $x_{ij}$ are particular (known)
values of the random variables $X_i$. This model is similar to the multiple regression model,
but there is one important distinction. In the multiple regression model, given in
Equation 10.2.1, the $X_i$ are nonrandom variables, but in the multiple correlation model the
$X_i$ are random variables. In other words, in the correlation model there is a joint
distribution of Y and the $X_i$ that we call a multivariate distribution. Under this model, the
variables are no longer thought of as being dependent or independent, since logically they
are interchangeable and any of the $X_i$ may play the role of Y.
Typically, random samples of units of association are drawn from a population of
interest, and measurements of Y and the $X_i$ are made.
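The sampling scheme just described can be illustrated with a small simulation. The following is only a sketch under stated assumptions: two predictors, a trivariate normal population, and hypothetical parameter values (`beta`, `rho`, `sigma`) that do not come from the text. The point it shows is that every unit yields a random value of Y and of each $X_i$, so none of the measurements is fixed in advance by the investigator.

```python
import math
import random


def draw_sample(n, beta=(1.0, 2.0, -0.5), rho=0.6, sigma=1.0, seed=0):
    """Draw n units (y, x1, x2) from a trivariate normal population.

    All parameter values are hypothetical and chosen only for illustration.
    x1 and x2 are standard normal with correlation rho; y is a linear
    function of both plus an independent normal error term.
    """
    rng = random.Random(seed)
    b0, b1, b2 = beta
    sample = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z1                                       # random, not preset
        x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2  # correlated with x1
        y = b0 + b1 * x1 + b2 * x2 + rng.gauss(0, sigma)
        sample.append((y, x1, x2))
    return sample


def pearson_r(u, v):
    """Simple (two-variable) sample correlation coefficient."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


if __name__ == "__main__":
    ys, x1s, x2s = zip(*draw_sample(500))
    # Any pair of variables can be correlated; none has a privileged role.
    print(pearson_r(ys, x1s), pearson_r(ys, x2s), pearson_r(x1s, x2s))
```

Because the three variables are linear combinations of independent normal draws, their joint distribution is multivariate normal, which is the assumption required for the inferences discussed in this section.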
A least-squares plane or hyperplane is fitted to the sample data by methods
described in Section 10.3, and the same uses may be made of the resulting equation.
Inferences may be made about the population from which the sample was drawn if it
can be assumed that the underlying distribution is normal, that is, if it can be assumed
that the joint distribution of Y and the $X_i$ is a multivariate normal distribution. In addition,
sample measures of the degree of the relationship among the variables may be computed
and, under the assumption that sampling is from a multivariate normal distribution, the
corresponding parameters may be estimated by means of confidence intervals, and
hypothesis tests may be carried out. Specifically, we may compute an estimate of the
multiple correlation coefficient that measures the dependence between Y and the $X_i$. This
is a straightforward extension of the concept of correlation between two variables that we
discussed in Chapter 9. We may also compute partial correlation coefficients that measure
the intensity of the relationship between any two variables when the influence of all other
variables has been removed.
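As a rough illustration of these two sample measures, the sketch below handles only the special case of two predictors, where both the multiple correlation coefficient and the first-order partial correlation coefficients can be written in terms of the three simple correlations. The function names (`pearson_r`, `multiple_r`, `partial_r`) and the small data set are hypothetical and serve only to show the arithmetic; with more than two predictors one would work from the fitted regression instead.

```python
import math


def pearson_r(u, v):
    """Simple (two-variable) sample correlation coefficient."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def multiple_r(y, x1, x2):
    """Multiple correlation of y with x1 and x2 (two-predictor case only).

    Uses R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2),
    which requires that x1 and x2 are not perfectly correlated.
    """
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    r_sq = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(max(r_sq, 0.0))  # guard against tiny negative rounding


def partial_r(a, b, c):
    """Correlation between a and b after removing the influence of c."""
    r_ab, r_ac, r_bc = pearson_r(a, b), pearson_r(a, c), pearson_r(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


if __name__ == "__main__":
    # Hypothetical measurements on four units of association.
    y = [0, 2, 1, 4]
    x1 = [-1, 1, -1, 1]
    x2 = [-1, -1, 1, 1]
    print("R       =", multiple_r(y, x1, x2))
    print("r_y1.2  =", partial_r(y, x1, x2))
    print("r_y2.1  =", partial_r(y, x2, x1))
```

The multiple correlation coefficient computed this way is never smaller in absolute value than either simple correlation of Y with a single predictor, which is consistent with its role as a measure of the dependence between Y and the predictors taken together.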
The Multiple Correlation Coefficient As a first step in analyzing
the relationships among the variables, we look at the multiple correlation coefficient.
506 CHAPTER 10 MULTIPLE REGRESSION AND CORRELATION