of multicollinearity, its diagnosis, and possible remedies. The chapter ends by con-
sidering two techniques designed to improve on OLS estimates in the presence of
severe collinearity: ridge regression and principal components regression.
MULTIPLE REGRESSION IN MATRIX NOTATION
The Model
In Section V of Appendix A, I outline the matrix representation of the multiple
regression model. Let's review the basic concepts covered there. Recall that the matrix
representation of the model for the $i$th observation is
$y_i = \mathbf{x}^i \boldsymbol{\beta} + \varepsilon_i$, where $\mathbf{x}^i$ is a $1 \times p$
vector of scores on the $p$ regressors in the model for the $i$th observation. Here,
$p = K + 1$, and the first regressor score is a "1" that serves as the regressor for the
intercept term. Further,
$\boldsymbol{\beta}$ is a $p \times 1$ vector of the parameters in the model, with the
first parameter being the intercept, $\beta_0$. $y_i$ and $\varepsilon_i$ are the $i$th
response score and the $i$th error term, as always. The matrix representation of the
model for all $n$ of the $y$ scores
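is $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$. Here, $\mathbf{y}$ is an
$n \times 1$ vector of response scores, $\mathbf{X}$ is an $n \times p$ matrix of the
regressor scores for all $n$ observations, and $\boldsymbol{\varepsilon}$ is an $n \times 1$
vector of equation errors for the $n$ observations. The $i$th row of $\mathbf{X}$ is, of
course, $\mathbf{x}^i$. As always, it is assumed that the errors have mean zero and
constant variance $\sigma^2$ and are uncorrelated with each other. These assumptions are
encapsulated in the notation $\boldsymbol{\varepsilon} \sim f(\mathbf{0}, \sigma^2 \mathbf{I})$.
This means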
that the errors have some density function, $f(\cdot)$ (typically assumed to be symmetric
about zero, but not necessarily normal except for small samples), with zero mean and
variance–covariance matrix $\sigma^2 \mathbf{I}$. (Readers possibly used to the notation
$\mathbf{x}_i'$ for the representation of the vector of regressor scores for the $i$th
case may find the notation $\mathbf{x}^i$ used in this book to be somewhat unusual.
However, in that the $i$th case's regressor values are contained in the $i$th row of the
$n \times p$ matrix of regressor values for all $n$ observations, and as I use the
superscript $i$ to denote row vectors, the use of $\mathbf{x}^i$ seems more appropriate.
Note that the $i$th case's collection of regressor values written as a column vector is
therefore denoted $\mathbf{x}^{i\prime}$ throughout the book.)
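As a concrete illustration of the layout just described, here is a minimal NumPy sketch that builds a regressor matrix with a leading column of ones and generates responses from the matrix form of the model. The sample size, the parameter values, the error standard deviation, and the normal error distribution are all hypothetical choices made for illustration; the model itself requires only zero-mean, constant-variance, uncorrelated errors.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, for reproducibility only

n, K = 200, 2                         # hypothetical sample size and regressor count
p = K + 1                             # p = K + 1: the K regressors plus the intercept

# X is n x p; its first column is the constant "1" attached to the intercept.
X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])

beta = np.array([1.0, 0.5, -2.0])     # hypothetical p x 1 parameter vector
sigma = 1.5                           # hypothetical error standard deviation

# Errors with mean zero, variance sigma^2, and no correlation across cases.
# Normality is a convenience here, not a requirement of the model.
eps = rng.normal(scale=sigma, size=n)

# All n equations at once: y = X beta + eps.
y = X @ beta + eps

# The same model for a single case: row i of X is the 1 x p vector of
# that case's regressor scores.
i = 4
x_row_i = X[i]
y_i = x_row_i @ beta + eps[i]
```

The single-case value `y_i` equals `y[i]`, which is just the statement that the $i$th row of the matrix equation is the equation for the $i$th observation.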
OLS Estimates
The vector of OLS estimates of the model parameters is denoted $\mathbf{b}$, and as noted in
Appendix A, its solution is $\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$.
In Chapter 2 I noted that $b_1$ in SLR was a weighted sum of the $y_i$ and therefore
normally distributed in large samples, due to the CLT. Similarly, each of the $b_k$ in the
multiple regression model is a weighted sum, or linear combination, of the $y_i$ and is
therefore also asymptotically normal.
This is readily seen by denoting the $p \times n$ matrix
$(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$ by the symbol $\mathbf{G}$, and its $k$th row
(where $k = 0, 1, \ldots, K$) as $\mathbf{g}^k$. Then the $k$th regression estimate has the
form $b_k = \mathbf{g}^k \mathbf{y}$. Assuming that the X's are fixed over repeated sampling
(the standard fixed-X assumption), this is nothing more than a weighted sum of the y's. The
estimates are unbiased for their theoretical counterparts, since, as shown in Appendix A,
$E(\mathbf{b}) = \boldsymbol{\beta}$. The variance–covariance matrix for $\mathbf{b}$,
denoted $V(\mathbf{b})$, is $\sigma^2 (\mathbf{X}'\mathbf{X})^{-1}$. The variances of the
parameter estimates lie on the diagonal of this matrix.
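The formulas in this subsection can be checked numerically. The following is a minimal NumPy sketch, not a recommended estimation routine: the simulated data, the parameter values, and the treatment of $\sigma$ as known are all assumptions made for illustration (in practice $\sigma^2$ must be estimated). It forms $\mathbf{G}$, recovers one estimate as a weighted sum of the y's, and builds the variance–covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(1)        # arbitrary seed, for reproducibility only

# Hypothetical data: n cases, K regressors plus an intercept column.
n, K = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])
beta = np.array([1.0, 0.5, -2.0])     # hypothetical true parameters
sigma = 1.5                           # treated as known, for illustration only
y = X @ beta + rng.normal(scale=sigma, size=n)

# G = (X'X)^{-1} X' is p x n, and b = G y is the vector of OLS estimates.
XtX_inv = np.linalg.inv(X.T @ X)
G = XtX_inv @ X.T
b = G @ y

# The kth estimate is the kth row of G applied to y: a weighted sum of the y's.
k = 1
b_k = G[k] @ y

# G X = I, which is why E(b) = G E(y) = G X beta = beta under fixed X.
GX = G @ X

# V(b) = sigma^2 (X'X)^{-1}; the variances lie on its diagonal.
V_b = sigma**2 * XtX_inv
std_errors = np.sqrt(np.diag(V_b))

# Cross-check against NumPy's least-squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The explicit inverse is used here only to mirror the algebra. With nearly collinear regressors, $\mathbf{X}'\mathbf{X}$ is close to singular, and a solver such as `np.linalg.lstsq` is the numerically safer choice.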