These are found by minimizing the sum of squared residuals with respect to the parameter values. The least-squares solution vector, $\mathbf{b}$, is found by solving the normal equations, which in matrix form are

$$\mathbf{X}'\mathbf{X}\mathbf{b} = \mathbf{X}'\mathbf{y}.$$

The least-squares solution vector is therefore

$$\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}.$$
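As a quick numerical illustration, the normal equations can be solved directly with NumPy. The data below are invented purely for demonstration; any small data set with an intercept column would do.

```python
import numpy as np

# Hypothetical data: n = 5 observations, an intercept column plus one regressor.
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Solve the normal equations X'Xb = X'y for the least-squares vector b.
b = np.linalg.solve(X.T @ X, X.T @ y)

# np.linalg.lstsq reaches the same solution without forming X'X explicitly,
# which is numerically preferable for ill-conditioned design matrices.
b_check, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)
```

In practice one rarely inverts $\mathbf{X}'\mathbf{X}$ directly; solvers based on QR or SVD factorizations (as `lstsq` uses) are more stable, but the normal-equations form above mirrors the algebra in the text.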
Now we come full circle and answer the original question posed at the beginning
of this particular tutorial (see Section V.A): How do we find the expected value and
variance of a coefficient estimate in the linear regression model? In fact, let’s find the
expected value and variance of the entire vector of linear regression estimates. First,
we will assume that the X-values are fixed over repeated sampling. This fixed-X
assumption is a standard assumption in linear regression, although it is routinely
violated. Nevertheless, the results we present hold asymptotically regardless of the
nature of the X's (see, e.g., Greene, 2003). Moreover, we assume that we have a sample of $n$ observations and $p = K + 1$ regressors, including the equation intercept, so that $\mathbf{y}$ and $\boldsymbol{\varepsilon}$ have dimension $n \times 1$, $\mathbf{X}$ has dimension $n \times p$, and $\boldsymbol{\beta}$ has dimension $p \times 1$.
If $\mathbf{X}$ is fixed, the $p \times n$ matrix $(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$ is a matrix of constants. Call this matrix $\mathbf{A}$. Recall, in general, that if $\mathbf{A}$ is a matrix of constants, then $\mathbf{y} = \mathbf{A}\mathbf{x}$ is called a linear transformation of the vector $\mathbf{x}$, and $E(\mathbf{y}) = \mathbf{A}E(\mathbf{x})$, $V(\mathbf{y}) = \mathbf{A}\mathbf{V}\mathbf{A}'$. Now let $\mathbf{b}$ play the role of $\mathbf{y}$ here, and let $\mathbf{y}$ play the role of $\mathbf{x}$. Then $\mathbf{b} = \mathbf{A}\mathbf{y}$ and we have that

$$E(\mathbf{b}) = \mathbf{A}E(\mathbf{y}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E(\mathbf{y}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \mathbf{I}\boldsymbol{\beta} = \boldsymbol{\beta}.$$

This shows that the vector of estimates, $\mathbf{b}$, is unbiased for the parameter vector, $\boldsymbol{\beta}$. Now what about $V(\mathbf{b})$? First, we need to observe that if $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, then $V(\mathbf{y}) = V(\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}) = V(\boldsymbol{\varepsilon}) = \sigma^2\mathbf{I} = \mathbf{V}$. (The term $\mathbf{X}\boldsymbol{\beta}$ has no variance over repeated sampling, since $\mathbf{X}$ is fixed and $\boldsymbol{\beta}$ is also a collection of constants.) We then have
$$V(\mathbf{b}) = \mathbf{A}\mathbf{V}\mathbf{A}' = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\sigma^2\mathbf{I}\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{I} = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}.$$
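Both results can be checked by simulation. The sketch below holds an arbitrary design matrix fixed, draws fresh errors on each replication (true $\boldsymbol{\beta}$ and $\sigma^2$ are invented for the demonstration), and compares the empirical mean and covariance of the estimates against $\boldsymbol{\beta}$ and $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed design matrix (intercept plus one regressor) and invented true parameters.
X = np.column_stack([np.ones(4), np.array([2.0, 3.3, 3.9, 7.0])])
beta = np.array([1.0, 0.5])
sigma2 = 4.0

# A = (X'X)^{-1} X' is a matrix of constants because X is fixed.
A = np.linalg.inv(X.T @ X) @ X.T

# Repeated sampling: new errors each replication, the same X every time.
reps = 50_000
errs = rng.normal(0.0, np.sqrt(sigma2), size=(reps, 4))
ys = X @ beta + errs            # each row is one sampled y vector
bs = ys @ A.T                   # each row is b = A y for that sample

print(bs.mean(axis=0))          # should be close to beta (unbiasedness)
print(np.cov(bs.T))             # should be close to sigma2 * (X'X)^{-1}
```

With 50,000 replications the Monte Carlo averages typically agree with the theoretical values to two or three decimal places.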
Substituting the estimate of $\sigma^2$ [which is $\mathrm{SSE}/(n - K - 1)$] into this last expression gives us an estimate of the variance–covariance matrix of the regression parameter estimates.
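The full recipe, from estimates to estimated standard errors, can be sketched in a few lines; the six observations below are invented for illustration. We estimate $\sigma^2$ by $\mathrm{SSE}/(n - K - 1)$ and substitute it into $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$.

```python
import numpy as np

# Hypothetical sample: n = 6 observations, K = 1 regressor, p = K + 1 = 2 columns.
X = np.column_stack([np.ones(6), np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])])
y = np.array([1.2, 2.9, 3.1, 4.8, 5.9, 6.2])
n, p = X.shape
K = p - 1

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y            # least-squares estimates b = (X'X)^{-1} X'y
resid = y - X @ b                # residual vector
sse = resid @ resid              # sum of squared residuals (SSE)
s2 = sse / (n - K - 1)           # estimate of sigma^2

cov_b = s2 * XtX_inv             # estimated variance-covariance matrix of b
se = np.sqrt(np.diag(cov_b))     # standard errors of b0 and b1
print(cov_b)
print(se)
```

The diagonal of `cov_b` holds the estimated variances of the individual coefficients; its square roots are the standard errors reported by regression software.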
Application 3. Using matrix calculations to find the estimates of $b_0$ and $b_1$ in a simple linear regression. Just for practice, let's use the matrix expression for $\mathbf{b}$, $\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$, to calculate $\mathbf{b}$ for a simple linear regression model of four observations. It is then left as an exercise for the reader to verify that the same estimates are obtained using the traditional SLR formulas for the intercept and slope (see Chapter 2). The four X-values are 2, 3.3, 3.9, and 7. The four Y-values are, respectively, 5, 2,