PROPERTIES OF THE REGRESSION COEFFICIENTS
If this condition is not satisfied, the OLS regression coefficients will be inefficient, and you
should be able to obtain more reliable results by using a modification of the regression technique.
This will be discussed in Chapter 8.
Gauss–Markov Condition 3: $u_i$ Distributed Independently of $u_j$ ($i \neq j$)
This condition states that there should be no systematic association between the values of the
disturbance term in any two observations. For example, just because the disturbance term is large and
positive in one observation, there should be no tendency for it to be large and positive in the next (or
large and negative, for that matter, or small and positive, or small and negative). The values of the
disturbance term should be absolutely independent of one another.
The condition implies that $\sigma_{u_i u_j}$, the population covariance between $u_i$ and $u_j$, is 0, because

$$\sigma_{u_i u_j} = E[(u_i - \mu_u)(u_j - \mu_u)] = E(u_i u_j) = E(u_i)E(u_j) = 0 \qquad (3.14)$$

(Note that the population means of $u_i$ and $u_j$ are 0, by virtue of the first Gauss–Markov condition, and
that $E(u_i u_j)$ can be decomposed as $E(u_i)E(u_j)$ if $u_i$ and $u_j$ are generated independently; see the
Review chapter.)
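The result in (3.14) can be checked informally by simulation. The sketch below is an illustration only, not part of the formal argument: it draws one series of disturbances independently and another in which each value carries over part of the previous one, and compares the sample covariance between adjacent values. The normal distribution, the carry-over weight of 0.7, the seed, and the helper names are all assumptions made for the example.

```python
import random


def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def adjacent_covariance(u):
    """Covariance between each disturbance and the one that follows it."""
    return sample_covariance(u[:-1], u[1:])


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n = 100_000

# Case 1: disturbances drawn independently with mean 0 (Condition 3 holds).
independent_u = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Case 2: each disturbance carries over part of the previous one
# (Condition 3 fails). The weight 0.7 is an illustrative assumption.
rho = 0.7
dependent_u = [rng.gauss(0.0, 1.0)]
for _ in range(n - 1):
    dependent_u.append(rho * dependent_u[-1] + rng.gauss(0.0, 1.0))

cov_independent = adjacent_covariance(independent_u)
cov_dependent = adjacent_covariance(dependent_u)

print(f"independent draws: {cov_independent:.3f}")
print(f"dependent draws:   {cov_dependent:.3f}")
```

With independent draws the covariance between adjacent disturbances should be close to 0, differing from it only through sampling error. With the carry-over it should be clearly positive, which is the kind of systematic association that the condition rules out.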
If this condition is not satisfied, OLS will again give inefficient estimates. Chapter 13 discusses
the problems that arise and ways of getting around them.
Gauss–Markov Condition 4: u Distributed Independently of the Explanatory Variables
The final condition comes in two versions, weak and strong. The strong version is that the
explanatory variables should be nonstochastic, that is, not have random components. This is actually
very unrealistic for economic variables and we will eventually switch to the weak version of the
condition, where the explanatory variables are allowed to have random components provided that they
are distributed independently of the disturbance term. However, for the time being we will use the
strong version because it simplifies the analysis of the properties of the estimators.
It is not easy to think of truly nonstochastic variables, other than time, so the following example
is a little artificial. Suppose that we are relating earnings to schooling, $S$, in terms of highest grade
completed. Suppose that we know from the national census that 1 percent of the population have $S = 8$,
3 percent have $S = 9$, 5 percent have $S = 10$, 7 percent have $S = 11$, 43 percent have $S = 12$ (graduation
from high school), and so on. Suppose that we have decided to undertake a survey with sample size
1,000 and we want the sample to match the population as far as possible. We might then select what is
known as a stratified random sample, designed so that it includes 10 individuals with $S = 8$, 30
individuals with $S = 9$, and so on. The values of $S$ in the sample would then be predetermined and
therefore nonstochastic. Schooling and other demographic variables in large surveys drawn in such a
way as to be representative of the population as a whole, like the National Longitudinal Survey of
Youth, probably approximate this condition quite well.
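The stratified design described above can be sketched in a few lines. The example is a rough illustration under stated assumptions: it uses only the census shares quoted in the text, leaving the remaining strata ("and so on") out rather than guessing them, and the names `population_share`, `stratum_quotas`, and `schooling_values` are hypothetical. It shows that the $S$ values in the sample are fixed by the design, so two surveys drawn in this way contain exactly the same set of $S$ values even though different individuals are selected.

```python
# Population shares of highest grade completed, as quoted in the text.
# Only the strata given there are listed; the others are deliberately omitted.
population_share = {8: 0.01, 9: 0.03, 10: 0.05, 11: 0.07, 12: 0.43}
sample_size = 1000


def stratum_quotas(shares, n):
    """Number of individuals to select from each stratum."""
    return {s: round(share * n) for s, share in shares.items()}


def schooling_values(quotas):
    """The S column of the sample: fixed by the design, whoever is selected."""
    return [s for s, count in sorted(quotas.items()) for _ in range(count)]


quotas = stratum_quotas(population_share, sample_size)

# Two surveys drawn under the same design have identical S values.
first_survey = schooling_values(quotas)
second_survey = schooling_values(quotas)

print(quotas)
print(first_survey == second_survey)
```

The quotas for $S = 8$ and $S = 9$ come out as 10 and 30, matching the figures in the text. The randomness in such a survey lies in which individuals fill each quota, and hence in their earnings, not in the values of $S$.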
If this condition is satisfied, it follows that $\sigma_{X_i u_i}$, the population covariance between the
explanatory variable and the disturbance term, is 0. Since $E(u_i)$ is 0, and the term involving $X$ is