Censored Regression Models
While censored regression models can be defined without distributional assumptions,
in this subsection, we study the censored normal regression model. The variable we
would like to explain, y, follows the classical linear model. For emphasis, we put an i
subscript on a random draw from the population:
\[
y_i = \beta_0 + \mathbf{x}_i \boldsymbol{\beta} + u_i, \quad u_i \mid \mathbf{x}_i, c_i \sim \mathrm{Normal}(0, \sigma^2) \tag{17.33}
\]
\[
w_i = \min(y_i, c_i). \tag{17.34}
\]
We do not always observe \(y_i\): we observe it only if it is less than a censoring value,
\(c_i\); otherwise, we observe \(c_i\). Notice that (17.33) includes the assumption that \(u_i\)
is independent of \(c_i\) (at least once we condition on the \(\mathbf{x}_i\)). (For concreteness,
we explicitly consider censoring from above, or right censoring; the problem of censoring
from below, or left censoring, is handled similarly.)
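To make the data-generating process concrete, the following is a minimal simulation sketch of a right-censored sample with a single regressor. The parameter values, the uniform distribution of the regressor, the constant threshold, and the function name `simulate_censored` are illustrative assumptions, not part of the model itself.

```python
import random

def simulate_censored(n, beta0, beta1, sigma, c, seed=0):
    """Draw n triples (x_i, w_i, censored_i) from a right-censored normal model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)     # assumed regressor distribution
        u = rng.gauss(0.0, sigma)      # error: Normal(0, sigma^2), independent of x and c
        y = beta0 + beta1 * x + u      # latent outcome, as in (17.33)
        w = min(y, c)                  # observed outcome, as in (17.34)
        rows.append((x, w, y >= c))    # flag marks the censored draws
    return rows

# Hypothetical parameter values chosen only for illustration
data = simulate_censored(n=1000, beta0=1.0, beta1=0.5, sigma=1.0, c=4.0)
share_censored = sum(flag for _, _, flag in data) / len(data)
print(f"share censored: {share_censored:.2f}")
```

Every censored draw is recorded at the threshold, so the observed outcome never exceeds it.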
One example of right data censoring is
top coding. When a variable is top coded,
we know its value only up to a certain
threshold. For responses greater than the
threshold, we only know that the variable
is at least as large as the threshold. For
example, in some surveys, family wealth
is top coded. Suppose that respondents are asked their wealth, but people are allowed
to respond with “more than $500,000.” Then, we observe actual wealth for those
respondents whose wealth is less than $500,000 but not for those whose wealth is
greater than $500,000. In this case, the censoring threshold, \(c_i\), is the same for all \(i\). In
many situations, the censoring threshold changes with individual or family characteristics.
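As a small illustration of top coding, the sketch below applies the $500,000 threshold from the survey example to a handful of made-up wealth figures. The helper `top_code` and the response values are hypothetical.

```python
TOP_CODE = 500_000  # threshold from the survey example

def top_code(wealth, threshold=TOP_CODE):
    """Return (recorded value, censored flag) for one survey response."""
    if wealth > threshold:
        return threshold, True     # we learn only that wealth exceeds the threshold
    return wealth, False           # actual wealth is recorded

responses = [120_000, 480_000, 750_000, 2_300_000]   # hypothetical wealth figures
recorded = [top_code(v) for v in responses]
print(recorded)
# [(120000, False), (480000, False), (500000, True), (500000, True)]
```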
If we observed a random sample for \((\mathbf{x}, y)\), we would simply estimate \(\boldsymbol{\beta}\) by OLS, and
statistical inference would be standard. (We again absorb the intercept into \(\mathbf{x}\) for
simplicity.) The censoring causes problems. Using arguments similar to those for the Tobit
model, an OLS regression using only the uncensored observations (that is, those with
\(y_i < c_i\)) produces inconsistent estimators of the \(\beta_j\). An OLS regression of \(w_i\) on
\(\mathbf{x}_i\), using all observations, does not consistently estimate the \(\beta_j\), unless there is
no censoring. The statistical issue resembles the Tobit case, but the underlying problem is
quite different. In the Tobit model, we are modeling economic behavior, which often yields
zero outcomes; the Tobit model is supposed to reflect this. With censored regression, we
have a data collection problem because, for some reason, the data are censored.
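A small simulation can illustrate the two inconsistency results. The sketch below assumes a single regressor, a constant threshold, and particular parameter values; all of these, and the helper `ols_slope`, are choices made only for illustration. It compares the slope from the infeasible regression on the latent outcome with the slopes from the two feasible OLS regressions.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple OLS regression of ys on xs, with an intercept."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(42)
beta0, beta1, sigma, c = 1.0, 0.5, 1.0, 4.0    # assumed true values and threshold
n = 20000
xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]   # latent y_i
ws = [min(y, c) for y in ys]                                   # observed w_i

slope_latent = ols_slope(xs, ys)    # infeasible benchmark using the latent y_i
slope_all = ols_slope(xs, ws)       # w_i on x_i, all observations
uncensored = [(x, y) for x, y in zip(xs, ys) if y < c]
slope_unc = ols_slope([x for x, _ in uncensored], [y for _, y in uncensored])

print(f"latent y:        {slope_latent:.3f}")
print(f"all obs, w on x: {slope_all:.3f}")
print(f"uncensored only: {slope_unc:.3f}")
```

In this design, both feasible regressions give slopes well below the value used to generate the data, while the benchmark regression recovers it closely.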
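Under the assumptions in (17.33) and (17.34), we can estimate \(\boldsymbol{\beta}\) (and \(\sigma^2\)) by
maximum likelihood, given a random sample on \((\mathbf{x}_i, w_i)\). For this, we need the density
of \(w_i\), given \((\mathbf{x}_i, c_i)\). For uncensored observations, \(w_i = y_i\), and the density of
\(w_i\) is the same as that for \(y_i\): \(\mathrm{Normal}(\mathbf{x}_i \boldsymbol{\beta}, \sigma^2)\). For censored
observations, we need the probability that \(w_i\) equals the censoring value, \(c_i\), given \(\mathbf{x}_i\):
\[
P(w_i = c_i \mid \mathbf{x}_i) = P(y_i \ge c_i \mid \mathbf{x}_i) = P(u_i \ge c_i - \mathbf{x}_i \boldsymbol{\beta}) = 1 - \Phi[(c_i - \mathbf{x}_i \boldsymbol{\beta})/\sigma].
\]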
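The two pieces just described, the normal density for uncensored observations and the upper-tail probability for censored ones, combine into a log likelihood. The following is a minimal sketch for a single regressor with an intercept; the function name `censored_loglik`, the simulated data, and the parameter values are assumptions made for illustration. The upper-tail probability is computed with `math.erfc`, using the identity that one minus the standard normal cdf at z equals 0.5 erfc(z/sqrt(2)).

```python
import math
import random

def censored_loglik(beta0, beta1, sigma, data):
    """Log likelihood of a right-censored normal regression.

    data is a list of (x_i, w_i, c_i) triples; w_i < c_i marks an uncensored draw.
    """
    total = 0.0
    for x, w, c in data:
        mean = beta0 + beta1 * x
        if w < c:
            # Uncensored: log of the Normal(mean, sigma^2) density at w_i
            z = (w - mean) / sigma
            total += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        else:
            # Censored: log of 1 - Phi((c_i - mean) / sigma)
            z = (c - mean) / sigma
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
    return total

# Simulated sample under assumed parameter values
rng = random.Random(7)
true_beta0, true_beta1, true_sigma, c = 1.0, 0.5, 1.0, 4.0
data = []
for _ in range(2000):
    x = rng.uniform(0.0, 10.0)
    y = true_beta0 + true_beta1 * x + rng.gauss(0.0, true_sigma)
    data.append((x, min(y, c), c))

# Crude check: search a grid of slopes, holding the other parameters at their true values
grid = [round(0.30 + 0.05 * k, 2) for k in range(9)]    # 0.30, 0.35, ..., 0.70
best_beta1 = max(grid, key=lambda b: censored_loglik(true_beta0, b, true_sigma, data))
print(best_beta1)
```

The grid search is only a check on the function. In practice, the negative of this log likelihood would be passed to a numerical optimizer over all of the parameters jointly.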
QUESTION 17.5
Let \(\mathit{mvp}_i\) be the marginal value product for worker \(i\); this is the price
of a firm’s good multiplied by the marginal product of the worker.
Assume \(\mathit{mvp}_i\) is a linear function of exogenous variables, such as
education, experience, and so on, plus an unobservable error. Under perfect
competition and without institutional constraints, each worker is paid his or
her marginal value product. Let \(\mathit{minwage}_i\) denote the minimum wage for
worker \(i\), which varies by state. We observe \(\mathit{wage}_i\), which is the larger
of \(\mathit{mvp}_i\) and \(\mathit{minwage}_i\).
Write the appropriate model for the observed wage.