CHAPTER 17
TEACHING NOTES
I emphasize to the students that, first and foremost, the reason we use the probit and logit models
is to obtain more reasonable functional forms for the response probability. Once we move to a
nonlinear model with a fully specified conditional distribution, it makes sense to use the efficient
estimation procedure, maximum likelihood. It is important to spend some time on interpreting
probit and logit estimates. In particular, the students should know the rules of thumb for
comparing probit, logit, and LPM estimates. Beginners sometimes mistakenly think that,
because the probit and especially the logit estimates are much larger than the LPM estimates, the
explanatory variables now have larger estimated effects on the response probabilities than in the
LPM case. This may or may not be true.
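When I want to make the rules of thumb concrete, a short simulation works well. The sketch below (Python with numpy and statsmodels; the data and coefficient values are simulated and purely illustrative) fits the LPM, probit, and logit to the same binary response and then applies the usual scaling: divide the logit slope by about 4 and the probit slope by about 2.5 before comparing with the LPM slope.

    import numpy as np
    import statsmodels.api as sm

    # Simulated binary response whose true probability has a logit form.
    rng = np.random.default_rng(0)
    n = 2000
    x = rng.normal(size=n)
    X = sm.add_constant(x)
    p = 1.0 / (1.0 + np.exp(-(-0.25 + 0.8 * x)))
    y = rng.binomial(1, p)

    b_lpm = sm.OLS(y, X).fit().params[1]
    b_probit = sm.Probit(y, X).fit(disp=0).params[1]
    b_logit = sm.Logit(y, X).fit(disp=0).params[1]

    # The raw probit and logit slopes are much larger than the LPM slope,
    # but the rule-of-thumb rescaling puts them on roughly the same footing.
    print("LPM slope:   ", round(b_lpm, 3))
    print("probit / 2.5:", round(b_probit / 2.5, 3))
    print("logit / 4:   ", round(b_logit / 4, 3))

The scaling factors are only rough guides; computing average partial effects is a cleaner way to compare the three sets of estimates.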
I view the Tobit model, when properly applied, as improving functional form for corner solution
outcomes. In most cases it is wrong to view a Tobit application as a data-censoring problem
(unless there is true data censoring in collecting the data or because of institutional constraints).
For example, in using survey data to estimate the demand for a new product, say a safer pesticide
to be used in farming, some farmers will demand zero at the going price, while some will
demand positive pounds per acre. There is no data censoring here; some farmers find it optimal
to use none of the new pesticide. The Tobit model provides more realistic functional forms for
E(y|x) and E(y|y > 0,x) than a linear model for y. With the Tobit model, students may be tempted
to compare the Tobit estimates with those from the linear model and conclude that the Tobit
estimates imply larger effects for the independent variables. But, as with probit and logit, the
Tobit estimates must be scaled down to be comparable with OLS estimates in a linear model.
(See Equation (17.27); for an example, see Computer Exercise 17.10.)
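Because the scaling point is easy to demonstrate numerically, I sometimes show a comparison like the one sketched below. Statsmodels has no built-in Tobit, so the log likelihood is coded directly; the data are simulated, and I assume the relevant adjustment factor for E(y|x) is Phi(x*beta_hat/sigma_hat) evaluated at the sample means of the regressors (see equation (17.27) for the exact expression used in the text).

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    # Simulated corner-solution outcome: y = max(0, x*beta + u).
    rng = np.random.default_rng(0)
    n = 1000
    x = rng.normal(size=(n, 1))
    X = np.column_stack([np.ones(n), x])
    y_star = X @ np.array([-0.5, 1.0]) + rng.normal(size=n)
    y = np.maximum(y_star, 0.0)

    def neg_loglik(params):
        beta, sigma = params[:-1], np.exp(params[-1])
        xb = X @ beta
        ll_pos = norm.logpdf(y, loc=xb, scale=sigma)   # density for y > 0
        ll_zero = norm.logcdf(-xb / sigma)             # P(y = 0 | x)
        return -np.where(y > 0, ll_pos, ll_zero).sum()

    res = minimize(neg_loglik, x0=np.zeros(X.shape[1] + 1), method="BFGS")
    beta_hat, sigma_hat = res.x[:-1], np.exp(res.x[-1])

    # Scale factor Phi(xbar*beta_hat/sigma_hat) at the sample means of x.
    scale = norm.cdf(X.mean(axis=0) @ beta_hat / sigma_hat)
    ols_b = np.linalg.lstsq(X, y, rcond=None)[0]
    print("Tobit slope:             ", round(beta_hat[1], 3))
    print("Scaled Tobit slope:      ", round(scale * beta_hat[1], 3))
    print("OLS slope (linear model):", round(ols_b[1], 3))

The scaled Tobit slope, not the raw one, is the number to put next to the OLS estimate.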
Poisson regression with an exponential conditional mean is used primarily to improve over a
linear functional form for E(y|x). The parameters are easy to interpret as semi-elasticities or
elasticities. If the Poisson distributional assumption is correct, we can use the Poisson
distribution to compute probabilities, too. But overdispersion is often present in count regression
models, and standard errors and likelihood ratio statistics should be adjusted to reflect this.
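A short example can illustrate both the semi-elasticity interpretation and the overdispersion adjustment. The sketch below (Python with statsmodels; simulated data with a multiplicative gamma term to induce overdispersion, so all names and numbers are illustrative) fits the exponential-mean model and then reports standard errors rescaled by a Pearson-based dispersion estimate, which is in the spirit of, though not necessarily identical to, the adjustment described in the chapter.

    import numpy as np
    import statsmodels.api as sm

    # Simulated counts with exponential conditional mean E(y|x) = exp(b0 + b1*educ).
    rng = np.random.default_rng(0)
    n = 1000
    educ = rng.normal(12.0, 2.0, size=n)
    X = sm.add_constant(educ)
    mean_y = np.exp(-1.0 + 0.15 * educ)
    v = rng.gamma(shape=1.0, scale=1.0, size=n)   # unobserved heterogeneity -> overdispersion
    y = rng.poisson(mean_y * v)

    pois = sm.GLM(y, X, family=sm.families.Poisson())
    fit_mle = pois.fit()             # usual Poisson MLE standard errors
    fit_adj = pois.fit(scale="X2")   # same point estimates, SEs inflated by Pearson dispersion

    # 100*b1 is roughly the percent change in E(y|x) from one more year of educ.
    print("b1 (semi-elasticity):   ", round(fit_mle.params[1], 3))
    print("MLE standard error:     ", round(fit_mle.bse[1], 4))
    print("Adjusted standard error:", round(fit_adj.bse[1], 4))

The point estimates do not change; only the standard errors (and anything built from them, such as t statistics) do, and the same dispersion estimate can be used to rescale likelihood ratio statistics.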
Some reviewers of the first edition complained about either the inclusion of this material or its
location within the chapter. I think applications of count data models are on the rise: in
microeconometric fields such as criminology, health economics, and industrial organization,
many interesting response variables come in the form of counts. One suggestion was that
Poisson regression should not come between the Tobit model in Section 17.2 and Section 17.4,
on censored and truncated regression. In fact, I put the Poisson regression model between these
two topics on purpose: I hope it helps emphasize that the material in Section 17.2 is purely about
functional form, as is Poisson regression. Sections 17.4 and 17.5 deal with underlying linear
models, but where there is a data-observability problem.
Censored regression, truncated regression, and incidental truncation are used for missing data
problems. Censored and truncated data sets usually result from sample design, as in duration
analysis. Incidental truncation often arises from self-selection into a certain state, such as
employment or participation in a training program. It is important to emphasize to students that