the t statistic is very small. Thus, from this cross-sectional analysis, we must conclude
that the grants had no effect on firm productivity. We will return to this example in
Chapter 9 and show how adding information from a prior year leads to a much differ-
ent conclusion.
Even in cases where the policy analysis does not involve assigning units to a con-
trol group and a treatment group, we must be careful to include factors that might be
systematically related to the binary independent variable of interest. A good example of
this is testing for racial discrimination. Race is something that is not determined by an
individual or by government administrators. In fact, race would appear to be the perfect
example of an exogenous explanatory variable, given that it is determined at birth.
However, for historical reasons, this is not the case: there are systematic differences in
backgrounds across race, and these differences can be important in testing for current
discrimination.
As an example, consider testing for discrimination in loan approvals. If we can col-
lect data on, say, individual mortgage applications, then we can define the dummy
dependent variable approved as equal to one if a mortgage application was approved,
and zero otherwise. A systematic difference in approval rates across races is an indica-
tion of discrimination. However, since approval depends on many other factors, includ-
ing income, wealth, credit ratings, and a general ability to pay back the loan, we must
control for them if there are systematic differences in these factors across race. A linear
probability model to test for discrimination might look like the following:
$$\mathit{approved} = \beta_0 + \beta_1 \mathit{nonwhite} + \beta_2 \mathit{income} + \beta_3 \mathit{wealth} + \beta_4 \mathit{credrate} + \text{other factors}.$$
Discrimination against minorities is indicated by a rejection of $H_0\colon \beta_1 = 0$ in favor of
$H_1\colon \beta_1 < 0$, because $\beta_1$ is the amount by which the probability of a nonwhite getting an
approval differs from the probability of a white getting an approval, given the same lev-
els of other variables in the equation. If income, wealth, and so on are systematically
different across races, then it is important to control for these factors in a multiple
regression analysis.
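The role of the controls can be illustrated with a small simulation. The sketch below is not based on any real mortgage data: the sample, the income gap between groups, and every coefficient are hypothetical values chosen for illustration, and the helper ols is a bare-bones solver written for this example. By construction the coefficient on nonwhite is zero, yet a regression that omits income attributes the income gap to race.

```python
import random

def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(0)
n = 20000
rows = []
for _ in range(n):
    nonwhite = 1 if random.random() < 0.3 else 0
    # Assumed (hypothetical) systematic income difference across groups.
    income = 5.0 - 1.0 * nonwhite + random.gauss(0, 1)
    # Approval probability depends on income only: beta_1 = 0 by construction.
    prob = min(1.0, max(0.0, 0.5 + 0.08 * (income - 5.0)))
    approved = 1 if random.random() < prob else 0
    rows.append((nonwhite, income, approved))

y = [a for _, _, a in rows]
# Linear probability model without and with the income control.
b_short = ols([[1.0, nw] for nw, _, _ in rows], y)
b_long = ols([[1.0, nw, inc] for nw, inc, _ in rows], y)

print("coefficient on nonwhite, income omitted:  ", round(b_short[1], 3))
print("coefficient on nonwhite, income included: ", round(b_long[1], 3))
```

In this simulated setting the first coefficient is clearly negative while the second is close to zero. With real data the comparison can go either way; the point is only that a test of $H_0\colon \beta_1 = 0$ is informative about discrimination once the relevant factors are held fixed.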
Another problem that often arises in policy and program evaluation is that individ-
uals (or firms or cities) choose whether or not to participate in certain behaviors or pro-
grams. For example, individuals choose to use illegal drugs or drink alcohol. If we want
to examine the effects of such behaviors on unemployment status, earnings, or criminal
behavior, we should be concerned that drug usage might be correlated with other fac-
tors that can affect employment and criminal outcomes. Children eligible for programs
such as Head Start participate based on parental decisions. Since family background
plays a role in Head Start decisions and affects student outcomes, we should control for
these factors when examining the effects of Head Start [see, for example, Currie and
Thomas (1995)]. Individuals selected by employers or government agencies to partici-
pate in job training programs can participate or not, and this decision is unlikely to be
random [see, for example, Lynch (1991)]. Cities and states choose whether to imple-
ment certain gun control laws, and it is likely that this decision is systematically related
to other factors that affect violent crime [see, for example, Kleck and Patterson (1993)].
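A simulated example may help fix ideas about why nonrandom participation matters. Everything in the sketch below is hypothetical: the binary background indicator, the participation rates, the outcome equation, and the true program effect of 2 points are assumptions made for illustration, not estimates from Head Start or any other program. Participation is more likely for units with the less favorable background, so a raw comparison of participants and nonparticipants mixes the program effect with the background difference.

```python
import random

random.seed(1)
n = 20000
TRUE_EFFECT = 2.0  # assumed program effect on the outcome

data = []
for _ in range(n):
    high = 1 if random.random() < 0.5 else 0  # hypothetical background indicator
    # Self-selection: participation is not random, it depends on background.
    part = 1 if random.random() < (0.2 if high else 0.7) else 0
    score = 50.0 + 10.0 * high + TRUE_EFFECT * part + random.gauss(0, 5)
    data.append((high, part, score))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def gap(records):
    """Mean outcome of participants minus mean outcome of nonparticipants."""
    return (mean(s for _, p, s in records if p == 1)
            - mean(s for _, p, s in records if p == 0))

# Raw comparison that ignores background.
naive = gap(data)

# Compare within each background group, then average using group shares.
within = [gap([r for r in data if r[0] == h]) for h in (0, 1)]
shares = [mean(1.0 if r[0] == h else 0.0 for r in data) for h in (0, 1)]
adjusted = sum(w * g for w, g in zip(shares, within))

print("raw participant/nonparticipant gap:", round(naive, 2))
print("gap holding background fixed:      ", round(adjusted, 2))
```

Here the raw gap has the wrong sign, while the comparison that holds background fixed is close to the assumed effect. This works only because the factor driving participation is observed in the simulation; when the relevant factors are not observed, controlling for them in this way is not possible.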
The previous paragraph gives examples of what are generally known as self-
selection problems in economics. Literally, the term comes from the fact that individu-
als self-select into certain behaviors or programs: participation is not randomly deter-
Chapter 7 Multiple Regression Analysis With Qualitative Information: Binary (or Dummy) Variables