MULTIPLE REGRESSION ANALYSIS
Exercises
4.1
The result of fitting an educational attainment function, regressing S on ASVABC, SM, and SF,
using EAEF Data Set 21, is shown below. SM and SF are the years of schooling (highest grade
completed) of the respondent’s mother and father, respectively. Give an interpretation of the
regression coefficients.
. reg S ASVABC SM SF

  Source |       SS       df       MS                  Number of obs =     570
---------+------------------------------               F(  3,   566) =  110.83
   Model |  1278.24153     3  426.080508               Prob > F      =  0.0000
Residual |  2176.00584   566  3.84453329               R-squared     =  0.3700
---------+------------------------------               Adj R-squared =  0.3667
   Total |  3454.24737   569  6.07073351               Root MSE      =  1.9607

------------------------------------------------------------------------------
       S |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  ASVABC |   .1295006   .0099544     13.009   0.000       .1099486    .1490527
      SM |    .069403   .0422974      1.641   0.101       -.013676     .152482
      SF |   .1102684   .0311948      3.535   0.000       .0489967    .1715401
   _cons |   4.914654   .5063527      9.706   0.000       3.920094    5.909214
------------------------------------------------------------------------------
4.2
Fit an educational attainment function parallel to that in Exercise 4.1, using your EAEF data set,
and give an interpretation of the coefficients.
4.3
Fit an earnings function parallel to that in Section 4.1, using your EAEF data set, and give an
interpretation of the coefficients.
4.4
Using your EAEF data set, make a graphical representation of the relationship between S and
SM using the technique described above, assuming that the true model is as in Exercise 4.2. To
do this, regress S on ASVABC and SF and save the residuals. Then regress SM on ASVABC and SF
and save those residuals as well. Plot the S residuals against the SM residuals. Also regress the
former on the latter, and verify that the slope coefficient is the same as that obtained for SM in
Exercise 4.2.
4.5*
Explain why the intercept in the regression of EEARN on ES is equal to 0.
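The residual-on-residual procedure of Exercise 4.4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are simulated stand-ins for S, ASVABC, SM, and SF (the EAEF data are not used), the coefficient values used to generate them are arbitrary, and the helper functions solve, ols, and residuals are hypothetical names introduced here. Ordinary least squares is computed from the normal equations.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(y, columns):
    """Return OLS coefficients [intercept, slope_1, ...] for y regressed on the columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    return solve(XtX, Xty)

def residuals(y, columns):
    """Return the residuals from the OLS regression of y on the columns."""
    b = ols(y, columns)
    return [y[i] - b[0] - sum(b[j + 1] * col[i] for j, col in enumerate(columns))
            for i in range(len(y))]

random.seed(0)
n = 500

# Simulated stand-ins for the EAEF variables (assumed values, not real data).
ASVABC = [random.gauss(50, 10) for _ in range(n)]
SF = [random.gauss(12, 3) for _ in range(n)]
SM = [0.5 * SF[i] + random.gauss(6, 2) for i in range(n)]  # correlated with SF
S = [5 + 0.13 * ASVABC[i] + 0.07 * SM[i] + 0.11 * SF[i] + random.gauss(0, 2)
     for i in range(n)]

# Multiple regression of S on ASVABC, SM, and SF.
full = ols(S, [ASVABC, SM, SF])

# Purge S and SM of the effects of ASVABC and SF, then regress residual on residual.
es = residuals(S, [ASVABC, SF])
esm = residuals(SM, [ASVABC, SF])
partial = ols(es, [esm])

print("coefficient of SM in the multiple regression:", full[2])
print("slope in the residual regression:            ", partial[1])
print("intercept in the residual regression:        ", partial[0])
```

The two slope values printed agree up to rounding error, and the intercept of the residual regression is zero up to rounding error. Plotting es against esm gives the graphical representation asked for in the exercise.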
4.3 Properties of the Multiple Regression Coefficients
As in the case of simple regression analysis, the regression coefficients should be thought of as special
kinds of random variables whose random components are attributable to the presence of the
disturbance term in the model. Each regression coefficient is calculated as a function of the values of
Y and the explanatory variables in the sample, and Y in turn is determined by the explanatory variables
and the disturbance term. It follows that the regression coefficients are really determined by the
values of the explanatory variables and the disturbance term and that their properties depend critically
upon the properties of the latter.
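The point that a regression coefficient is a random variable can be illustrated with a small simulation. The sketch below is hypothetical: the true parameter values, the sample size, and the normally distributed disturbance term are assumptions chosen for illustration. The values of the explanatory variables are held fixed, a fresh set of disturbances is drawn for each sample, and the slope coefficient of the first explanatory variable is recomputed each time using the standard two-regressor OLS formula.

```python
import random
import statistics

def cov(a, b):
    """Sample covariance of a and b (divisor n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def slope_b2(y, x2, x3):
    """OLS estimate of the coefficient of x2 in a regression of y on x2 and x3."""
    num = cov(x2, y) * cov(x3, x3) - cov(x3, y) * cov(x2, x3)
    den = cov(x2, x2) * cov(x3, x3) - cov(x2, x3) ** 2
    return num / den

random.seed(1)
N_OBS, N_SAMPLES = 50, 1000
BETA1, BETA2, BETA3 = 2.0, 0.5, -0.3  # assumed true parameters

# The explanatory variables are generated once and held fixed across samples.
x2 = [random.uniform(0, 10) for _ in range(N_OBS)]
x3 = [random.uniform(0, 10) for _ in range(N_OBS)]

estimates = []
for _ in range(N_SAMPLES):
    # Only the disturbance term changes from one sample to the next.
    u = [random.gauss(0, 1) for _ in range(N_OBS)]
    y = [BETA1 + BETA2 * x2[i] + BETA3 * x3[i] + u[i] for i in range(N_OBS)]
    estimates.append(slope_b2(y, x2, x3))

mean_b2 = statistics.mean(estimates)
sd_b2 = statistics.stdev(estimates)
print("true value of the coefficient:", BETA2)
print("mean of the estimates:        ", mean_b2)
print("standard deviation:           ", sd_b2)
```

Because the explanatory variables are the same in every sample, all of the variation in the estimates comes from the disturbance term, which is why the properties of the coefficients depend on the properties of the disturbance term.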
We shall continue to assume that the Gauss–Markov conditions are satisfied, namely (1) that the
expected value of u in any observation is 0, (2) that the population variance of its distribution is the
same for all observations, (3) that the population covariance of its values in any two observations is 0,
and (4) that it is distributed independently of any explanatory variable. The first three conditions are