These are then used to obtain revised estimates of $\gamma_0$ and $\gamma_1$, since

$$\hat{\gamma}_0 = g_0^{(0)} + b_0^{(0)}, \qquad \hat{\gamma}_1 = g_1^{(0)} + b_1^{(0)}.$$
We then take $\hat{\gamma}_0$ and $\hat{\gamma}_1$ as our new starting values for equation (5.16) and reestimate the equation, again using OLS. We continue in this fashion, each time updating our old estimates, plugging the updates into equation (5.16), and reestimating the parameters until the difference between successive coefficient estimates and/or the difference in successive SSEs becomes negligible. If the procedure is working correctly, SSE should continue to get smaller with each successive iteration. At the present time, many software programs for nonlinear regression (e.g., SAS) require the user to supply the expressions for the first partial derivatives of the model with respect to the parameters, as well as the starting values for the parameters, in order for the program to run. Assuming that the assumptions on the errors are valid, the resulting parameter estimates are approximately efficient, unbiased, and normally distributed in large samples. This means that for large n, the usual regression test statistics are applicable.
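As a minimal sketch of the iterative procedure just described (not the book's SAS code), the following assumes the exponential model with additive errors, $y = \gamma_0 \exp(\gamma_1 x) + \varepsilon$, and hypothetical function and variable names. Each iteration regresses the current residuals on the two partial derivatives of the model (the linearization in equation (5.16)), and the OLS coefficients are added to the old estimates to form the updates:

```python
import numpy as np

def gauss_newton_exp(x, y, g0, g1, max_iter=50, tol=1e-8):
    """Gauss-Newton estimation of y = g0 * exp(g1 * x) + error.

    Each pass linearizes the model around the current estimates,
    regresses the residuals on the partial derivatives via OLS,
    and adds the resulting coefficients to the old estimates
    (new estimate = old estimate + OLS increment). Iteration stops
    when successive SSEs differ negligibly.
    """
    g = np.array([g0, g1], dtype=float)
    sse_old = np.inf
    for _ in range(max_iter):
        fitted = g[0] * np.exp(g[1] * x)
        resid = y - fitted
        sse = float(np.sum(resid ** 2))          # should shrink each pass
        # Partial derivatives of the model w.r.t. gamma_0 and gamma_1:
        J = np.column_stack([np.exp(g[1] * x),
                             g[0] * x * np.exp(g[1] * x)])
        b, *_ = np.linalg.lstsq(J, resid, rcond=None)  # OLS increments
        g = g + b
        if abs(sse_old - sse) < tol:
            break
        sse_old = sse
    return g, sse
```

As in the text, a natural choice of starting values is the pair implied by an OLS fit to log y, which typically puts the algorithm close enough to the solution for rapid convergence.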
Estimates for the exponential model with additive errors, based on the Gauss–Newton procedure, are shown in the column "exponential model" in panel (a) of Table 5.5. They can be compared to those for the log y model by noting that the parameter in the log y model comparable to the intercept in the exponential model is exp(2.8678) = 17.5983. Although the intercepts are slightly different, the coefficient for X (diagnostic quiz score) is virtually the same in each model. $R^2$ for the exponential model is calculated as the square of the correlation between its fitted values ($\hat{y}$) and Y. Again, this is virtually identical to the $R^2$ for the log y model. Panel (b) of
the table shows the iteration history for the model. The initial estimates are in the
row labeled “iteration 0” and are just the parameter estimates from the log y model.
Convergence, in this case, was quite rapid, occurring in three iterations. Ratkowsky
(1990) observes that convergence to the least squares estimates usually occurs fairly
rapidly from reasonable starting values, especially for relatively simple models. In
fact, even if one mistakenly uses 2.8678 instead of exp(2.8678) as the starting value
for the intercept, convergence still occurs in five iterations. At any rate, all three
models for final exam score in this example appear to produce approximately the
same substantive conclusion regarding the impact of diagnostic quiz scores. In this
particular instance there is no special advantage to employing the nonlinear model.
However, in other cases, it may be the only suitable choice.
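The $R^2$ measure used above for the nonlinear model, the squared correlation between fitted and observed values, is straightforward to compute. A minimal sketch (with a hypothetical function name, not taken from the text):

```python
import numpy as np

def pseudo_r2(y, fitted):
    """R^2 for a nonlinear model, computed as the squared
    correlation between the observed response y and the model's
    fitted values, as described in the text."""
    return float(np.corrcoef(y, fitted)[0, 1] ** 2)
```

For a linear model fit by OLS this reduces to the usual $R^2$; for nonlinear models it serves as a comparable summary of fit.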
EXERCISES
5.1 Based on a probability sample of 680 married couples, Mirowsky (1985)
examined the relationship between depression (Y), a continuous scale ranging from 0 ("no depression") to 112 ("maximum depression"), and marital power