Now it is reasonable to suppose that λ lies between 0 and 1, in which case (1 – λ) will also lie between 0 and 1. Thus (1 – λ)^s becomes progressively smaller as s increases. Eventually there will be a point where the term β₂λ(1 – λ)^s X_{t-s} is so small that it can be neglected and we have a model in which all the variables are observable.
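To see how quickly these weights die away, here is a minimal sketch in Python (the value λ = 0.5 is chosen purely for illustration and is not taken from the text):

```python
# Koyck lag weights: lambda * (1 - lambda)**s for lags s = 0, 1, 2, ...
lam = 0.5  # illustrative value only

for s in range(10):
    weight = lam * (1 - lam) ** s
    print(f"lag {s}: weight = {weight:.6f}")
# By lag 9 the weight has fallen to about 0.001, so truncating the
# lag structure there would discard very little.
```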
A lag structure with geometrically declining weights, such as this one, is described as having a
Koyck distribution. As can be seen from (12.10), it is highly parsimonious in terms of its
parameterization, requiring only one parameter more than the static version. Since it is nonlinear in
the parameters, OLS should not be used to fit it, for two reasons. First, multicollinearity would almost
certainly make the estimates of the coefficients so erratic that they would be worthless – it is precisely
this problem that caused us to search for another way of specifying a lag structure. Second, the point
estimates of the coefficients would yield conflicting estimates of the parameters. For example,
suppose that the fitted relationship began

Ŷ_t = 101 + 0.60X_t + 0.45X_{t-1} + 0.20X_{t-2} + ...    (12.11)
Relating the theoretical coefficients of the current and lagged values of X in (12.10) to the estimates in (12.11), one has b₂l = 0.60, b₂l(1 – l) = 0.45, and b₂l(1 – l)^2 = 0.20, where b₂ and l are the estimates of β₂ and λ. From the first two you could infer that b₂ was equal to 2.40 and l was equal to 0.25 – but these values would conflict with the third equation and indeed with the equations for all the remaining coefficients in the regression.
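The conflict is easy to verify. The following sketch (plain Python; the coefficients are those of (12.11)) recovers l and b₂ from each consecutive pair of fitted coefficients and shows that the pairs disagree:

```python
# Each fitted coefficient should equal b2 * l * (1 - l)**s, so the
# ratio of consecutive coefficients should equal (1 - l).
c = [0.60, 0.45, 0.20]  # coefficients of X_t, X_{t-1}, X_{t-2} in (12.11)

for s in range(len(c) - 1):
    one_minus_l = c[s + 1] / c[s]
    l = 1 - one_minus_l
    b2 = c[s] / (l * one_minus_l ** s)
    print(f"pair ({s}, {s + 1}): l = {l:.2f}, b2 = {b2:.2f}")
# pair (0, 1): l = 0.25, b2 = 2.40
# pair (1, 2): l = 0.56, b2 = 1.82
```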
Instead, a nonlinear estimation technique should be used. Most major regression
applications have facilities for performing nonlinear regressions built into them. If your application
does not, you could fit the model using a grid search. It is worth describing this technique, despite the
fact that it is obsolete, because it makes it clear that the problem of multicollinearity has been solved.
We rewrite (12.10) as two equations:
Y_t = β₁ + β₂Z_t + u_t    (12.12)

Z_t = λX_t + λ(1 – λ)X_{t-1} + λ(1 – λ)^2 X_{t-2} + λ(1 – λ)^3 X_{t-3} + ...    (12.13)
The values of Z_t depend of course on the value of λ. You construct ten versions of the Z_t variable using the following values for λ: 0.1, 0.2, 0.3, ..., 1.0 and fit (12.12) with each of them. The version with the lowest residual sum of squares is by definition the least squares solution. Note that each regression involves a regression of Y on a single version of Z in a simple regression equation, and so the problem of multicollinearity has been completely eliminated.
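For concreteness, here is a sketch of the grid search in Python with NumPy, for a version of the model with a single explanatory variable. The arrays y and x (assumed to be NumPy arrays), the truncation point max_lag, and the function names are all assumptions of the sketch, not part of the original exercise:

```python
import numpy as np

def koyck_z(x, lam, max_lag=8):
    # Z_t = lam * sum_{s=0}^{max_lag} (1 - lam)**s * X_{t-s}, as in (12.13),
    # truncated after max_lag lagged values.
    n = len(x)
    z = np.full(n, np.nan)
    for t in range(max_lag, n):
        z[t] = lam * sum((1 - lam) ** s * x[t - s] for s in range(max_lag + 1))
    return z

def koyck_grid_search(y, x, grid=None, max_lag=8):
    # Fit (12.12) by OLS for each candidate lambda and keep the version
    # with the lowest residual sum of squares.
    if grid is None:
        grid = [round(0.1 * k, 1) for k in range(1, 11)]  # 0.1, 0.2, ..., 1.0
    best = None
    for lam in grid:
        z = koyck_z(x, lam, max_lag)
        keep = ~np.isnan(z)  # drop the first max_lag observations
        X = np.column_stack([np.ones(keep.sum()), z[keep]])
        beta, *_ = np.linalg.lstsq(X, y[keep], rcond=None)
        rss = float(np.sum((y[keep] - X @ beta) ** 2))
        if best is None or rss < best["rss"]:
            best = {"rss": rss, "lam": lam, "b1": beta[0], "b2": beta[1]}
    return best
```

Because each pass regresses Y on a single constructed variable Z, the multicollinearity among the lagged values of X never arises.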
Table 12.3 shows the parameter estimates and residual sums of squares for a grid search where the dependent variable was the logarithm of housing services and the explanatory variables were the logarithms of DPI and the relative price series for housing. Eight lagged values were used. You can see that the optimal value of λ is between 0.4 and 0.5, and that the income elasticity is about 1.13 and the price elasticity about –0.32. If we had wanted a more precise estimate of λ, we could have continued the grid search with steps of 0.01 over the range from 0.4 to 0.5. Note that the implicit income coefficient for X_{t-8}, β₂λ(1 – λ)^8, was about 1.13×0.5^9 = 0.0022. The corresponding price coefficient was even smaller. Hence in this case eight lags were more than sufficient.
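As a quick check on the truncation, the implicit coefficient of X_{t-8} quoted above can be reproduced directly (using the grid-search values β₂ ≈ 1.13 and λ = 0.5):

```python
b2, lam = 1.13, 0.5                 # estimates from the grid search
weight = b2 * lam * (1 - lam) ** 8  # implicit coefficient of X_{t-8}
print(round(weight, 4))             # 0.0022 - negligible, as claimed
```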