Procedures for Validating Growth and Yield Models
squares (OLS) estimator for β is obtained by minimizing the sum of squared errors.
This estimator (i.e. b = (X′X)⁻¹X′Y) is unbiased, consistent and efficient with respect
to the class of linear unbiased estimators. It is the best in the sense that it has the
minimum variance among all linear unbiased estimators (Judge et al., 1988).
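The estimator and the residual properties discussed in the text can be checked numerically. The following is a minimal sketch, assuming arbitrary simulated data, that solves the normal equations with NumPy (one of several equivalent ways to compute the OLS fit):

```python
import numpy as np

# Arbitrary illustrative data: y depends linearly on x plus noise
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)

# Design matrix X with an intercept column
X = np.column_stack([np.ones(n), x])

# OLS estimator b = (X'X)^{-1} X'Y; solving the normal equations
# is numerically safer than forming the inverse explicitly
b = np.linalg.solve(X.T @ X, X.T @ y)

e = y - X @ b            # residuals e_i = y_i - yhat_i
print(b)                 # estimated intercept and slope
print(e.sum())           # sum of residuals: zero up to rounding error
print((e ** 2).sum())    # sum of squared errors, minimized by b
```

Any perturbation of b increases the sum of squared errors, which is the defining property of the least squares solution.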
The regression fitted through OLS has a number of properties, including: (i) the
sum of the errors is zero, Σ(y_i – ŷ_i) = Σe_i = 0, where y_i is the ith observed value and ŷ_i
is its prediction from the fitted model (i = 1, 2, …, n); and (ii) the sum of the squared
errors is a minimum, Σ(y_i – ŷ_i)² = Σe_i² = min. Because Σe_i² is a minimum, many statistical
measures dependent on Σe_i², such as the R² (coefficient of determination) and
the mean square error (MSE), are the ‘best’ for the OLS fit. A minimum MSE implies
that the model gives the highest precision on the model-fitting data. This will tend
to underestimate the inherent variability in making predictions from the fitted
model. It is invariably observed that the fitted model does not predict new data as
well as it fits existing data (Picard and Cook, 1984; Reynolds and Chung, 1986; Burk,
1988; Neter et al., 1989), and the least squares errors from the model-fitting data are
the ‘best’ (i.e. smallest), and are generally smaller than the prediction errors from the
model validation data. In fact, the fitted model from OLS will probably fit the sample
data better than the true model would if it were known (Rawlings, 1988), and fit
statistics from the model-fitting data give an ‘over-optimistic’ assessment of the
model that is not likely to be achieved in real world applications (Reynolds and
Chung, 1986). Picard and Cook (1984) noted that ‘a naïve user could easily be misled
into believing that predictions are of higher quality than is actually the case’. The
understanding of this optimism principle is very important in model validation.
What it signifies is that a model can still be acceptable even though it is less accurate
and/or less precise on the validation data set.
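The optimism principle lends itself to a small simulation: a model fitted by OLS on one sample is evaluated on independent samples drawn from the same population, and on average its validation MSE exceeds its fitting MSE. The data-generating process below is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate(n):
    """Draw a sample from a hypothetical linear population y = 1 + 2x + noise."""
    x = rng.uniform(0, 10, n)
    X = np.column_stack([np.ones(n), x])
    y = 1.0 + 2.0 * x + rng.normal(0, 2.0, n)
    return X, y

fit_mses, val_mses = [], []
for _ in range(500):
    # Fit by OLS on one sample ...
    X_fit, y_fit = simulate(30)
    b = np.linalg.solve(X_fit.T @ X_fit, X_fit.T @ y_fit)
    fit_mses.append(np.mean((y_fit - X_fit @ b) ** 2))
    # ... then compute the prediction error on an independent validation sample
    X_val, y_val = simulate(30)
    val_mses.append(np.mean((y_val - X_val @ b) ** 2))

print(np.mean(fit_mses), np.mean(val_mses))  # validation MSE is larger on average
```

The gap between the two averages reflects exactly the over-optimistic assessment described above: the fitting data reward the coefficients tailored to them, while new data do not.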
Given a statistic such as Σe_i or MSE from the model-fitting data set, the optimism
principle implies that one should expect that a similar quantity from the validation
data set would be larger. However, if it is much larger, the fitted model is
likely to be inadequate for prediction because of a large prediction error and/or a
large variance. The relevant question becomes: how much larger can the statistic
from the validation data set be and still be considered acceptable? Many researchers
have considered such a question, but found there was really no simple answer
(Berk, 1984; Sargent, 1999).
Model Validation Procedures and Problems
In order to validate a growth and yield model, various procedures have been used.
Each of these procedures validates a part of a model, and each has limitations when
used in a segmental manner. The calls for a holistic approach are evident in Ek and
Monserud (1979), Buchman and Shifley (1983), Soares et al. (1995) and Vanclay and
Skovsgaard (1997). In general, the following procedures, each with its own shortcomings,
need to be considered when validating a growth and yield model.
Visual or graphical validity
Since growth and yield models often consist of many submodels composed of functions
describing different components, it is sometimes difficult or time-consuming, if
not impossible, to validate these models in an efficient manner. The difficulty is usually
caused by the enormous number of potential interactions among the submodels,
and the inability to judge the overall goodness of prediction of a model based on