6.2 THEORY OF NESTED SAMPLING AND ANALYSIS
The model of nested variation is based on the notion that a population can be
divided into classes at two or more categoric levels in a hierarchy. The
population can then be sampled with a multi-stage (multi-level) or nested
scheme to estimate the variance at each level. The population is divided initially
into classes at stage 1, and these are subdivided at stage 2 into subclasses,
which in turn can be subdivided further at stage 3 to give finer classes, and so
on, to form a nested or hierarchical classification with m stages. In each case
the class at the lower level is contained completely within the one immediately
above it, and each sampling point is contained in one and only one class at each
and every level. The system is a strict hierarchy, and a single observation
embodies variation contributed by each of the stages, including an unresolved
variance within the classes at the finest level of resolution. We can estimate
these contributions to the variance by a hierarchical analysis of variance
(ANOVA).
Youden and Mehlich (1937) saw that for an attribute distributed in space the
stages could be represented by a hierarchy corresponding to different distances.
They adapted classical multi-stage sampling so that each stage in the hierarchy
represented a distance between sampling points. They sampled at random, with
only the distances between pairs fixed, and so the random effects model, model
II of Marcuse (1949), is appropriate for the ANOVA.
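The idea of fixing only the distances can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of Youden and Mehlich's field procedure: the function name `nested_design`, the rule of replicating every existing point once per stage (two subdivisions at each stage), and the example distances are all assumptions made for the sketch.

```python
import math
import random


def nested_design(centres, distances, rng):
    """Return sampling points for a nested design (illustrative assumption).

    centres   -- list of (x, y) stage-1 centres
    distances -- fixed separating distances for the later stages,
                 ordered from coarsest to finest
    rng       -- a random.Random instance supplying the random bearings
    """
    points = list(centres)
    for d in distances:
        replicated = []
        for x, y in points:
            # Only the distance d is fixed; the direction is chosen at random.
            theta = rng.uniform(0.0, 2.0 * math.pi)
            replicated.append((x, y))
            replicated.append((x + d * math.cos(theta), y + d * math.sin(theta)))
        points = replicated
    return points


# Two centres and three further stages at 30, 10 and 3 units (hypothetical values).
layout = nested_design([(0.0, 0.0), (100.0, 0.0)], [30.0, 10.0, 3.0], random.Random(1))
```

The points are returned in hierarchical order, so consecutive blocks of the list correspond to the classes at each stage.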
For a design with m stages the data are modelled as
\[ Z_{ijk \ldots m} = \mu + A_i + B_{ij} + \cdots + \varepsilon_{ijk \ldots m}, \qquad (6.9) \]
where $Z_{ijk \ldots m}$ is the value of the mth unit in ..., in the kth class at
stage 3, in the jth class at stage 2, and in the ith class at stage 1. The
general mean is $\mu$; $A_i$ is the difference between $\mu$ and the mean of
class i in the first category; $B_{ij}$ is the difference between the mean of
the jth subclass in class i and the mean of class i; and so on. The final
quantity $\varepsilon_{ijk \ldots m}$ represents the deviation of the observed
value from its class mean at the last stage of subdivision. The quantities
$A_i, B_{ij}, C_{ijk}, \ldots, \varepsilon_{ijk \ldots m}$ are assumed to be
independent random variables associated with stages $1, 2, 3, \ldots, m$,
respectively, having means of zero and variances
$\sigma_1^2, \sigma_2^2, \sigma_3^2, \ldots, \sigma_m^2$. The latter are the
components of variance for the respective stages. They are estimated according
to the scheme in Table 6.4.
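To make the model of equation (6.9) concrete, the following sketch simulates a balanced design with three stages, in which the third stage carries the residual term. The model requires only that the effects are independent with zero means and the stated variances; drawing them from Gaussian distributions, the function name `simulate_nested` and the parameter values in the usage line are assumptions made for illustration.

```python
import random


def simulate_nested(n1, n2, n3, mu, sd1, sd2, sd3, rng):
    """Simulate data[i][j][k] = mu + A_i + B_ij + e_ijk for a balanced design.

    n1, n2, n3    -- numbers of subdivisions at stages 1, 2 and 3
    mu            -- general mean
    sd1, sd2, sd3 -- standard deviations of the stage effects
                     (Gaussian draws are an assumption of this sketch)
    rng           -- a random.Random instance
    """
    data = []
    for _ in range(n1):
        a = rng.gauss(0.0, sd1)          # effect of stage-1 class i
        stage1_class = []
        for _ in range(n2):
            b = rng.gauss(0.0, sd2)      # effect of subclass j within class i
            # Each unit adds its own residual deviation from the subclass mean.
            units = [mu + a + b + rng.gauss(0.0, sd3) for _ in range(n3)]
            stage1_class.append(units)
        data.append(stage1_class)
    return data


# Hypothetical values: 9 classes, 2 subclasses in each, 2 units in each subclass.
sample = simulate_nested(9, 2, 2, mu=50.0, sd1=4.0, sd2=2.0, sd3=1.0,
                         rng=random.Random(42))
```

Every value in one stage-1 class shares the same draw of `a`, which is how a single observation comes to embody variation from each stage.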
The quantities $n_1, n_2, n_3, \ldots, n_m$ in the table are the numbers of
subdivisions of each class at the several levels. If for each stage, say $j$,
$n_j$ is constant then the design is balanced. All the $n_j$,
$j = 1, 2, \ldots, m$, are known for any particular design, and so we can
determine the components of variance for all stages in the classification and
the residual variance from the right-hand column of Table 6.4.
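As an illustration of how the components are recovered, the sketch below treats a balanced design with three stages. Table 6.4 is not reproduced here, so the code assumes the usual expected mean squares for the balanced random effects model: $\sigma_3^2 + n_3\sigma_2^2 + n_2 n_3\sigma_1^2$ for stage 1, $\sigma_3^2 + n_3\sigma_2^2$ for stage 2, and $\sigma_3^2$ for the residual. The function name `variance_components` and the small data set are hypothetical.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def variance_components(data):
    """Estimate (s1, s2, s3) from balanced nested data[i][j][k].

    Assumes the standard balanced-case expected mean squares; each mean
    square is equated to its expectation and the equations are solved
    from the finest stage upwards.
    """
    n1, n2, n3 = len(data), len(data[0]), len(data[0][0])
    sub_means = [[mean(sub) for sub in cls] for cls in data]
    # Averaging subclass means gives the class mean only because the design is balanced.
    class_means = [mean(row) for row in sub_means]
    grand_mean = mean(class_means)

    ss1 = n2 * n3 * sum((c - grand_mean) ** 2 for c in class_means)
    ss2 = n3 * sum((s - c) ** 2
                   for row, c in zip(sub_means, class_means) for s in row)
    ss3 = sum((z - s) ** 2
              for cls, row in zip(data, sub_means)
              for sub, s in zip(cls, row)
              for z in sub)

    ms1 = ss1 / (n1 - 1)                 # between stage-1 classes
    ms2 = ss2 / (n1 * (n2 - 1))          # between subclasses within classes
    ms3 = ss3 / (n1 * n2 * (n3 - 1))     # residual, within subclasses

    return (ms1 - ms2) / (n2 * n3), (ms2 - ms3) / n3, ms3


# Hypothetical data with n1 = n2 = n3 = 2.
example = [[[1, 3], [5, 7]], [[11, 13], [15, 17]]]
print(variance_components(example))  # (46.0, 7.0, 2.0)
```

Here the mean squares are 200, 16 and 2, so the estimated components are (200 - 16)/4 = 46 for stage 1, (16 - 2)/2 = 7 for stage 2, and 2 for the residual.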