196 GEOMETRIC MORPHOMETRICS FOR BIOLOGISTS
random number from 1 to $N$ is generated by a random number generator. The
corresponding element from the original set of observations then forms the first element in the
bootstrap set. For example, given our 31 observations, we will construct a bootstrap set that
also has 31 observations. If the first number provided by the random number generator is 8,
we take the value of the eighth individual of our sample as the first value in the bootstrap
set. This procedure is repeated until the bootstrap set contains $N$ values. Note that a single
value from the original data set may appear multiple times in a bootstrap set, because we are
sampling with replacement: we do not remove an individual from the sample after we have
placed its value in the bootstrap set. Additionally, not all values in the original set need
appear in the bootstrap set.
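The drawing procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bootstrap_set`, the sample values and the seed are assumptions made for the example, not part of the text.

```python
import random


def bootstrap_set(observations, rng=None):
    """Return a bootstrap set: N values drawn with replacement from N observations."""
    if rng is None:
        rng = random.Random()
    n = len(observations)
    # Draw N ordinal positions, each independently chosen from 1 to N inclusive.
    positions = [rng.randint(1, n) for _ in range(n)]
    # Position k refers to the k-th observation (1-based), i.e. list index k - 1.
    # Nothing is removed from `observations`, so a value can be drawn repeatedly.
    return [observations[k - 1] for k in positions]


# Hypothetical observations, used only to exercise the function.
sample = [2, 2, 3, 4, 2, 5, 3, 2, 6, 2]
boot = bootstrap_set(sample, random.Random(0))
print(boot)
```

Because the draws are made with replacement, the result always has the same length as the original sample, but its composition changes from one call to the next unless the generator is seeded.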
To develop an understanding of how a bootstrap set is formed, we’ll consider an
abstract, symbolic example. Suppose C contains five values:
$$C = \{C_1, C_2, C_3, C_4, C_5\} \tag{8.7}$$
To form a bootstrap version of $C$, we generate a list of five random numbers, each
independently chosen and ranging from 1 to 5 (because $N = 5$):
$$L = \{5, 2, 4, 3, 5\} \tag{8.8}$$
The numbers in $L$ are the ordinal positions of the elements of $C$; $C_{\text{Bootstrap}}$ contains the
corresponding values of $C$ (e.g. $L_1 = 5$, so it corresponds to the fifth element of $C$, which
is $C_5$). Thus:
$$C_{\text{Bootstrap}} = \{C_5, C_2, C_4, C_3, C_5\} \tag{8.9}$$
Note that $C_1$ does not appear in this bootstrap set, while $C_5$ appears twice.
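The symbolic example can be checked with a short Python sketch. The string labels stand in for the values of $C$, and the variable names are chosen here for illustration only.

```python
from collections import Counter

# Labels standing in for the five values of C (Equation 8.7).
C = ["C1", "C2", "C3", "C4", "C5"]
# The list of ordinal positions from Equation 8.8.
L = [5, 2, 4, 3, 5]

# Ordinal positions are 1-based, Python list indices are 0-based.
C_bootstrap = [C[k - 1] for k in L]
print(C_bootstrap)  # ['C5', 'C2', 'C4', 'C3', 'C5']

# Which elements of C were never drawn, and how often each drawn element occurs.
missing = [c for c in C if c not in C_bootstrap]
counts = Counter(C_bootstrap)
print(missing)       # ['C1']
print(counts["C5"])  # 2
```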
Returning to the numerical example presented earlier:
$$X = \{2, 2, 3, 4, 2, 5, 3, 2, 6, 2, 3, 4, 6, 2, 1, 4, 3, 7, 2, 3, 4, 4, 5, 8, 5, 2, 1, 3, 4, 4, 3\} \tag{8.10}$$
To form a bootstrap set, $X_{\text{Boot}}$, from $X$, we generate the list, $B$, of 31 random numbers:
$$B = \{30, 8, 19, 16, 28, 24, 15, 1, 26, 14, 20, 25, 29, 23, 6, 13, 29, 29, 13, 28, 2, 11, 26, 1, 5, 7, 7, 19, 9, 7, 1\} \tag{8.11}$$
We then select the elements of X corresponding to those ordinal values:
$$X_{\text{Boot}} = \{4, 2, 2, 4, 3, 8, 1, 2, 2, 2, 3, 5, 4, 5, 5, 6, 4, 4, 6, 3, 2, 3, 2, 2, 2, 3, 3, 2, 6, 3, 2\} \tag{8.12}$$
The first element of $X_{\text{Boot}}$ is the 30th element of $X$ (because 30 is the first element of $B$),
and the seventh element of $X$ appears three times in the bootstrap set (because 7 appears
three times in $B$). We can now calculate the mean, standard deviation and median of $X_{\text{Boot}}$:
$\langle X_{\text{Boot}} \rangle = 3.39$, $\sigma_{X_{\text{Boot}}} = 1.62$, and $\text{median}(X_{\text{Boot}}) = 3$. The mean and
standard deviation are slightly different from those of the original distribution,
$\langle X \rangle = 3.52$ and $\sigma = 1.69$, while the median is unchanged, $\text{median}(X) = 3.0$.
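The numerical example can be reproduced with Python's standard `statistics` module. This is a sketch under one assumption: the text does not say which standard deviation estimator was used, so the sample form (denominator $N - 1$) is assumed here because it gives 1.69 for the original data.

```python
import statistics

# Original observations (Equation 8.10) and the list of random positions (Equation 8.11).
X = [2, 2, 3, 4, 2, 5, 3, 2, 6, 2, 3, 4, 6, 2, 1, 4, 3, 7, 2, 3, 4, 4, 5, 8,
     5, 2, 1, 3, 4, 4, 3]
B = [30, 8, 19, 16, 28, 24, 15, 1, 26, 14, 20, 25, 29, 23, 6, 13, 29, 29,
     13, 28, 2, 11, 26, 1, 5, 7, 7, 19, 9, 7, 1]

# Select the elements of X at the 1-based ordinal positions listed in B.
X_boot = [X[k - 1] for k in B]

# Summary statistics for the original data and for the bootstrap set.
# statistics.stdev uses the N - 1 denominator (assumed, see the note above).
for name, data in (("X", X), ("X_boot", X_boot)):
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    median = statistics.median(data)
    print(f"{name}: mean={mean:.2f} sd={sd:.2f} median={median}")
```

With the $N - 1$ denominator the bootstrap standard deviation evaluates to about 1.626, so it prints as 1.63 rather than the quoted 1.62; the difference is only in the final rounded digit.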
To arrive at an estimate of the confidence intervals for these statistics, we will compute
a large number ($N_{\text{Bootstrap}}$) of bootstrap sets. We will then determine the 95% confidence
interval over the $N_{\text{Bootstrap}}$ sets, forming a bootstrap estimate of the confidence intervals