MEDIANS AND RESAMPLING 137
10,000 resamples from the Early Classic site area batch from Table 10.1. The mean
and standard deviation would be poor indexes of center and spread for this batch, so
they would not provide us with a useful approach to unusualness within the batch.
The median of this batch of 10,000 resample medians, though, is 28.2 ha, the same
as the median of the original sample and a good index of the center of the special
batch as well.
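The resampling that produces such a special batch can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site areas below are hypothetical placeholders, not the values from Table 10.1, and the function name `resample_medians` and the fixed random seed are choices made for this example.

```python
import random
import statistics

def resample_medians(sample, n_resamples=10_000, seed=0):
    """Draw bootstrap resamples (with replacement, same size as the sample)
    and return the median of each resample."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    size = len(sample)
    return [
        statistics.median(rng.choices(sample, k=size))
        for _ in range(n_resamples)
    ]

# Hypothetical site areas in ha (NOT the Table 10.1 data).
site_areas = [12.4, 18.9, 22.0, 24.6, 26.1, 28.2, 30.5, 35.0, 41.7, 63.3, 112.8]

medians = resample_medians(site_areas)

# The median of the resample medians serves as the index of center
# for the special batch, as described above.
center = statistics.median(medians)
```

Because the hypothetical sample has an odd number of values, every resample median is itself one of the original values, which is part of why a batch of resample medians tends to be lumpy and far from normal in shape.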
We saw in Chapter 4 that percentiles, familiar to students from the reports of
standardized tests, are a way of characterizing unusualness, and it is percentiles
that provide the most useful way to approach unusualness in a very non-normal
batch like the one in Fig. 10.2. This special batch can be taken to represent the set of
medians of populations our sample might have come from. In order to find an error
range for, say, a 90% confidence level to attach to the median of 28.2 ha, we would
look in this batch of 10,000 resample medians for the 5th and 95th percentiles. That
is, the middle 90% of resample medians would represent the range within which
we would be 90% confident that the median of the population lies. We would, then,
want to find the number below which 5% of the medians fall, and the number above
which 5% of the medians fall, leaving 90% of the resample medians between these
two numbers. Since 5% of the 10,000 medians would be 500, we would want the
500th and 9,500th numbers in the batch (either counting up from the lowest or down
from the highest). For this special batch these two numbers are 24.6 and 35.0 ha.
Finally, then, we would estimate the median site area for the innumerably large
population of all Early Classic sites in our region as 28.2 ha. And we would be
90% confident that the median in this population lies between 24.6 ha and 35.0 ha.
As is usual with bootstrapped error ranges for the median, the error range is not
symmetrical. It runs from 3.6 ha below the median of 28.2 ha to 6.8 ha above it and
thus cannot be expressed as a ± figure. An error range for any particular confidence
level can be determined by selecting appropriate percentiles. An error range for the
95% confidence level lies between the 2.5th and 97.5th percentiles; for the 98%
confidence level, between the 1st and 99th percentiles; and for the 99% confidence
level, between the 0.5th and 99.5th percentiles.
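The rank-counting procedure described above can be written as a short Python function. This is a minimal sketch, not a standard routine: the name `percentile_error_range` is hypothetical, and the stand-in batch of the whole numbers 1 to 10,000 is used only so that the ranks being picked are easy to see.

```python
def percentile_error_range(resample_medians, confidence=0.90):
    """Return the values at the lower and upper percentiles that bracket
    the middle `confidence` share of a batch of resample medians."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    ordered = sorted(resample_medians)
    n = len(ordered)
    # Number of medians in each tail, e.g. 5% of 10,000 = 500 for 90%.
    k = max(1, round(n * (1 - confidence) / 2))
    # k-th and (n - k)-th numbers counting up from the lowest
    # (Python lists are indexed from 0, hence the "- 1").
    return ordered[k - 1], ordered[n - k - 1]

# Stand-in batch: each value equals its own rank, so the result shows
# which ranks are selected for each confidence level.
stand_in = list(range(1, 10_001))
range_90 = percentile_error_range(stand_in, 0.90)
range_95 = percentile_error_range(stand_in, 0.95)
range_99 = percentile_error_range(stand_in, 0.99)
```

With 10,000 resample medians, the 90% level picks the 500th and 9,500th numbers, the 95% level the 250th and 9,750th, and the 99% level the 50th and 9,950th.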
Statpacks
Resampling approaches like the bootstrap have been somewhat slow to appear
in statpacks, but they are becoming more common. Finding an error range
for the median with the bootstrap is still likely, however, to involve more than
simply selecting that option from a single menu. It may involve choosing an
option to perform resampling, selecting the bootstrap as the resampling technique
to be used, setting how many resamples are to be chosen (usually at least
1,000), and specifying that the median is the desired statistic. The statpack is
then likely to save the medians from all the resamples in a new data file, within
which you will need to find the appropriate percentiles to establish the size of
the error range for the desired confidence level.
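If the saved medians can be exported as plain text, the final percentile step might be done outside the statpack. The sketch below assumes a file with one median per line (an assumed format, not any particular statpack's output) and uses the `statistics.quantiles` function from Python's standard library. The function name `error_range_from_saved_medians` is hypothetical, and an in-memory stand-in takes the place of a real file. Library percentile routines interpolate between neighboring values, so results can differ slightly from simple rank counting.

```python
import io
import statistics

def error_range_from_saved_medians(lines, confidence_percent=90):
    """Read one resample median per line and return the percentile pair
    for a whole-number confidence level such as 90, 95, 98, or 99."""
    medians = [float(line) for line in lines if line.strip()]
    # 199 cut points at 0.5%, 1.0%, ..., 99.5% of the batch.
    cuts = statistics.quantiles(medians, n=200, method="inclusive")
    # Each tail holds (100 - confidence)/2 percent, i.e. this many
    # half-percent steps: 10 steps = 5% for the 90% level.
    steps = 100 - confidence_percent
    return cuts[steps - 1], cuts[-steps]

# Stand-in for a saved file: the numbers 1 to 10,000, one per line.
saved_file = io.StringIO("\n".join(str(v) for v in range(1, 10_001)))
low, high = error_range_from_saved_medians(saved_file, 90)
```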