13.3.2 A worked example of a box-and-whiskers plot
This example uses a sample with an odd number of values (n = 9): 1, 3, 4, 6,
7, 9, 10, 12, 25. The median of this sample is 7, so it is divided into two
groups where the lower group contains 1, 3, 4, 6 and 7, while the upper
group contains 7, 9, 10, 12 and 25. The median of the lower group is 4, which
becomes the lower quartile. The median of the upper group is 10, which
becomes the upper quartile. These two quartiles, called the hinges, mark
the ends of the box.
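The hinge calculation above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `hinges` is hypothetical, and the handling of samples with an even number of values (a plain split into two halves) is an assumption that goes beyond the odd-sized sample used in this example.

```python
from statistics import median

def hinges(values):
    """Return (lower hinge, median, upper hinge).

    For an odd number of values the median is included in both
    halves, as in the worked example. For an even number of values
    the data are simply split in two (an assumption, not from the text).
    """
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2                      # size of each half
    lower, upper = data[:half], data[n - half:]
    return median(lower), median(data), median(upper)

sample = [1, 3, 4, 6, 7, 9, 10, 12, 25]
lower_hinge, med, upper_hinge = hinges(sample)
print(lower_hinge, med, upper_hinge)         # 4 7 10
```

Note that statistical packages may use other quartile conventions, so their hinges can differ slightly from the values obtained by hand here.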
The interquartile range is 10 − 4 = 6 units. From this you can draw the
rectangular box in Figure 13.2(a). The maximum potential length of each
whisker is 1.5 times the interquartile range, which is 1.5 × 6 = 9 units
(Figure 13.2(b)). The lower whisker can therefore reach no lower than
4 − 9 = −5 and the upper whisker no higher than 10 + 9 = 19. Because each
whisker is drawn only to the most extreme value within its potential
range, the lower whisker extends down to 1 and the upper whisker extends
up to 12. The value of 25 lies beyond the upper limit of 19, so it is an
outlier and is indicated by an asterisk (Figure 13.2(c)).
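The whisker and outlier rules just described can be checked in the same way. The sketch below takes the hinges of 4 and 10 from the worked example as given; the function name `whiskers_and_outliers` and the parameter `k` are hypothetical names chosen for illustration.

```python
def whiskers_and_outliers(values, lower_hinge, upper_hinge, k=1.5):
    """Return (lower whisker end, upper whisker end, outliers)."""
    iqr = upper_hinge - lower_hinge          # interquartile range
    reach = k * iqr                          # maximum whisker length
    low_limit = lower_hinge - reach
    high_limit = upper_hinge + reach
    # Whiskers stop at the most extreme values inside the limits.
    inside = [v for v in values if low_limit <= v <= high_limit]
    outliers = [v for v in values if v < low_limit or v > high_limit]
    return min(inside), max(inside), outliers

sample = [1, 3, 4, 6, 7, 9, 10, 12, 25]
low, high, out = whiskers_and_outliers(sample, lower_hinge=4, upper_hinge=10)
print(low, high, out)                        # 1 12 [25]
```

With an interquartile range of 6, the limits are −5 and 19, so the whiskers end at 1 and 12 and only the value 25 is flagged.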
The shape of the box-and-whiskers plot indicates whether the distribution
is skewed. If the data are distributed symmetrically about the mean, the
median will be equidistant from the hinges and the whiskers will be of
similar length. As the distribution becomes increasingly skewed, the
median will lie closer to one hinge than the other and the whiskers will
become increasingly unequal in length (Figure 13.3).
Any values outside the range of the whiskers are called outliers
and should be scrutinized carefully. In some cases outliers are obvious
mistakes caused by incorrect data entry or recording, faulty equipment
or inappropriate methodology (e.g. a daily temperature of −80 °C or a
negative radiometric age date), in which case they can justifiably be
deleted. When outliers appear to be real, they are of great interest
because they may indicate that something unusual is occurring,
especially if they are present in some samples or treatments and not others.
Importantly, however, when there are outliers you should be cautious
about using a parametric test. One or two extreme values can greatly
affect the variance of a sample because the formula for the variance uses
the square of the difference between each value and the mean, so the