in Fig. 4.1). Since the bottom of the box represents the lower quartile, the position of
the defining line for low outliers is the lower quartile minus 1.5 times the midspread
(15.7 − 4.65 = 11.05 cm in the example in Fig. 4.1).
In the same way, the positions of the defining lines for far outliers can be
established mathematically. The line defining far high outliers is twice as far above
the upper quartile as the line defining high outliers. That is to say, instead of 1.5 times
the midspread beyond the quartiles, the far outlier defining lines fall at three times the
midspread beyond the quartiles (18.8 + 9.3 = 28.1 cm and 15.7 − 9.3 = 6.4 cm in
Fig. 4.1).
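
The arithmetic for all four defining lines can be sketched in a few lines of Python. This is only an illustration of the rule described above: the function name `outlier_fences` and the dictionary keys are made up for this example, and the quartiles (15.7 and 18.8 cm) are the ones read from the example in Fig. 4.1.

```python
def outlier_fences(lower_quartile, upper_quartile):
    """Return the midspread and the four defining lines described in the text.

    The name and the dictionary keys are hypothetical, chosen for this sketch.
    """
    midspread = upper_quartile - lower_quartile
    return {
        "midspread": midspread,
        # Outlier lines: 1.5 midspreads beyond each quartile.
        "low_outlier": lower_quartile - 1.5 * midspread,
        "high_outlier": upper_quartile + 1.5 * midspread,
        # Far outlier lines: 3 midspreads beyond each quartile.
        "far_low_outlier": lower_quartile - 3.0 * midspread,
        "far_high_outlier": upper_quartile + 3.0 * midspread,
    }


# Quartiles taken from the worked example in Fig. 4.1 (in cm).
fences = outlier_fences(15.7, 18.8)
for name, value in fences.items():
    print(f"{name}: {value:.2f} cm")
```

Because the far outlier lines use exactly twice the multiplier of the outlier lines, each one lies twice as far from its quartile as the corresponding outlier line.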
Thus the areas above and below the box in the box-and-dot plot are each divided
into three zones. Numbers that fall in the nearest zone above or below the box are
called adjacent values. These numbers are outside the central half of the batch but
are still considered part of the main bunch of numbers. In the next zone away from
the median come outliers, and in the farthest zone are far outliers. Ordinarily these
zones are not indicated by lines the way they are in Fig. 4.1. Instead, they are
distinguished by different symbols representing the numbers that fall in them. The highest
and lowest adjacent values are indicated with X’s, as shown in Fig. 4.1. These X’s,
then, represent the extremes of the main bunch of numbers (excluding all outliers).
Outliers are all indicated individually on the plot as hollow dots, and far outliers are
all indicated individually as solid dots. The batch represented in Fig. 4.1 has only
one outlier (8.4 cm) and no far outliers, so there is a single hollow dot and no solid
dots. These conventions about X’s, hollow dots, and solid dots stand in for the labels
and lines drawn to the right of Fig. 4.1, so such labels and defining lines do not
generally appear when box-and-dot plots are drawn. As is the case with rules of thumb,
the exact conventions used to indicate outliers and far outliers in box-and-dot plots
vary from one book or program to the next.
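
The zones can be expressed as a small classification routine. The sketch below is one possible reading of the rules above, not a standard implementation: the function names `classify` and `adjacent_extremes` are invented for illustration, a number falling exactly on a defining line is assigned to the nearer zone (the text does not settle this case), and every value in the sample batch except the 8.4 cm outlier is made up. The quartiles are taken from the Fig. 4.1 example rather than computed from the made-up batch.

```python
def classify(value, lower_quartile, upper_quartile):
    """Name the zone of the box-and-dot plot in which a number falls."""
    midspread = upper_quartile - lower_quartile
    if lower_quartile <= value <= upper_quartile:
        return "box"
    # Distance beyond whichever quartile the value lies outside of.
    if value > upper_quartile:
        distance = value - upper_quartile
    else:
        distance = lower_quartile - value
    # Assumption: a value exactly on a defining line stays in the nearer zone.
    if distance > 3.0 * midspread:
        return "far outlier"
    if distance > 1.5 * midspread:
        return "outlier"
    return "adjacent"


def adjacent_extremes(batch, lower_quartile, upper_quartile):
    """Return the lowest and highest numbers that are not outliers (the X's)."""
    main_bunch = [
        v for v in batch
        if classify(v, lower_quartile, upper_quartile) in ("box", "adjacent")
    ]
    return min(main_bunch), max(main_bunch)


# Hypothetical batch (cm); only 8.4 comes from the text.
batch = [8.4, 13.2, 15.9, 17.0, 18.1, 19.5, 21.3]
for v in batch:
    print(v, classify(v, 15.7, 18.8))
print("X's at:", adjacent_extremes(batch, 15.7, 18.8))
```

With the quartiles from Fig. 4.1, the 8.4 cm post hole lies more than 1.5 but less than 3 midspreads below the lower quartile, so it is classed as an outlier rather than a far outlier, in agreement with the single hollow dot on the plot.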
The box-and-dot plot makes it easy to compare several batches. In Chapter 1, we
compared the batch used for the example in Fig. 4.1 to another batch of post hole
diameters with a back-to-back stem-and-leaf plot (Table 1.7). Figure 4.2 compares
the same two batches with two box-and-dot plots instead. The box-and-dot plot for
post hole diameters at the Smith site is exactly the same as in Fig. 4.1 (except that it
is now on a longer scale). The box-and-dot plot for post hole diameters at the Black
site is made in exactly the same manner, but using the numbers listed in Table 1.1
for the Black site. The one extremely large post hole qualifies not only as an outlier,
but as a far outlier, since it lies more than three times the length of the box from the
box’s upper end. It is thus shown as a solid dot.
When we look at the box-and-dot plots in Fig. 4.2, we quickly reach the same
conclusion we reached looking at the back-to-back stem-and-leaf plot of these same
numbers in Table 1.7. At each site there is a post hole that does not seem to represent
the same kind of phenomenon as the rest of the post holes – an extremely large
post hole at the Black site and an extremely small post hole at the Smith site. In
general, post holes at the Smith site are larger than post holes at the Black site by a
margin of 5 or 6 cm. The box-and-dot plot shows us these patterns even more clearly
than the back-to-back stem-and-leaf plot because the box-and-dot plot is a simpler,
more quickly perceived way of representing the basic features of each batch. The