values ($X_1$, $X_2$, etc.) divided by the population size ($N$). The formula for the
mean is:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} \qquad (7.1)$$
This formula needs some explanation. It contains some common standard
abbreviations and symbols. First, the symbol Σ means “the sum of” and the
symbol $X_i$ means “all the X values specified by the restrictions listed below
and above the Σ symbol.” The lowest value of i is specified underneath Σ
(here it is 1, meaning the first value in the data set for the population) and
the highest is specified above Σ (here it is N, which is the last value in the
data set for the population). The horizontal line means that the quantity
above this line is divided by the quantity below. Therefore, you add up all
the values ($X_1$ to $X_N$) and then divide this number by the size of the
population (N).
(Some textbooks use Y instead of X. From Chapter 3 you will recall that
some data can be expressed as two-dimensional graphs with an X and Y
axis. Here we will use X and show distributions with a mean on the X axis,
but later in this book you will meet cases of data that can be thought of as
values of Y with distributions on the Y axis.)
As a quick example of the calculation of a mean, here is a population of
only four fossil snails (N = 4). The shell lengths in mm of these four individuals
($X_1$ through to $X_4$) are 6, 7, 9 and 10, so the mean, μ, is 32 ÷ 4 = 8 mm.
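The same calculation can be checked numerically. The following is a minimal Python sketch of the formula for the mean, not part of the original text; the function name `population_mean` and the variable `shell_lengths_mm` are illustrative assumptions.

```python
def population_mean(values):
    """Return the population mean: the sum of all values divided by N."""
    n = len(values)  # N, the population size
    if n == 0:
        raise ValueError("the population must contain at least one value")
    return sum(values) / n  # sum of X_1 to X_N, divided by N


# Shell lengths (mm) of the four fossil snails in the example above
shell_lengths_mm = [6, 7, 9, 10]
mu = population_mean(shell_lengths_mm)
print(mu)  # 8.0
```

The built-in `sum` plays the role of Σ, and `len` supplies N.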
7.3.2 The variance of a population
The mean describes the location of the center of the normal distribution, but
two populations can have the same mean and yet very different dispersions
around their means. For example, a population of four snail fossils with shell
lengths of 1, 2, 9 and 10 mm has the same mean (5.5 mm), but greater dispersion,
than another population of four with shell lengths of 5, 5, 6 and 6 mm.
There are several ways of indicating dispersion. The range, which is just
the difference between the lowest and highest value in the population, is
sometimes used. However, the variance, symbolized by the Greek $\sigma^2$,
provides a lot of information about the normal distribution that can be
used in statistical tests.
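To illustrate how two populations can share a mean yet differ in dispersion, here is a short Python sketch that computes the mean and the range for the two snail populations described in this section. The function name `mean_and_range` and the variable names are assumptions made for illustration only.

```python
def mean_and_range(values):
    """Return (mean, range), where range = highest value - lowest value."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return mean, value_range


# Shell lengths (mm) of the two populations of four fossil snails
population_a = [1, 2, 9, 10]
population_b = [5, 5, 6, 6]

print(mean_and_range(population_a))  # (5.5, 9)
print(mean_and_range(population_b))  # (5.5, 1)
```

Both populations have a mean of 5.5 mm, but the ranges (9 mm and 1 mm) show that the first is far more dispersed.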