As an example of how this is used, consider the need to choose a threshold such
that 95% of all pixels in a class will be classified (i.e. such that the 5% least likely
pixels for each spectral class will be rejected). $\chi^2$ tables show that 95% of all pixels
have $\chi^2$ values (in Fig. 8.2) less than 9.488. Thus, from (8.10)

$$T_i = -4.744 - \tfrac{1}{2}\ln|\Sigma_i| + \ln p(\omega_i)$$

which thus can be calculated from a knowledge of the prior probability and covariance
matrix of the ith spectral class.
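
The threshold calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: it assumes four spectral bands (9.488 is the 95% point of the chi-squared distribution with four degrees of freedom), uses only the standard library, and takes the log-determinant of the class covariance matrix as an input. The function names and the example covariance and prior values are hypothetical.

```python
import math


def chi2_cdf_even(x, dof):
    """Chi-squared CDF for a positive, even number of degrees of freedom.

    Uses the closed form 1 - exp(-x/2) * sum_{j<dof/2} (x/2)^j / j!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only handles positive even dof")
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_quantile_even(p, dof):
    """Find x such that CDF(x) = p by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < p:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def class_threshold(chi2_value, log_det_cov, prior):
    """T_i = -chi2/2 - (1/2) ln|Sigma_i| + ln p(omega_i)."""
    return -0.5 * chi2_value - 0.5 * log_det_cov + math.log(prior)


chi2_95 = chi2_quantile_even(0.95, dof=4)
print(round(chi2_95, 3))  # 9.488

# Hypothetical class: ln|Sigma_i| = 3.2 and prior probability 0.25.
print(class_threshold(chi2_95, log_det_cov=3.2, prior=0.25))
```

A pixel would then be rejected for class i whenever its discriminant function value falls below the threshold returned by `class_threshold`.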
8.2.6 Number of Training Pixels Required for Each Class
Sufficient training pixels for each spectral class must be available to allow reasonable
estimates to be obtained of the elements of the class conditional mean vector and
covariance matrix. For an N dimensional multispectral space the covariance matrix
is symmetric of size N × N. It has, therefore, $\tfrac{1}{2}N(N + 1)$ distinct elements that
need to be estimated from the training data. To avoid the matrix being singular, at
least N(N + 1) independent samples are needed. Fortunately, each N dimensional
pixel vector in fact contains N samples (one in each waveband); thus the minimum
number of independent training pixels required is (N + 1). Because of the difficulty in
assuring independence of the pixels, usually many more than this minimum number
are selected. Swain and Davis (1978) recommend as a practical minimum that 10N
training pixels per spectral class be used, with as many as 100N per class if possible.
For data with low dimensionality (say up to 5 or 6 bands) those numbers can usually
be achieved, but for hyperspectral data sets finding enough training pixels per class
is extremely difficult. Section 13.5 considers this problem in some detail.
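
To make the arithmetic concrete, the following sketch tabulates the quantities discussed above for a given number of bands. The function name and dictionary keys are illustrative, and the 200-band case is simply an assumed stand-in for a hyperspectral sensor.

```python
def training_pixel_guidelines(n_bands):
    """Per-class training pixel counts for an N-band data set."""
    if n_bands < 1:
        raise ValueError("n_bands must be at least 1")
    return {
        # Distinct elements of a symmetric N x N covariance matrix.
        "distinct_covariance_elements": n_bands * (n_bands + 1) // 2,
        # Theoretical minimum number of independent pixels, N + 1.
        "theoretical_minimum": n_bands + 1,
        # Swain and Davis (1978): practical minimum and preferred count.
        "practical_minimum": 10 * n_bands,
        "preferred": 100 * n_bands,
    }


# Four-band multispectral data versus an assumed 200-band hyperspectral set.
for n_bands in (4, 200):
    print(n_bands, training_pixel_guidelines(n_bands))
```

For four bands this gives 10 distinct covariance elements, a theoretical minimum of 5 pixels, and 40 to 400 pixels per class in practice; for 200 bands the preferred count rises to 20,000 pixels per class, which shows why hyperspectral training is so demanding.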
8.2.7 A Simple Illustration
As an example of the use of maximum likelihood classification, the segment of
Landsat multispectral scanner image shown in Fig. 8.3 is chosen. This is a 256 × 276
pixel array of image data in which four broad ground cover types are evident. These
are water, fire burn, vegetation and “developed” land (urban). Suppose we want to
produce a thematic map of these four cover types in order to enable the area and
extent of the fire burn to be evaluated.
The first step is to choose training data. For such a broad classification, suitable
sets of training pixels for each of the four classes are easily identified visually in the
image data. Figure 8.3 also shows the locations of four training fields used for this
purpose. Sometimes, to obtain a good estimate of class statistics it may be necessary
to choose several training fields for the one cover type, located in different regions
of the image.
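
As a rough sketch of how class statistics might be estimated once training fields have been chosen, the code below pools the pixels from several fields of one cover type and computes the mean vector and covariance matrix. It assumes the unbiased (n − 1) covariance estimator, and the two-band pixel values are invented purely for illustration; they are not taken from the image in Fig. 8.3.

```python
def class_statistics(pixels):
    """Return (mean vector, covariance matrix) for a list of pixel vectors."""
    n = len(pixels)
    if n < 2:
        raise ValueError("at least two training pixels are required")
    dims = len(pixels[0])
    mean = [sum(p[b] for p in pixels) / n for b in range(dims)]
    # Unbiased sample covariance (divides by n - 1); an assumption here.
    cov = [
        [
            sum((p[r] - mean[r]) * (p[c] - mean[c]) for p in pixels) / (n - 1)
            for c in range(dims)
        ]
        for r in range(dims)
    ]
    return mean, cov


# Two hypothetical training fields for the same cover type (made-up values).
field_a = [(1.0, 2.0), (3.0, 6.0)]
field_b = [(5.0, 4.0)]

# Pool the fields before estimating the class signature.
mean, cov = class_statistics(field_a + field_b)
print(mean)  # [3.0, 4.0]
print(cov)   # [[4.0, 2.0], [2.0, 4.0]]
```

Pooling fields from different regions of the image in this way lets the estimated signature reflect the spread of the cover type across the scene rather than the conditions at a single site.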
The four-band signatures for each of the four classes, as obtained from the training
fields, are given in Table 8.1. The mean vectors can be seen to agree generally