10.2.4.3 Use of Transformed Divergence in Clustering
One of the last stages in a practical clustering algorithm is to evaluate the sizes and
relative locations of the clusters produced, as noted in Chap. 9. If clusters are too close
to each other they should be merged. The information in Fig. 10.5 allows merging to be
based upon a pre-specified transformed divergence, since both cluster mean and
covariance data are normally available. Once a desired accuracy level (in fact an upper
bound) has been established for the subsequent classification, and the corresponding
value of transformed divergence determined, clusters with separabilities less than this
value should be merged.
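
The merging test can be illustrated with a short sketch. It assumes the usual
divergence between two multivariate normal distributions and a transformed divergence
of the form d_T = 2(1 - exp(-d/8)), which saturates at 2; these are taken to agree with
the definitions of Sect. 10.2, and some texts scale the measure differently. The function
names, the cluster statistics and the threshold of 1.9 are hypothetical. In practice the
threshold would be read from Fig. 10.5 for the chosen accuracy level.

```python
import numpy as np


def divergence(m_i, cov_i, m_j, cov_j):
    """Divergence between two multivariate normal cluster models."""
    inv_i = np.linalg.inv(cov_i)
    inv_j = np.linalg.inv(cov_j)
    dm = (m_i - m_j).reshape(-1, 1)  # column vector of mean differences
    cov_term = 0.5 * np.trace((cov_i - cov_j) @ (inv_j - inv_i))
    mean_term = 0.5 * np.trace((inv_i + inv_j) @ (dm @ dm.T))
    return cov_term + mean_term


def transformed_divergence(m_i, cov_i, m_j, cov_j):
    """Assumed form d_T = 2(1 - exp(-d/8)), bounded above by 2."""
    return 2.0 * (1.0 - np.exp(-divergence(m_i, cov_i, m_j, cov_j) / 8.0))


def pairs_to_merge(means, covs, threshold):
    """Return (i, j, d_T) for every cluster pair with d_T below the threshold."""
    pairs = []
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            d_t = transformed_divergence(means[i], covs[i], means[j], covs[j])
            if d_t < threshold:
                pairs.append((i, j, d_t))
    return pairs


# Hypothetical two-band cluster statistics: clusters 0 and 1 nearly coincide.
means = [np.array([30.0, 40.0]), np.array([31.0, 41.0]), np.array([80.0, 90.0])]
covs = [4.0 * np.eye(2), 4.0 * np.eye(2), 9.0 * np.eye(2)]

# The threshold of 1.9 is illustrative only, not a value taken from Fig. 10.5.
candidates = pairs_to_merge(means, covs, threshold=1.9)
for i, j, d_t in candidates:
    print(f"clusters {i} and {j}: transformed divergence {d_t:.3f}")
```

With these illustrative statistics only clusters 0 and 1 fall below the threshold. How
the merged cluster's mean and covariance are then recomputed is left to the clustering
algorithm.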
10.3 Separability Measures for Minimum Distance Classification
The separability measures of Sect. 10.2 relate to spectral classes modelled by
multivariate normal distributions, in preparation for maximum likelihood classification.
Should another classifier be used, this procedure is unduly complex and largely without
meaning. For example, if supervised classification is to be carried out using the
minimum distance to class means technique, there is no advantage in using
distribution-based separability measures, since probability distribution class models
are not employed. Instead it is better to use a simple measure consistent with the
nature of the classification algorithm. For minimum distance classification this would
be a distance measure, computed according to the particular distance metric in use.
Commonly this is Euclidean distance. Consequently, when a set of spectral classes
has been determined, ready for the classification step, the complete set of pairwise
Euclidean distances between class means will provide an indication of class
similarities. Unfortunately these distances cannot be related to a probability of
misclassification, but they are useful as an indicator of which pairs of classes could
be merged, if so desired.
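
As a minimal sketch of this idea, the following computes the complete set of pairwise
Euclidean distances between class means and reports the closest pair. The class names
and three-band mean vectors are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical class means in three spectral bands.
class_means = {
    "water": (12.0, 8.0, 5.0),
    "soil": (60.0, 72.0, 80.0),
    "dry grass": (62.0, 70.0, 83.0),
}


def pairwise_distances(means):
    """Euclidean distance between the mean vectors of every pair of classes."""
    return {(a, b): dist(means[a], means[b]) for a, b in combinations(means, 2)}


distances = pairwise_distances(class_means)

# The pair with the smallest distance is the most similar, and so the first
# candidate for merging.
closest_pair = min(distances, key=distances.get)

for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{pair[0]} / {pair[1]}: {d:.2f}")
```

The distances only rank the class pairs by similarity. Whether a given pair is close
enough to merge remains a judgement for the analyst, since no error probability is
attached to the values.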
10.4 Feature Reduction by Data Transformation
The emphasis of the preceding sections has been feature selection, that is, an evaluation
of the existing set of features for the pixel data in multispectral imagery with a view
to selecting the most discriminating and discarding the rest. It is also possible to
effect feature reduction by transforming the data to a new set of axes in which
separability is higher in a subset of the transformed features than in any subset of
the original data. The remaining transformed features can then be discarded. A number
of image transformations could be considered for this; however, those most commonly
encountered in remote sensing are the principal components or Karhunen-Loève
transform and the transformation associated with so-called canonical analysis. These
are treated in the following.