Russ J.C. Image Analysis of Food Microstructure

All image analysis systems incorporate some form of thresholding (also sometimes called segmentation, although that term will be reserved here for separation of touching features). The most basic form of thresholding is simply the manual setting of a brightness level to distinguish brighter from darker pixels. As shown in the example of Figure 3.50, this is usually done by reference to the image histogram. The problem is that the histogram contains very little information to help adjust the setting properly.

One of the very basic structural relationships presented in the chapter on stereology is that the area fraction of a phase or structure measures the volume fraction. In the example, the histogram represents only the pixels in the meat, excluding the black background. The broad peak represents the dark meat and the wide flat shelf represents the bright fat and bone. It is not obvious where the threshold should be positioned. Threshold settings in the range from about 105 to 130 (shown on the histogram) all produce visually plausible binary images, but the area fraction measured for the meat ranges from 71 to 80%. Using an automatic setting of 110 based on a statistical test (discussed below) produces a measurement of 73% meat, which corresponds closely to the value reported by physical tests.
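The sensitivity of the measured area fraction to the threshold setting can be illustrated with a minimal sketch. The pixel data below are synthetic (a broad dark peak plus a flat bright shelf, chosen only to resemble the histogram shape described), and the function name `area_fraction_below` and its optional `mask` argument for excluding background are hypothetical, not taken from any particular software.

```python
import numpy as np

def area_fraction_below(image, threshold, mask=None):
    """Fraction of the selected pixels that are darker than the threshold.

    mask (optional boolean array) restricts the count to the sample,
    for example to exclude a black background.
    """
    pixels = image[mask] if mask is not None else image.ravel()
    return float(np.mean(pixels < threshold))

# Synthetic stand-in for the histogram described in the text:
# a broad dark peak and a wide flat shelf of brighter values.
rng = np.random.default_rng(0)
dark = rng.normal(80, 20, 7500)
bright = rng.uniform(100, 255, 2500)
pixels = np.clip(np.concatenate([dark, bright]), 0, 255).astype(np.uint8)

# Area fraction of the dark phase for several plausible settings.
fractions = {t: area_fraction_below(pixels, t) for t in range(105, 131, 5)}
for t, f in fractions.items():
    print(f"threshold {t}: dark fraction {f:.1%}")
```

Every setting in the loop gives a plausible-looking split, yet the reported fraction rises steadily with the threshold, which is the difficulty the text describes.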

FIGURE 3.50 Example of threshold setting: (a) scanned image; (b) histogram showing a range of settings from 105 to 130 (on a brightness scale of 0 to 255); (c) thresholded binary image of fat produced by a setting of 110.

2241_C03.fm Page 185 Thursday, April 28, 2005 10:28 AM

In the more general case, there are two threshold values that can be set to select a range of brightness values. In many real situations, as shown in Figure 3.51, it is the background that is the most uniform and can be selected, while the features include both brighter and darker values. The histogram shows a peak for the grey background around the emulsified fat droplets in this image of milk, but as before there are no obvious places to set the threshold values. Also note that the resulting thresholded image has many of the droplets touching each other. Segmentation of touching features after thresholding is discussed in the next chapter.
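Selecting a brightness range and then inverting it to obtain the features, as described for the milk image, might be sketched as follows. The tiny array and the limits 110 and 130 are illustrative assumptions, and `threshold_range` is a hypothetical helper name.

```python
import numpy as np

def threshold_range(image, lower, upper):
    """Boolean mask of pixels with lower <= brightness <= upper."""
    return (image >= lower) & (image <= upper)

# Toy image: a uniform grey background (about 120) with three
# features, two darker and one brighter than the background.
image = np.array([[ 20, 120, 125],
                  [118, 240, 122],
                  [121, 119,  30]], dtype=np.uint8)

background = threshold_range(image, 110, 130)  # select the uniform background
features = ~background                         # invert to keep the features
print(features.astype(int))
```

Because the background is selected rather than the features, both the darker and the brighter pixels end up in the same binary result.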

The examples of thresholding in textbooks often show an idealized histogram of the type in Figure 3.52, with well separated peaks that invite the setting of thresholds at the minima in the valleys between the peaks, or sometimes halfway between the peaks. Sometimes these methods work for materials samples, or for quality control applications where the goal is reproducibility rather than accuracy, but they are fundamentally flawed for several reasons. First, few images of food or other organic material produce such simple multimodal histograms. Usually the appearance is more like the examples just shown, with poorly defined peaks (or no peaks at all).

FIGURE 3.51 Thresholding using two levels and the uniformity of the background: (a) original (milk, showing emulsified fat droplets; courtesy of Ken Baker, Ken Baker Associates); (b) thresholded binary image inverted to show the droplets; (c) histogram with the upper and lower threshold settings spanning the grey values for the background.

[Histogram axes: Pixel Brightness Value (0, 64, 128, 192, 255) against Frequency]

Second, the midpoint between peaks has no particular meaning, because as noted in the preceding chapter the acquisition device may be linear, or logarithmic, or have some other output characteristic. Third, the minimum point is not stable but moves around. Consider the case of a two-phase structure with well-defined peaks, as indicated in Figure 3.53. Without changing the illumination or camera settings, simply scan the field of view to a region of the sample where the area fractions are different. As shown in the example, the peak heights will change, and with them the location of the minimum. There is no reason for the correct threshold to vary in such a case, so clearly using the minimum is not a correct strategy.
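The instability of the valley position can be checked numerically. The sketch below models the histogram as the sum of two Gaussian peaks with fixed positions and widths; the means (80 and 170), the common standard deviation (25), and the function name `valley_position` are assumptions chosen for illustration, not values from the figure.

```python
import numpy as np

def valley_position(fraction_dark, mean_dark=80.0, mean_bright=170.0, sigma=25.0):
    """Grey level of the histogram minimum between two Gaussian peaks.

    Only the relative peak heights change with fraction_dark; the peak
    positions and widths stay fixed.
    """
    levels = np.arange(256)

    def peak(mean):
        return np.exp(-0.5 * ((levels - mean) / sigma) ** 2)

    histogram = fraction_dark * peak(mean_dark) + (1 - fraction_dark) * peak(mean_bright)
    lo, hi = int(mean_dark), int(mean_bright)
    # Search for the minimum only between the two peak positions.
    return lo + int(np.argmin(histogram[lo:hi + 1]))

valley_equal = valley_position(0.5)   # equal area fractions
valley_skewed = valley_position(0.8)  # 80% dark phase, 20% bright phase
print(valley_equal, valley_skewed)
```

With equal area fractions the minimum sits midway between the peaks; increasing the dark fraction pushes it toward the smaller peak even though neither peak has moved.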

FIGURE 3.52 Example of a micrograph of a metal alloy with three well-defined and homogeneous phases, and its histogram showing the three corresponding, well-separated peaks. Most organic materials do not have such simple brightness distributions.

AUTOMATIC THRESHOLD SETTINGS USING THE HISTOGRAM

Manual setting of threshold levels is usually accomplished by adjusting sliders on the histogram while visually observing the image, where some sort of preview shows which pixels have been selected. Moving the sliders until the result looks right is difficult, particularly when the image is large and must be scrolled to see various regions. More important, what looks right to one person may not to another, or even to the same person on a different day or if the image is rotated. Thresholding images is the step where most image measurement errors arise, because of inconsistent human judgment. Consequently, there has been a continuing effort to find algorithms that can decide where the threshold values should be set.

All of these techniques depend upon knowing something about the image, how the sample was prepared and the image acquired, and what structures or features should be present. This information is presumably what the user relies upon in deciding that the manual settings look right, but it must be made explicit in order to determine the criteria which an automatic procedure can apply.

FIGURE 3.53 Changing the relative peak heights of two structures by varying their area fraction as described in the text. The peak positions do not change, and the correct threshold should not change, but the minimum between them shifts by more than 10 brightness levels.

The greatest number of automatic thresholding techniques, and certainly the most widely used methods, apply to the specific case of printed text on paper (usually thresholding in this circumstance is a precursor to character recognition and conversion of the image to a text file). The independent knowledge about the image in this case is that the image consists ideally of just two kinds of pixels: ones that are generally bright and correspond to the paper, and ones that are darker and correspond to the ink. With that criterion, a variety of statistical classification methods have been developed.

FIGURE 3.54 Example of print on paper: (a) original; (b) histogram with threshold values selected by the Trussell (T) and Shannon (S) algorithms; (c) Trussell result (detail); (d) Shannon result (detail).

Figure 3.54 shows two of the more generally successful and widely used approaches. Both assume that the histogram actually consists of two distributions, which may be somewhat overlapped (some ink pixels may be brighter than some paper pixels) but which can be best separated by setting a single threshold value. The definition of best is a function of what statistical test is applied. For example, the Trussell method uses the statistician's t-test to compare the two populations divided by every possible threshold setting to calculate the t-statistic. This is a function of the mean values, standard deviations, and number of pixels in each segment of the histogram. When the t-statistic is maximum the probability that the two populations are different is greatest, so that is the threshold value used.
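The search over every possible threshold described for the t-test approach can be sketched as below. This is a minimal illustration, not the book's implementation: it assumes an 8-bit image, uses the pooled-variance (Student) form of the t-statistic, which the text does not specify, and tests it on synthetic "ink" and "paper" populations. The function name `t_statistic_threshold` is hypothetical.

```python
import numpy as np

def t_statistic_threshold(image):
    """Grey level that maximises the pooled two-sample t-statistic.

    Pixels <= level form one population and pixels > level the other.
    An 8-bit image is assumed (an assumption of this sketch).
    """
    hist = np.bincount(np.asarray(image, dtype=np.uint8).ravel(),
                       minlength=256).astype(float)
    levels = np.arange(256, dtype=float)
    best_t, best_level = -np.inf, None
    for level in range(255):
        low_h, high_h = hist[:level + 1], hist[level + 1:]
        low_g, high_g = levels[:level + 1], levels[level + 1:]
        n0, n1 = low_h.sum(), high_h.sum()
        if n0 < 1 or n1 < 1 or n0 + n1 < 3:
            continue  # both segments must contain pixels
        m0 = (low_h * low_g).sum() / n0
        m1 = (high_h * high_g).sum() / n1
        # Pooled variance from the within-segment sums of squares.
        ss = (low_h * (low_g - m0) ** 2).sum() + (high_h * (high_g - m1) ** 2).sum()
        pooled_var = ss / (n0 + n1 - 2)
        if pooled_var == 0:
            return level  # two perfectly uniform populations
        t_stat = (m1 - m0) / np.sqrt(pooled_var * (1 / n0 + 1 / n1))
        if t_stat > best_t:
            best_t, best_level = t_stat, level
    return best_level

# Synthetic two-population data: 30% dark "ink", 70% bright "paper".
rng = np.random.default_rng(1)
ink = rng.normal(60, 10, 3000)
paper = rng.normal(190, 15, 7000)
image = np.clip(np.concatenate([ink, paper]), 0, 255).astype(np.uint8)

level = t_statistic_threshold(image)
paper_fraction = float(np.mean(image > level))
print(level, paper_fraction)
```

With the pooled-variance form, maximizing the t-statistic is algebraically equivalent to maximizing the ratio of between-class to within-class variance, so this sketch selects the same level as that familiar criterion; a separate-variance form of the test could select a different one.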

The Trussell method works quite well for most text-reading applications, and in fact it is generally applied to many situations where it is known beforehand (or assumed) that there are just two populations of pixels present. That is often the case for stained tissue, for example (stained vs. not stained). It may also apply to porous material (solid or void), and a variety of meat and vegetable products. It is surprising that it works so well because one of the underlying assumptions in the t-test is that the populations of pixels have brightness values that have a normal (or Gaussian) distribution, so that the mean and standard deviation fully characterize the data. Few real images (even ones of print on paper) present histograms consisting of Gaussian peaks. Note in the figure that the histogram has one more-or-less symmetrical peak for the white paper pixels, whereas the darker ink pixels do not produce a peak at all, but rather a broad sloping shelf with no obvious place to position a threshold.

There are a variety of nonparametric statistical tests that can be applied to the two-population model that do not make an assumption of normality, and most of them have been used to program other threshold-selection algorithms. The Shannon method, for example, calculates the entropy for the two populations that are treated as fuzzy sets to determine the threshold setting that minimizes the uncertainty of the setting. This method selects values that are typically slightly different from those of the Trussell method, but also generally satisfactory. All of the examples of thresholding based on the histogram that follow in this text use either the Trussell or Shannon method. Many of the other methods (whose mathematical algorithms are summarized nicely in J. R. Parker, Algorithms for Image Processing and Computer Vision, John Wiley & Sons, New York, 1997) also work well for the print on paper application, but sometimes produce quite bizarre results on other images.
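As a rough illustration of an entropy criterion applied to two populations treated as fuzzy sets, the sketch below uses a fuzzy-entropy measure in the style of Huang and Wang: each pixel receives a membership value that falls off with its distance from the mean of its own class, and the threshold minimizing the average Shannon entropy of those memberships is chosen. Whether this matches the Shannon method as implemented for the book's figures is an assumption; the function name `fuzzy_entropy_threshold` and the synthetic data are hypothetical, and a non-empty 8-bit image is assumed.

```python
import numpy as np

def fuzzy_entropy_threshold(image):
    """Grey level minimising the mean fuzzy (Shannon) entropy of memberships."""
    hist = np.bincount(np.asarray(image, dtype=np.uint8).ravel(),
                       minlength=256).astype(float)
    levels = np.arange(256, dtype=float)
    occupied = np.nonzero(hist)[0]
    first, last = int(occupied[0]), int(occupied[-1])
    if first == last:
        return first  # uniform image: nothing to separate
    span = float(last - first)
    best_entropy, best_level = np.inf, first
    for level in range(first, last):
        low, high = slice(0, level + 1), slice(level + 1, 256)
        m0 = (hist[low] * levels[low]).sum() / hist[low].sum()
        m1 = (hist[high] * levels[high]).sum() / hist[high].sum()
        # Membership is 1 at the class mean and decreases with distance.
        class_mean = np.where(levels <= level, m0, m1)
        mu = 1.0 / (1.0 + np.abs(levels - class_mean) / span)
        mu = np.clip(mu, 1e-12, 1.0 - 1e-12)  # keep the logarithms finite
        fuzziness = -mu * np.log(mu) - (1.0 - mu) * np.log(1.0 - mu)
        entropy = (hist * fuzziness).sum() / hist.sum()
        if entropy < best_entropy:
            best_entropy, best_level = entropy, level
    return best_level

# Synthetic two-population data: 30% dark, 70% bright.
rng = np.random.default_rng(2)
dark = rng.normal(60, 10, 3000)
bright = rng.normal(190, 15, 7000)
image = np.clip(np.concatenate([dark, bright]), 0, 255).astype(np.uint8)

level = fuzzy_entropy_threshold(image)
bright_fraction = float(np.mean(image > level))
print(level, bright_fraction)
```

No assumption of Gaussian peaks enters the criterion; only the class means and the overall brightness span are used.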

Chemical staining of tissue or food samples often produces images that can be approximated as having two populations of pixels (with and without the stain). Consequently, the automatic methods just described are often useful for thresholding these as well. In Figure 3.55 the fat droplets in rhodamine-stained mayonnaise appear as dark circles. Automatic thresholding distinguishes them from their surroundings. In order to measure their size distribution it is necessary to separate the touching features. These topics are covered in subsequent chapters.

FIGURE 3.55 Stained mayonnaise: (a) original (CSLM image courtesy of Anke Janssen, ATO B.V. Food Structure and Technology); (b) after automatic thresholding.

FIGURE 3.56 Stained custard: (a) original (see color insert following page 150; CSLM image courtesy of Anke Janssen, ATO B.V. Food Structure and Technology); (b) red channel selected for thresholding.

In many cases, color images produced by chemical staining can be reduced to a two-population problem, so that automatic thresholding can be applied, by first separating the image into appropriate color channels. For example, the size distribution and clustering of the Nile red stained fat in the custard in Figure 3.56 can be measured after selecting the red channel for thresholding. The starch granules and cells in the light micrograph of a stained section of potato (Figure 3.57) can each be isolated in the green and red channels, respectively.
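Separating a color image into channels before thresholding might look like the sketch below. The four-pixel image, the fixed level of 150, and the helper name `split_channels` are illustrative assumptions; in practice the level for each channel would come from an automatic method rather than a constant.

```python
import numpy as np

def split_channels(rgb):
    """Return the red, green and blue planes of an H x W x 3 array."""
    return rgb[..., 0], rgb[..., 1], rgb[..., 2]

# Toy image: one strongly red pixel, one strongly green pixel, two grey pixels.
rgb = np.array([[[200,  40,  40], [ 40, 200,  40]],
                [[ 90,  90,  90], [100, 100, 100]]], dtype=np.uint8)

red, green, blue = split_channels(rgb)

# Each channel is now a grey scale image with (ideally) two populations.
red_mask = red > 150      # illustrative fixed level
green_mask = green > 150  # illustrative fixed level
print(red_mask.astype(int))
print(green_mask.astype(int))
```

Each structure is picked out in the channel where its stain is brightest, while the grey pixels are rejected in both.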

Although RGB channels can often be used for this purpose, in general it is HSI space that corresponds better to the discrimination of structures produced by chemical staining. The color of the stain is represented by the hue value, the amount of stain by the saturation, and the density of the tissue by the intensity. In Figure 3.57(d), isolating the hue channel produces a grey scale image with much cleaner depiction of the starch granules than one of the RGB color channels. This is especially true when two different color stains have been applied. In Figure 3.58 an H&E stain produces two colors (orange and cyan) that are easily distinguished and automatically thresholded in the hue channel using the two-population assumption.
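Extracting a hue channel for thresholding requires converting RGB values to HSI. The sketch below uses one common definition of HSI (the arccosine form found in many image-processing texts); the definition used to produce the book's figures is not stated, so this choice is an assumption, as is the function name `rgb_to_hsi`. Hue is returned in degrees and is set to 0 for grey pixels, where it is undefined.

```python
import numpy as np

def rgb_to_hsi(rgb):
    """Convert an H x W x 3 array of 8-bit RGB values to hue, saturation, intensity."""
    rgb = np.asarray(rgb, dtype=float) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    intensity = (r + g + b) / 3.0
    minimum = np.minimum(np.minimum(r, g), b)
    saturation = np.where(intensity > 0,
                          1.0 - minimum / np.maximum(intensity, 1e-12),
                          0.0)

    numerator = 0.5 * ((r - g) + (r - b))
    denominator = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0))
    # For grey pixels the denominator is zero; use ratio 1 so that hue is 0.
    ratio = np.where(denominator > 0,
                     numerator / np.maximum(denominator, 1e-12),
                     1.0)
    angle = np.degrees(np.arccos(np.clip(ratio, -1.0, 1.0)))
    hue = np.where(b > g, 360.0 - angle, angle)
    return hue, saturation, intensity

# Check on the three primaries: red, green, blue.
primaries = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
hue, saturation, intensity = rgb_to_hsi(primaries)
print(np.round(hue, 1))
```

The resulting hue array is a single grey scale plane, so any of the histogram-based threshold methods can be applied to it directly. Because hue is an angle, stains whose hues lie near 0 or 360 degrees may need the scale rotated before thresholding.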

FIGURE 3.57 Stained potato: (a) original (see color insert following page 150); (b) red channel (showing cell walls); (c) green channel (showing starch granules); (d) hue channel (better contrast for starch granules).

FIGURE 3.58 Stained intestine: (a) original (see color insert following page 150); (b) hue channel; (c) automatic threshold setting.

In several of the examples above, simple thresholding will not produce the final measurable result. For example, the fat droplets in mayonnaise and custard must be separated (as discussed in the next chapter). In some cases, such as the measurement of the cell wall area in the potato, the need for subsequent processing of the