Statistics and Data Analysis in Geology - Chapter 4
in units having magnitude. A depth of 3000 ft in a well is ten times a depth of
300 ft, and the decade between the years 1940 and 1950 has the same duration as
the interval between 1950 and 1960. These may seem obvious or even trivial
points to emphasize, but as we shall see, not all geologic sequences have such
well-behaved characteristics.
At the opposite extreme, we can consider a stratigraphic sequence consisting
of the lithologic states encountered in a sedimentary succession. Such a sequence
might be a cyclothem of
limestone-shale-limestone-shale-sandstone-coal-shale-limestone, from bottom to
top.
We are interested in the significance of the succession, but we cannot put a
meaningful scale on the sequence itself. Obviously, the succession of
lithologies represents changes that occurred through time, but we have no way
of estimating the time scale involved. We could use thickness, but this may
change dramatically from location to location even though the sequence is not
altered. If thickness is considered, it may obscure our examination of the
succession, which is the subject of our interest. Thus, the fact that limestone
is the third state in the section and coal is the sixth has no significance that
can be expressed numerically (that is, position 6 is not “twice” position 3).
Likewise, the lithologic states of the units cannot be expressed on a numerical
scale. We might code the sequences just given as 1-2-1-2-3-4-2-1, where
limestone is equated to 1, shale is 2, sandstone is 3, and coal is 4, but such a
convention is purely arbitrary and expresses no meaningful relations between the
states. It is obvious that this sequence poses different problems to the analyst
than do the first examples.
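The arbitrariness of such a coding can be illustrated with a minimal Python sketch. The cyclothem and the coding 1-2-1-2-3-4-2-1 come from the text; the alternative labelling `alt_codes` and the helper names are hypothetical, introduced only for illustration. Counting which state directly overlies which is used here as one example of a summary that depends only on the order of states, not on the numbers assigned to them.

```python
from collections import Counter

# The cyclothem from the text, listed from bottom to top.
cyclothem = ["limestone", "shale", "limestone", "shale",
             "sandstone", "coal", "shale", "limestone"]

# Coding used in the text: the integers are labels, not magnitudes.
text_codes = {"limestone": 1, "shale": 2, "sandstone": 3, "coal": 4}
# Hypothetical alternative labelling (an assumption for illustration only).
alt_codes = {"limestone": 4, "shale": 1, "sandstone": 2, "coal": 3}


def encode(sequence, codes):
    """Replace each lithologic state by its integer label."""
    return [codes[state] for state in sequence]


def transition_counts(sequence):
    """Count how often each state is directly overlain by each other state."""
    return Counter(zip(sequence, sequence[1:]))


def decode_counts(counts, codes):
    """Translate transition counts on integer labels back to state names."""
    names = {label: state for state, label in codes.items()}
    return Counter({(names[a], names[b]): n for (a, b), n in counts.items()})


coded = encode(cyclothem, text_codes)
recoded = encode(cyclothem, alt_codes)

# Arithmetic on the labels changes with the coding, so it carries no meaning.
mean_coded = sum(coded) / len(coded)
mean_recoded = sum(recoded) / len(recoded)

# The pattern of upward transitions is the same under either coding.
same_pattern = (decode_counts(transition_counts(coded), text_codes)
                == decode_counts(transition_counts(recoded), alt_codes))
```

Under these assumptions the two codings give different "average" codes but identical transition patterns, which is the sense in which the numbers express no relation between the states.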
There also are intermediate possibilities. For example, we may be interested in
some measurable attribute contained in successive stages of a sequence. Perhaps
we have measured the boron content of each lithologic unit in the cyclothem just
discussed. We can utilize a distance scale of feet between samples and consider
this a problem related to depth or distance. Alternatively, we can consider the
relationship between the boron measurements and the sequence of states.
A closely related problem is the analysis of a sequence characterized by the
presence or absence of some variable or variables at points along a line. We
might be interested, for example, in the repeated recurrence of certain
environment-dependent microfossils in the chips recovered during the drilling of
a well. Another class of problems may be typified by the succession of mineral
grains encountered on traverses across a thin section. In this case, we can use
millimeters as a convenient spatial scale, but we have no way of evaluating
whether olivine rates a higher number than plagioclase.
Data having the characteristic of being arranged along a continuum, either of
time or space, often are referred to as forming a series, sequence, string, or
chain.
The nature of the data and the chain determine the questions that we can consider.
Obviously, we cannot extract information about time intervals from stratigraphic
succession data, because the time scale accompanying the succession is not known.
We often substitute spatial scales for a time scale in stratigraphic problems, but our
conclusions are no better than our fundamental assumptions about the length of
time required to deposit the interval we have measured.
Table 4-1 is a classification of the various data-analysis techniques discussed
in this chapter. We can consider two types of sequences. In the first, the
distance
between observations varies and must be specified for every point. In the second,
the points are assumed to be equally and regularly spaced; the numerical value
of the spacing does not enter into the analyses except as a constant.
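The distinction between the two types of sequences can be sketched in Python as a simple check on sample positions. The function name `spacing_type`, the tolerance, and the depth values are assumptions made for illustration; they are not taken from the text or from Table 4-1.

```python
import math


def spacing_type(positions, rel_tol=1e-9):
    """Classify sample positions as regularly or irregularly spaced.

    Returns ("regular", spacing) when every interval between successive
    positions is the same within the tolerance, else ("irregular", None).
    """
    if len(positions) < 2:
        raise ValueError("at least two positions are needed")
    steps = [b - a for a, b in zip(positions, positions[1:])]
    first = steps[0]
    if all(math.isclose(step, first, rel_tol=rel_tol) for step in steps):
        return "regular", first
    return "irregular", None


# Hypothetical sample depths in feet (illustrative values only).
depths_regular = [100.0, 110.0, 120.0, 130.0]
depths_irregular = [100.0, 104.5, 120.0, 121.0]

kind_a, spacing_a = spacing_type(depths_regular)
kind_b, spacing_b = spacing_type(depths_irregular)
```

In the regular case only the single constant spacing needs to be kept, and the observations can be indexed by position number. In the irregular case the position of every observation must be stored alongside its value.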
A subset of