other, the most recent parts of the bottom core will occur adjacent to
older and older parts of the top one, so if a pattern occurs within a
sequence then the similar or dissimilar sections will, at some stage, lie
side by side (Figure 21.2(g)).
This method is straightforward, but an essential assumption is that
samples have been taken at regular intervals throughout the sequence
(in geological settings, usually an equal increment of length). If the intervals
are unequal, then it may be possible to obtain a regular sequence by excluding
some data.
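
As a rough illustration of excluding data to obtain a regular sequence, the sketch below keeps only the samples that fall on an evenly spaced grid. The function name regularise, the sample positions and the measured values are all hypothetical, and the choice of grid (start and step) is an assumption the analyst would have to make for their own data.

```python
def regularise(samples, start, step):
    """Keep the samples lying on the grid start, start + step, start + 2*step, ...

    `samples` is a list of (position, value) pairs. Collection stops at the
    first grid position that has no sample, so the result has no gaps.
    """
    by_position = dict(samples)
    kept = []
    position = start
    while position in by_position:
        kept.append((position, by_position[position]))
        position += step
    return kept

# Hypothetical core: positions in cm, taken at unequal intervals
samples = [(0, 2.1), (10, 2.4), (15, 2.2), (20, 2.9),
           (30, 3.1), (35, 3.0), (40, 2.7)]

# Excluding the samples at 15 cm and 35 cm leaves a regular 10 cm sequence
print(regularise(samples, start=0, step=10))
```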
It would be very time-consuming to inspect the two sequences visually
every time they were shifted. Furthermore, you need some way of deciding
whether any similarity or dissimilarity is significant or whether it might
only be occurring by chance within a sequence of random numbers. This
can be done by using autocorrelation (which is sometimes called serial
correlation) to test for a relationship, without assuming dependence or
causality. As described above, a sequence is copied to give two identical
sequences, which are then placed side by side (Figure 21.2(a)). The values
adjacent to each other will be the same, so at this stage the correlation
(Chapter 15) between the variables “sequence 1” and “sequence 2” will
always be 1.0.
Next, sequence 1 is shifted only one interval to the right (Figure 21.2(b)).
This shift is called a lag interval of one (or just a lag of one), and it places
every value within sequence 2 adjacent to the value recorded at the previous
interval in sequence 1. The correlation is recalculated. The process is
repeated several times: sequence 1 is shifted another interval in the same
direction (giving lag intervals of two, three, four, etc.) and the
correlation recalculated each time (Figure 21.2(c)–(g)). The number of lags
that can be used will be limited by the length of a finite sequence, because
every successive shift will reduce the length of the overlapping section
by one.
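
The shift-and-recalculate procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function names pearson and lagged_correlations and the example data are hypothetical, and the ordinary Pearson correlation is assumed as the measure, calculated only on the overlapping section at each lag.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def lagged_correlations(sequence, max_lag):
    """Correlate a sequence with a copy of itself shifted by 0..max_lag intervals."""
    results = []
    for lag in range(max_lag + 1):
        overlap = len(sequence) - lag   # each shift shortens the overlap by one
        earlier = sequence[:overlap]    # values at the earlier interval
        later = sequence[lag:]          # values `lag` intervals further on
        results.append((lag, overlap, pearson(earlier, later)))
    return results

# Hypothetical data: a pattern that repeats every four intervals
values = [1, 3, 5, 3] * 5
for lag, overlap, r in lagged_correlations(values, 6):
    print(f"lag {lag}: overlap {overlap}, r = {r:.2f}")
```

With this made-up sequence the overlap falls from 20 values at lag 0 to 14 at lag 6. The correlation is 1.00 at lags 0 and 4, where the pattern lines up with itself, and -1.00 at lag 2, where peaks lie against troughs.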
If there is marked similarity within the sequence then the correlation at
some lag intervals will be strongly positive (e.g. Figure 21.2(g)).
If there is only random variation, with no marked similarity or dissimilarity,
the correlation will vary from lag to lag around a mean of approximately zero.
If, at a particular lag, the pattern in one sequence is the opposite of that
in the other and therefore markedly dissimilar, the correlation will be
strongly negative (e.g. Figure 21.2(d)).
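
The three outcomes just described can be illustrated with made-up data. In the sketch below, the helper lag_correlation, the random sequence and the repeating pattern are all hypothetical, and the random seed is fixed only so that the illustration is repeatable.

```python
import random
from math import sqrt

def lag_correlation(sequence, lag):
    """Pearson correlation between a sequence and itself shifted by `lag`."""
    x = sequence[:len(sequence) - lag]
    y = sequence[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

random.seed(1)  # fixed seed so the illustration is repeatable
noise = [random.random() for _ in range(500)]   # hypothetical random sequence
cycle = [0, 1, 2, 3, 4, 3, 2, 1] * 10           # hypothetical pattern, period 8

noise_r = [lag_correlation(noise, lag) for lag in range(1, 11)]
print("random, lags 1-10:", [round(r, 2) for r in noise_r])
print("cycle, lag 8 (in step):", round(lag_correlation(cycle, 8), 2))
print("cycle, lag 4 (opposite):", round(lag_correlation(cycle, 4), 2))
```

The random sequence should give small correlations of either sign at every lag, whereas the repeating pattern gives a strongly positive value when the shift equals its period and a strongly negative one when the shift is half the period.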