6.5 The Rate of Neutral Substitution Can Be Measured from Divergence of Repeated Sequences

The rate of substitution per year at neutral sites is greater in the mouse than in the human genome.

We can make the best estimate of the rate of substitution at neutral sites by examining sequences that do not code for protein. (We use the term neutral here rather than silent, because there is no coding potential.)
An informative comparison can be made between the members of a common repetitive family in the human and mouse genomes. The principle of the analysis is summarized in the accompanying figure. We start with a family of related sequences that have evolved by duplication and substitution from an original family member. We assume that the common ancestral sequence can be deduced by taking the base that is most common at each position. Then we can calculate the divergence of each individual family member as the proportion of bases that differ from the deduced ancestral sequence. In this example, individual members vary from 0.13 to 0.18 divergence and the average is 0.16.
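
The consensus-and-divergence procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five 10-base sequences are a hypothetical toy family (not the sequences from the figure), the sequences are assumed to be already aligned and of equal length, and the function names `consensus` and `divergence` are invented for this example.

```python
from collections import Counter


def consensus(seqs):
    """Deduce an ancestral sequence as the most common base at each position.

    Assumes the sequences are already aligned and of equal length.
    If two bases tie at a position, Counter's ordering decides between them.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def divergence(seq, ancestor):
    """Proportion of positions at which seq differs from the deduced ancestor."""
    return sum(a != b for a, b in zip(seq, ancestor)) / len(ancestor)


# Hypothetical toy family, invented purely for illustration.
family = [
    "GCTAGCCTAG",
    "GCTTGCCTAG",
    "GCTAGCATAG",
    "ACTAGCCTAG",
    "GCTAGCCTCG",
]

ancestor = consensus(family)
divergences = [divergence(s, ancestor) for s in family]
mean_divergence = sum(divergences) / len(divergences)

print(ancestor)
print(divergences)
print(mean_divergence)
```

With real data the same two steps apply, but the result depends on the quality of the alignment and on the assumption that the most common base really is ancestral.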

One family used for this analysis in the human and mouse genomes derives from a sequence that is thought to have ceased to be active at about the time of the divergence between man and rodents (the LINES family; see Section 22.9, Retroposons Fall into Three Classes). This means that it has been diverging without any selective pressure for the same length of time in both species. Its average divergence in man is ~0.17 substitutions per site, corresponding to a rate of 2.2 × 10^-9 substitutions per base per year over the 75 million years since the separation. In the mouse genome, however, neutral substitutions have occurred at twice this rate, corresponding to 0.34 substitutions per site in the family, or a rate of 4.5 × 10^-9.
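
The per-year rates quoted above follow from dividing the divergence per site by the time since separation. The short Python sketch below reproduces that arithmetic; the function name `rate_per_year` is hypothetical, and because the human divergence is only approximate (~0.17), the computed value lands near, not exactly on, the quoted 2.2 × 10^-9.

```python
YEARS_SINCE_SEPARATION = 75e6  # 75 million years, as stated in the text


def rate_per_year(divergence_per_site, years=YEARS_SINCE_SEPARATION):
    """Average substitutions per base per year over the given period."""
    return divergence_per_site / years


human_rate = rate_per_year(0.17)  # divergence of the family in man
mouse_rate = rate_per_year(0.34)  # divergence of the family in mouse

print(f"human: {human_rate:.1e} substitutions per base per year")
print(f"mouse: {mouse_rate:.1e} substitutions per base per year")
print(f"mouse/human ratio: {mouse_rate / human_rate:.1f}")
```

Because both species have had the same length of time to diverge, the twofold difference in divergence translates directly into a twofold difference in rate.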
Note, however, that if we calculated the rate per generation instead of per year, it would be greater in man than in mouse (~2.2 × 10^-8 as opposed to ~10^-9).
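
To see how the ranking reverses, the per-year rates can be multiplied by a generation time. The text does not state the generation times used; the values below (10 years for man, 0.22 years for mouse) are assumptions back-calculated from the quoted per-generation figures, and `rate_per_generation` is a hypothetical helper name.

```python
def rate_per_generation(rate_per_year, generation_time_years):
    """Convert a per-year substitution rate into a per-generation rate."""
    return rate_per_year * generation_time_years


# Per-year rates from the text; generation times are ASSUMED, not from the source.
HUMAN_GENERATION_YEARS = 10.0
MOUSE_GENERATION_YEARS = 0.22

human_per_generation = rate_per_generation(2.2e-9, HUMAN_GENERATION_YEARS)
mouse_per_generation = rate_per_generation(4.5e-9, MOUSE_GENERATION_YEARS)

print(f"human: {human_per_generation:.1e} per base per generation")
print(f"mouse: {mouse_per_generation:.1e} per base per generation")
```

Under these assumed generation times the mouse rate is higher per year but lower per generation, which is the point of the comparison.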

Calculate divergence:
cCCAGCGTAGCTTICATTACCCGTACGTTCATATTCGG   7/38 = 0.18
cCTGGCGTAGC.TACGTTAGCGGTACGTGCATATTGGG   6/38 = 0.16
GGTAGCCTAaCTTAaGaTACCGGT!CGTGCTTGTTCGG   6/38 = 0.16
GGTAGCCTAGCTTAGGTTATTGGTAGGTGCATGTCCGG   6/38 = 0.16
cCTACCCTAGGTTACGTTATCGGTACGTGTCCGTTCGG   6/38 = 0.16
GCCACCC'AGCTCACGTTACCGGCACGTGCATGATCGC   7/38 = 0.18
CCTAGCCTCGCTTTCGTTAGCGGTACCTGCATCTTCCG   7/38 = 0.18
GCTTGCCTAGTTTACGTTACTGGTACGCGCATGTTGGG   5/38 = 0.13
GCCAGGCTAGCTTACGCCACCGGTACGTGGATGTCCGG   6/38 = 0.16

Calculate consensus sequence:
GCTAGCCTAGCTTACGTTACcGGTACGTGCATGTTCGG   Consensus sequence
Figure: An ancestral consensus sequence for a family is calculated by taking the most common base at each position. The divergence of each current member of the family is calculated as the proportion of bases at which it differs from the ancestral sequence.

These figures probably underestimate the rate of substitution in the mouse; at the time of divergence the rates in both species would have been the same, and the difference must have evolved since then. The current rate of neutral substitution per year in the mouse is probably 2-3× greater than the historical average.

These rates reflect the balance between the occurrence of mutations and the ability of the genetic system of the organism to correct them. The difference between the species demonstrates that each species has systems that operate with a characteristic efficiency.

Comparing the mouse and human genomes allows us to assess whether syntenic (corresponding) sequences show signs of conservation or have diverged at the rate expected from accumulation of neutral substitutions. The proportion of sites that show signs of selection is ~5%. This is much higher than the proportion that codes for protein or RNA (~1%). It implies that the genome includes many more stretches whose sequence is important for noncoding functions than for coding functions. Known regulatory elements are likely to comprise only a small part of this proportion. This number also suggests that most (i.e., the rest) of the genome sequences do not have any function that depends on the exact sequence.
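
The comparison of proportions above is simple arithmetic, sketched here in Python with whole-number percentages. The two inputs are the approximate figures from the text (~5% and ~1%); the variable names are hypothetical, and the sketch assumes that all coding sequence is among the sites under selection.

```python
SELECTED_PCT = 5  # ~5% of sites show signs of selection (from the text)
CODING_PCT = 1    # ~1% of sites code for protein or RNA (from the text)

# Assumes the coding sites are a subset of the sites under selection.
noncoding_selected_pct = SELECTED_PCT - CODING_PCT
unconstrained_pct = 100 - SELECTED_PCT
noncoding_to_coding_ratio = noncoding_selected_pct / CODING_PCT

print(f"selected but noncoding: ~{noncoding_selected_pct}% of the genome")
print(f"no sign of selection:   ~{unconstrained_pct}% of the genome")
print(f"noncoding vs coding:    ~{noncoding_to_coding_ratio:.0f}x")
```

Given how approximate the inputs are, these outputs are rough proportions, not measurements.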