2.4 Type-II Mutated Sequences
2.4.1 Description of Type-II Mutated Sequences
A type-II mutation (defined in Sect. 1.2) refers to the permutation of some
segments of a biological sequence $A = (a_1, a_2, \cdots, a_N)$. For example,
A = (00201[332]0110203[01022]23101011[20]3321) ,
B = (00201[20]0110203[332]23101011[01022]3321) .
(2.63)
Here the bracketed segments [332], [01022], [20] of sequence A are permuted
and become the segments [20], [332], [01022] of sequence B. Permutation of
several disconnected segments is very important in gene and protein analysis,
and in recent years bioinformatics has begun to address such problems. Because
of its complexity, however, we do not treat that general problem here.
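The bracket notation of (2.63) can be checked mechanically. The following is a minimal sketch, not a method taken from the text: the function names `split_segments` and `permute_segments` and the `order` argument are hypothetical, introduced only for illustration. It rearranges the bracketed segments of A while leaving the unbracketed context in place.

```python
import re

def split_segments(marked):
    """Split a bracket-marked string into (context pieces, bracketed segments)."""
    segments = re.findall(r"\[([^\]]*)\]", marked)
    context = re.split(r"\[[^\]]*\]", marked)
    return context, segments

def permute_segments(marked, order):
    """Rebuild the string with its bracketed segments rearranged.

    order[i] is the index of the original segment that is placed in slot i
    (a hypothetical convention chosen for this sketch).
    """
    context, segments = split_segments(marked)
    if sorted(order) != list(range(len(segments))):
        raise ValueError("order must be a permutation of the segment indices")
    out = [context[0]]
    for slot, src in enumerate(order):
        out.append("[" + segments[src] + "]")
        out.append(context[slot + 1])
    return "".join(out)

# Sequence A of (2.63); slot 0 receives segment 2, slot 1 segment 0, slot 2 segment 1.
A = "00201[332]0110203[01022]23101011[20]3321"
B = permute_segments(A, [2, 0, 1])
print(B)
```

With the order `[2, 0, 1]` the result should coincide with sequence B of (2.63); the symbols outside the brackets are untouched, so A and B contain the same multiset of symbols.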
In this book, we confine our discussion to a simpler case: the permutation
of two coterminous (adjacent) segments. For example,
A = (00201{[332][0110]}20301022231{[01011][20]}3321) ,
B = (00201{[0110][332]}20301022231{[20][01011]}3321) .
(2.64)
Sequence B results from permuting the data segments within the braces
{[332][0110]} and {[01011][20]} of sequence A. The corresponding segments
of sequence B are {[0110][332]} and {[20][01011]}; each pair of braces
encloses a permutation of two coterminous segments.
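The swap of two coterminous segments in (2.64) amounts to a simple slicing operation. The sketch below is illustrative only; the helper `swap_adjacent` and its parameters (`start`, `len1`, `len2`) are assumptions of this example, not notation from the text.

```python
def swap_adjacent(seq, start, len1, len2):
    """Swap the segment seq[start:start+len1] with the len2 symbols that follow it."""
    mid = start + len1
    end = mid + len2
    if start < 0 or len1 < 1 or len2 < 1 or end > len(seq):
        raise ValueError("segments must be non-empty and lie inside the sequence")
    return seq[:start] + seq[mid:end] + seq[start:mid] + seq[end:]

# Sequence A of (2.64), written without brackets and braces.
A = "00201" + "332" + "0110" + "20301022231" + "01011" + "20" + "3321"

# First swap: [332][0110] starts at index 5; second swap: [01011][20] starts at index 23.
B = swap_adjacent(swap_adjacent(A, 5, 3, 4), 23, 5, 2)
print(B)
```

Because a swap leaves the total length unchanged, the start index of the second pair is the same in A and in the intermediate sequence, so the two swaps can be applied in either order.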
2.4.2 Stochastic Models of Type-II Mutated Sequences
The following assumptions are required in order to build models of type-II
mutated sequences:
II-1 The mutation sequence $\tilde{\eta}$ is determined by the stochastic sequences $\tilde{\xi}$, $\tilde{\zeta}_2$, and $(\tilde{\ell}^*_1, \tilde{\ell}^*_2)$, which are explained as follows:
1. $\tilde{\xi}$ is a perfectly stochastic sequence on $V_4$. It is an initial sequence to be mutated.
2. $\tilde{\zeta}_2$ is a Bernoulli process with strength $\epsilon_2$. It is similar to the sequence defined in (2.38) and describes whether or not a type-II mutation happens.
3. $(\tilde{\ell}^*_1, \tilde{\ell}^*_2)$ is a stochastic sequence that describes the permutation lengths of the type-II mutation, in which
$$\tilde{\ell}^*_\tau = (\ell^*_{\tau,1}, \ell^*_{\tau,2}, \ell^*_{\tau,3}, \cdots), \quad \tau = 1, 2, \tag{2.65}$$
are two independent and identically distributed stochastic sequences, and each $\ell^*_{\tau,j}$ obeys a geometric distribution:
$$P_r\{\ell^*_{\tau,j} = k\} = e_{p_\tau}(k) = p_\tau (1 - p_\tau)^{k-1}, \quad \tau = 1, 2. \tag{2.66}$$
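The three ingredients of assumption II-1 can be simulated directly. The following is a rough sketch under stated assumptions, not the construction used in this book: it assumes that "perfectly stochastic" means uniform and independent on V_4 = {0, 1, 2, 3}, that the Bernoulli strength is the probability that a mutation starts at a given position, and that a mutation at position i swaps the next two adjacent segments, whose lengths are drawn from (2.66). The function names, the parameter names (`eps2`, `p1`, `p2`), and the rule for swaps that would overrun the end of the sequence are all hypothetical.

```python
import random

def geometric(p, rng):
    """Draw k >= 1 with P(k) = p * (1 - p)**(k - 1), the form given in (2.66)."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k

def simulate_type2(n, eps2, p1, p2, seed=0):
    """Return (xi, eta): an initial sequence and an illustrative mutated copy."""
    rng = random.Random(seed)
    # Assumption: the initial sequence is uniform and independent on V_4.
    xi = [rng.randrange(4) for _ in range(n)]
    eta = list(xi)
    i = 0
    while i < n:
        # Bernoulli trial: does a type-II mutation start at position i?
        if rng.random() < eps2:
            l1 = geometric(p1, rng)
            l2 = geometric(p2, rng)
            # Assumption: a swap that would overrun the sequence is skipped.
            if i + l1 + l2 <= n:
                # Write the second segment first, then the first segment.
                eta[i:i + l1 + l2] = xi[i + l1:i + l1 + l2] + xi[i:i + l1]
                i += l1 + l2
                continue
        i += 1
    return xi, eta

xi, eta = simulate_type2(n=60, eps2=0.05, p1=0.5, p2=0.5, seed=1)
print("".join(map(str, xi)))
print("".join(map(str, eta)))
```

Under these assumptions the mutated sequence always has the same length and the same symbol counts as the initial one, which is the defining feature of a permutation-type mutation; only the order of symbols inside the swapped pairs changes.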