Generally, a multiple sequence (or group of sequences) would be denoted
as:
\[ A = \{ A_1, A_2, \cdots, A_m \}, \tag{1.2} \]
in which each $A_s$ is a separate sequence defined on $V_q$, and its complete
expression is
\[ A_s = (a_{s,1}, a_{s,2}, \cdots, a_{s,n_s}), \quad s = 1, 2, \cdots, m, \tag{1.3} \]
where $n_s$ is the length of the sequence $A_s$, and $m$ is the number of
sequences in each group.
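As a rough illustration of the notation in (1.2) and (1.3), the following minimal Python sketch stores a group of sequences as a list of tuples, where each tuple plays the role of one $A_s$ and may have its own length $n_s$. The alphabet `V_Q`, the helper names `make_group` and `lengths`, and the example sequences are assumptions made for illustration only; the symbols of $V_q$ are not fixed by the text here, so a four-letter DNA alphabet is assumed.

```python
from typing import List, Sequence, Tuple

# Assumed alphabet V_q for DNA (q = 4); hypothetical, chosen for illustration.
V_Q = ("a", "c", "g", "t")


def make_group(sequences: Sequence[str],
               alphabet: Sequence[str] = V_Q) -> List[Tuple[str, ...]]:
    """Build the group A = {A_1, ..., A_m}, checking every symbol lies in V_q."""
    allowed = set(alphabet)
    group = []
    for s, seq in enumerate(sequences, start=1):
        bad = [ch for ch in seq if ch not in allowed]
        if bad:
            raise ValueError(f"sequence A_{s} has symbols outside V_q: {bad}")
        group.append(tuple(seq))  # A_s = (a_{s,1}, ..., a_{s,n_s})
    return group


def lengths(group: List[Tuple[str, ...]]) -> List[int]:
    """Return the lengths n_1, ..., n_m of the sequences in the group."""
    return [len(a_s) for a_s in group]


# Hypothetical example data: m = 3 sequences of different lengths.
group = make_group(["acgt", "ac", "ggtcaa"])
m = len(group)       # number of sequences in the group
n = lengths(group)   # n_s for s = 1, ..., m
```

Note that Python indexing is zero-based, so the element $a_{s,j}$ corresponds to `group[s - 1][j - 1]` in this sketch.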
Classification of Biological Sequences
The primary structure of a biological sequence specifies its component
nucleotides or amino acids. The tertiary, or three-dimensional, structure
describes the spatial arrangement (position coordinates) of the constituent
atoms in the molecule. The secondary structure describes the local properties
of the sequence. For example, the secondary structure of a protein specifies
the local structure (motif) adopted by each protein segment, such as a helix,
a strand, or another structure. The term super-secondary structure is also
frequently used for an intermediate level between the secondary and tertiary
structures, consisting of larger compact molecular groups (domains).
Modern molecular biology tells us that DNA (or RNA) sequences and
protein sequences are the basic units involved in specific biological functions.
Their functional characteristics depend not only on their primary structure,
but also on their three-dimensional shapes. For example, the binding pockets
of a protein play an important role in controlling its functions. Thus, the
shape formed by an amino acid sequence in three-dimensional space can be
highly relevant to the clinical treatment of a disease that involves a serious
genetic mutation. We will use the term configuration to refer to the shape of
a protein in three-dimensional space. Since a mutation of a biological
sequence changes its configuration and may therefore affect its function, and
since alignment is the most popular method for locating mutation positions,
we begin by discussing the basic characteristics of mutations, as well as
alignment methods for biological sequences.
1.1.2 Definition of Mutations and Alignments
The success of cloning demonstrates that a DNA sequence contains the
complete information needed to construct a life form. However, many complex
processes must occur in building up structures from DNA to RNA, RNA to
protein, protein to organelle, organelle to cell and, finally, from cell to
organism. These processes include transcription, translation, and duplication.
Within these processes, the mechanisms of recognition, regulation, and
control are still not entirely clear. There remain a great