27
RNA Structure Prediction
possible secondary structure into smaller components in a unique
way, and this unequivocal decomposition is the base of the
McCaskill algorithm. For details, see also refs.(9) and (11).
Rivas and Eddy published a dynamic programming algorithm for
predicting any planar pseudoknotted structure (16, 17). The idea
is that they calculate the best possible secondary structure for any
pair of substrings. A planar pseudoknotted secondary structure
can always be decomposed into two smaller parts such that the
smaller parts also contain planar pseudoknotted structures, see
Fig. 5. The two substrings can be described by the beginnings
and ends of the two substrings, i, j, k, and l; therefore, it needs an
O(L
4
) memory usage. For each pair of substrings, two cutting
points, r and s, are needed to split the structure into two parts.
Hence, the overall running time of the algorithm is O(L
6
).
There are special algorithms that run in only O(L
5
) running
time; however, these algorithms can predict only some special
pseudoknots and cannot consider all possible planar pseudoknot-
ted secondary structures (18–20). The partition function can also
be calculated for several pseudoknot model (21). Interested read-
ers are referred to a review paper by Reeder and Giegerich (22).
The general pseudoknot prediction problem is NP-hard. The first
proof for NP-hardness was given by Lyngsoe and Pedersen in
(19, 20). Lyngsoe (23) considered three very simple models. In
these models, the best structure is that maximizes:
1. The number of base pairs
2. The number of base pair stackings
3. The number of stacking base pairs
The difference between the two later models is that the score of
an m long helix is m − 1 when counting the number of base pair
stackings, while the score is m when counting the number of
stacking base pairs, m > 1.
The first approach that maximizes the number of base pairs
is equivalent to the Maximum Weighted Matching problem.
2.4. Predicting
Pseudoknots
2.5. The General
Pseudoknot Problem
Fig. 5. The schematic representation of the Rivas–Eddy dynamic programming
algorithm that can predict arbitrary planar pseudoknots. The algorithm obtains the best
secondary structure for each pair of substrings. Due to limited space, the case when
both r and s are in the interval [l,k] is not indicated