data:image/s3,"s3://crabby-images/3051a/3051a9a958601cbf07d100cf198c893259ef2853" alt=""
826 S Sequential Multiple String Matching
siders sparse q-grams and thus avoids to scan a lot of text
positions. It is due to Fredriksson and Grabowski [7].
Applications
The methods which are described here apply to the treat-
ment of the natural language, the treatment and analysis
of genetic sequences and of musical sequences, the prob-
lems of safety related to data flows like virus detection, and
the management of textual data bases, to quote only some
immediate applications.
Open Problems
There remain only a few open problems on this question.
It is still unknown if it is possible to design an average op-
timal time constant space string matching algorithm. The
exact size of the Boyer-Moore automaton is still unknown
(see [5]).
Experimental Results
The book of G. Navarro and M. Raffinot [12] is a good in-
troduction and presents an experimental map of ESM al-
gorithms for different alphabet sizes and pattern lengths.
Basically, the Shift-Or algorithm is efficient for small al-
phabets and short patterns, the BNDM algorithm is effi-
cient for medium size alphabets and medium length pat-
terns, the Horspool algorithm is efficient for large alpha-
bets, and the BOM algorithm is efficient for long patterns.
URL to Code
The site monge.univ-mlv.fr/~lecroq/string presents
a large number of ESM algorithms (see also [2]). Each
algorithm is implemented in C code and a Java applet is
given.
Cross References
Indexed approximate string matching refers to the case
where the text is preprocessed;
Regular expression matching is the more complex case
where P can be a regular expression.
Sequential approximate string matching is the version
where errors are permitted;
Sequential multiple string matching is the version
where a finite set of patterns is searched in a text;
Recommended Reading
1. Allauzen, C., Crochemore, M., Raffinot, M.: Factor oracle: a new
structure for pattern matching. In: SOFSEM’99. LNCS, vol. 1725,
pp. 291–306. Springer, Berlin (1999)
2. Charras, C., Lecroq, T.: Handbook of exact string matching al-
gorithms. King’s College London Publications, London (2004)
3. Cole, R., Hariharan, R., Paterson, M., Zwick, U.: Tighter lower
bounds on the exact complexity of string matching. SIAM
J. Comput. 24(1), 30–45 (1995)
4. Crochemore, M., Czumaj, A., G ˛asieniec, L., Jarominek, S.,
Lecroq,T.,Plandowski,W.,Rytter,W.:Speedinguptwostring
matching algorithms. Algorithmica 12(4/5), 247–267 (1994)
5. Crochemore, M., Hancart, C., Lecroq, T.: Algorithms on strings.
Cambridge University Press, New York (2007)
6. Crochemore, M., Perrin, D.: Two-way string matching. J. ACM
38(3), 651–675 (1991)
7. Fredriksson, K., Grabowski, S.: Practical and optimal string
matching. In: Proceedings of SPIRE’2005. LNCS, vol. 3772,
pp. 374–385. Springer, Berlin (2005)
8. Galil, Z., Seiferas, J.: Time-space optimal string matching.
J. Comput. Syst. Sci. 26(3), 280–294 (1983)
9. Gusfield, D.: Algorithms on strings, trees and sequences. Cam-
bridge University Press, Cambridge, UK (1997)
10. Hancart, C.: On Simon’s string searching algorithm. Inf. Process.
Lett. 47(2), 95–99 (1993)
11. Knuth, D.E., Morris, J.H. Jr., Pratt, V.R.: Fast pattern matching in
strings. SIAM J. Comput. 6(1), 323–350 (1977)
12. Navarro, G., Raffinot, M.: Flexible Pattern Matching in Strings –
Practical on-line search algorithms for texts and biological se-
quences. Cambridge University Press, Cambridge, Uk (2002)
13. Rytter, W.: On maximal suffixes and constant-space linear-time
versions of KMP algorithm. Theor. Comput. Sci. 299(1–3), 763–
774 (2003)
14. Smyth, W.F.: Computing Patterns in Strings. Addison Wesley
Longman, Harlow, UK (2002)
15. Yao, A.: The complexity of pattern matching for a random
string.SIAMJ.Comput.8, 368–387 (1979)
Sequential Multiple String Matching
1999; Crochemore, Czumaj, G¸asieniec, Lecroq,
Plandowski, Rytter
MAXIME CROCHEMORE
1,2
,THIERRY LECROQ
3
1
Department of Computer Science,
Kings College London, London, UK
2
Laboratory of Computer Science,
University of Paris-East, Paris, France
3
Computer Science Department and LITIS Faculty
of Science, University of Rouen, Rouen, France
Keywords and Synonyms
Dictionary matching
Problem Definition
Given a finite set of k pattern strings
P = fP
1
; P
2
;:::;P
k
g
and a text string T = t
1
t
2
:::t
n
, T and the P
i
sbeingse-
quences over an alphabet ˙ of size ,themultiple string
matching (MSM) problem is to find one or, more gener-
ally, all the text positions where a P
i
occurs in T.More