Efficient String Matching: An Aid to Bibliographic Search

This paper describes a simple, efficient algorithm
to locate all occurrences of any of a finite 
number of keywords in a string of text.  The algorithm
consists of constructing a finite state pattern 
matching machine from the keywords and then using the
pattern matching machine to process the text string 
in a single pass.  Construction of the pattern matching
machine takes time proportional to the sum of 
the lengths of the keywords.  The number of state transitions
made by the pattern matching machine in 
processing the text string is independent of the number
of keywords.  The algorithm has been used to 
improve the speed of a library bibliographic
search program by a factor of 5 to 10.

CACM June, 1975

Aho, A. V.
Corasick, M. J.

keywords and phrases, string pattern matching, bibliographic
search, information retrieval, text-editing, 
finite state machines, computational complexity.

3.74 3.71 5.22 5.25

CA750607 JB January 9, 1978  1:03 PM

2532	4	2746
2545	4	2746
2631	4	2746
2733	4	2746
2746	4	2746
2746	4	2746
2746	4	2746
2746	4	2746
2746	4	2746
3001	4	2746
1665	5	2746
1739	5	2746
2139	5	2746
2545	5	2746
2746	5	2746
2746	5	2746
2746	5	2746
2786	5	2746
2916	5	2746
2746	6	2746