Overview
When string matching algorithms come up, the first one that comes to mind is usually KMP, because it is the one most textbooks cover. For me, though, KMP does not stick: a while after reading about it, I forget it again. Sunday is another string matching algorithm, typically faster in practice than KMP and Boyer-Moore (BM), and the key point is that its principle is simple.
Problem description
First of all, let's state the problem the Sunday algorithm solves: given two strings A and B, determine whether B occurs in A as a substring.
For example, given an article, determine whether the sentence "I love my home" appears in it.
Principle
The simplest approach is a brute-force match: starting from the beginning of the article, check whether the first 5 characters are "I love my home" (the phrase is five characters long in this example); if not, move forward by one and check whether characters 2 to 6 are "I love my home", and so on. This method is guaranteed to find a match if one exists, but it is too inefficient. The idea of the Sunday algorithm is to skip as many characters as possible whenever a match fails.
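For comparison, the brute-force scan described above can be sketched roughly as follows. The function name `brute_force_find` is made up for illustration, and returning the match index (or -1) is just one reasonable convention.

```python
def brute_force_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    # Try every possible starting position, moving forward one step at a time.
    for start in range(n - m + 1):
        if text[start:start + m] == pattern:
            return start
    return -1
```

Every failed comparison advances the window by exactly one position, which is the inefficiency the Sunday algorithm tries to avoid.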
When a match fails, we look at the character in A immediately after the part currently aligned with B, and denote it C.
Case one: if C does not appear in B, no alignment that covers C can match, so the next comparison can start directly at the character after C.
Case two: if C appears in B, we slide B forward until the rightmost C in B lines up with that C in A. So before matching, we preprocess B to determine, for each character that appears in B, how many steps to move.
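For example, for "I love my home", the result of preprocessing is (character: steps) {"I": 3, "love": 4, "of": 2, "home": 1}. The number of steps is the distance from the rightmost occurrence of the character in B to the position just past the end of B.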
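A minimal sketch of this preprocessing step is shown below. The helper name `build_shift_table` is hypothetical, and the token list is an assumption: it treats the phrase as the five symbols I, love, I, of, home, which is inferred from the table above rather than stated in the text.

```python
def build_shift_table(pattern):
    """Map each symbol of pattern to its shift: len(pattern) minus the index of its last occurrence."""
    m = len(pattern)
    table = {}
    for i, symbol in enumerate(pattern):
        # A later (more rightward) occurrence overwrites an earlier one,
        # so the table ends up holding the shift for the rightmost occurrence.
        table[symbol] = m - i
    return table


# Assumed tokenisation of the example phrase (five symbols, "I" appears twice).
pattern = ["I", "love", "I", "of", "home"]
print(build_shift_table(pattern))  # {'I': 3, 'love': 4, 'of': 2, 'home': 1}
```

Any symbol missing from the table falls under case one, where the shift is the length of B plus one.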
Here is a concrete example
A: My home in the sun, the sun gave me strength, so I love my home, I love the sun
B: I love my home
The first comparison fails. The next character C after the window is "sun", which clearly does not appear in B, so this is case one: we move B so that it starts at the character right after "sun", which is ",".
A: My home in the sun, the sun gave me strength, so I love my home, I love the sun
B: ------------------I love my home
The match fails again. The next character C is "I" (the "me" in "gave me"), which appears in B. According to the preprocessing result, "I" means moving 3 positions, so we move 3 positions.
A: My home in the sun, the sun gave me strength, so I love my home, I love the sun
B: ------------------------------------I love my home
Still no match. The next character is "love", which appears in B, and the preprocessing result tells us to move 4 positions.
A: My home in the sun, the sun gave me strength, so I love my home, I love the sun
B: -------------------------------------------------I love my home
The match succeeds.
We found it in 4 comparisons: simple and blunt.
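Putting the two cases and the preprocessing together, the whole search might look like the sketch below. This is one possible implementation under stated assumptions, not a reference version: the name `sunday_search` and the returned comparison counter are made up for illustration, and the ASCII text and pattern are invented so that the run takes the same kind of steps as the walkthrough (one case-one jump, then shifts of 3 and 4).

```python
def sunday_search(text, pattern):
    """Return (index, comparisons): first match position (or -1) and the number of alignments tried."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0, 0
    # Preprocessing: shift for each symbol is the distance from its rightmost
    # occurrence in pattern to the position just past the end of pattern.
    shift = {symbol: m - i for i, symbol in enumerate(pattern)}
    pos = 0
    comparisons = 0
    while pos + m <= n:
        comparisons += 1
        if text[pos:pos + m] == pattern:
            return pos, comparisons
        if pos + m >= n:
            break  # there is no character after the window, so nothing is left to try
        next_symbol = text[pos + m]
        # Case one: symbol not in pattern, jump past it (m + 1).
        # Case two: align the rightmost occurrence in pattern with it.
        pos += shift.get(next_symbol, m + 1)
    return -1, comparisons


# Hypothetical data: pattern "abacd" has the same shape as the example phrase.
print(sunday_search("zzzzzxzzzzzaxabacd", "abacd"))  # (13, 4)
```

On this input the brute-force scan would need 14 comparisons to reach the same position.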