I have to say: even if ACM brings no results, the training it gives your algorithmic ability is beyond doubt ...
Since the teacher marked it as a key topic, let me explain the principle of the Horspool string matching algorithm.
First, a few terms: the string being searched is called the matching string, the string to find is called the pattern string, and the substring of the matching string currently aligned with the pattern string is called the matching substring (stating the obvious).
In the naive algorithm, we want to know whether the matching string contains the pattern string. Say the matching string is "Huangzhi is a genius" and the pattern string is "Chengjinsen". The first thing we try is to take:
Huangzhi is (characters)
and match it against:
Chengjinsen (characters)
If every letter is equal, the pattern is found; otherwise we slide the pattern string one position to the right and compare again.
But how efficient is this? If the matching string has length n and the pattern string has length m, the worst-case running time is O(n*m), which in many cases is very slow.
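To make the O(n*m) bound concrete, here is a minimal sketch of the naive search. The function name `naive_find` and the choice to treat an empty pattern as "not found" are my own assumptions, not part of the original post.

```cpp
#include <cstddef>
#include <string>

// Naive search: try every alignment and compare character by character.
// Returns the index of the first match in `text`, or -1 if there is none.
// (`naive_find` is a name chosen for this sketch.)
int naive_find(const std::string& text, const std::string& pattern) {
    const std::size_t n = text.size(), m = pattern.size();
    if (m == 0 || m > n) return -1;  // assumption: empty pattern counts as "not found"
    for (std::size_t i = 0; i + m <= n; ++i) {        // n - m + 1 alignments
        std::size_t j = 0;
        while (j < m && text[i + j] == pattern[j])    // up to m comparisons per alignment
            ++j;
        if (j == m) return static_cast<int>(i);
    }
    return -1;
}
```

The outer loop runs over the alignments and the inner loop over the pattern, which is where the product n*m comes from.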
The basic idea of the Horspool algorithm (as I see it) is: if I know in advance that an alignment cannot possibly match, I can skip it without verifying it.
Take Saber as the pattern string and Archer as the matching string:
Archer
Saber
Obviously, at the initial alignment they are not equal. Notice that the matching substring here is Arche, and its last letter is e.
Clearly, until the pattern string has moved far enough for one of its own e's to line up with that e, no match is possible.
In Saber the e sits right next to the last letter, so let's change Saber slightly to Sebar. Now we can see that every shift before the pattern's e reaches the e of Arche cannot match, so we can move the pattern's e directly to the position of the e in Arche and only then compare again, which saves a lot of time. In other words, we skip the distance between the e and the last letter of the pattern string (3 positions for Sebar).
This can be summarized as case 1: the last letter of the matching substring occurs in the pattern string, and not as the pattern string's last letter (the special role of the last letter is discussed below).
Now, suppose that last character never appears in the pattern string. Then no shift can produce a match until the pattern string has moved completely past that character, so we skip the whole length of the pattern string directly.
This is case 2: the last letter of the matching substring does not appear anywhere in the pattern string (so in particular it is not the pattern string's last letter).
Why keep mentioning the last letter of the pattern string? Because there is a catch: what if the last letter of the matching substring appears in the pattern string as its last letter? That splits into two situations:
1. That letter occurs nowhere else in the pattern string. This is like case 2: we skip past the last letter of the matching substring, that is, skip the whole pattern length. Call this case 3.
2. That letter also occurs elsewhere in the pattern string. Then, as in case 1, find the occurrence nearest to the end of the pattern string (not counting the last letter itself), line it up with that letter, and match again after the jump. Call this case 4.
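The four cases collapse into one rule once the pattern string's last letter is ignored. The sketch below computes the jump for a single alignment by scanning the pattern directly, with no preprocessing yet. The name `shift_for` is my own, and it assumes a non-empty pattern.

```cpp
#include <string>

// Jump length for one alignment, given `c`, the letter of the matching
// string that sits under the pattern string's last position.
// (`shift_for` is a hypothetical helper name; assumes a non-empty pattern.)
int shift_for(const std::string& pattern, char c) {
    const int m = static_cast<int>(pattern.size());
    for (int i = m - 2; i >= 0; --i)   // right to left, skipping the last letter
        if (pattern[i] == c)
            return m - 1 - i;          // cases 1 and 4: distance to the end
    return m;                          // cases 2 and 3: jump the whole length
}
```

With the pattern Sebar, the letter e gives a jump of 3 (case 1), x gives 5 (case 2), and r gives 5 (case 3). With the pattern abcab, the letter b gives 3 (case 4).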
Next is the computer implementation. From the rules above we see that we do not need to scan the pattern string for the letter on every jump; the jumps can be preprocessed. From cases 1 and 2: for each letter that exists in the pattern string, take its occurrence closest to the end of the pattern string and store its distance from the end, ready to look up. For a letter that does not occur, the jump is the whole pattern length; for convenience we store that too, with the jump length set to the length of the pattern string. From cases 3 and 4: the last letter of the pattern string must be left out of this calculation.
The code, with some test output attached:
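As a rough sketch of that preprocessing step, the table can be built in one left-to-right pass. Here I assume single-byte characters and use a 256-entry array rather than a map; the name `build_shift_table` is made up for this example.

```cpp
#include <array>
#include <string>

// One entry per possible byte value (an assumption of this sketch).
// `build_shift_table` is a hypothetical name.
std::array<int, 256> build_shift_table(const std::string& pattern) {
    const int m = static_cast<int>(pattern.size());
    std::array<int, 256> table;
    table.fill(m);                        // letters not in the pattern: jump the full length
    for (int i = 0; i + 1 < m; ++i)       // the last letter is left out
        // Later (more rightward) occurrences overwrite earlier ones,
        // so each letter ends up with its distance closest to the end.
        table[static_cast<unsigned char>(pattern[i])] = m - 1 - i;
    return table;
}
```

For the pattern Sebar this stores S = 4, e = 3, b = 2, a = 1, and 5 for every other character, including r.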
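#include <iostream>
#include <map>
#include <string>
using namespace std;

int main() {
    string m, p;  // m: matching string (text), p: pattern string
    cin >> m >> p;
    if (p.empty() || p.size() > m.size())
        return 0;  // nothing to search; also avoids unsigned underflow below

    int n = m.size(), len = p.size();

    // Jump table: letters absent from the pattern jump the full pattern length.
    map<char, int> next;
    for (int i = 0; i < 26; i++)
        next[char('A' + i)] = len;
    for (int i = 0; i < 26; i++)
        next[char('a' + i)] = len;
    // Letters in the pattern (except its last one) jump their distance to the end.
    for (int i = 0; i < len - 1; i++)
        next[p[i]] = len - i - 1;

    for (int i = 0; i <= n - len;) {
        // Print the current alignment of the pattern under the text.
        cout << m << endl;
        cout << string(i, ' ') << p << endl
             << "_________________________________" << endl;
        if (m.compare(i, len, p) == 0)
            cout << "Find matching string" << endl;
        // Jump according to the text letter under the pattern's last position.
        char last = m[i + len - 1];
        i += next.count(last) ? next[last] : len;  // non-letters: full jump
    }
}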
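For reuse outside a test program, the same loop can be packaged as a function that returns the match positions instead of printing them. This is only a sketch under my own assumptions: the name `horspool_find_all` is invented, characters are treated as single bytes, and an empty pattern yields no matches.

```cpp
#include <array>
#include <string>
#include <vector>

// Returns the start index of every occurrence of `pattern` in `text`.
// (`horspool_find_all` is a hypothetical name for this sketch.)
std::vector<int> horspool_find_all(const std::string& text, const std::string& pattern) {
    std::vector<int> hits;
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pattern.size());
    if (m == 0 || m > n) return hits;  // assumption: empty pattern matches nothing

    // Jump table: full length by default, distance to the end for pattern letters
    // (the pattern's last letter is left out).
    std::array<int, 256> shift;
    shift.fill(m);
    for (int i = 0; i + 1 < m; ++i)
        shift[static_cast<unsigned char>(pattern[i])] = m - 1 - i;

    for (int i = 0; i + m <= n; ) {
        if (text.compare(i, m, pattern) == 0)
            hits.push_back(i);
        // Jump by the entry for the text letter under the pattern's last position.
        i += shift[static_cast<unsigned char>(text[i + m - 1])];
    }
    return hits;
}
```

Calling `horspool_find_all("abababab", "abab")` returns the positions 0, 2 and 4, so overlapping occurrences are still found because every jump is at least 1.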
Horspool algorithm-string matching