The Rabin-Karp algorithm works well in practice for string matching on random input. It is based on the idea of fingerprints.
Let the text t have length n and the pattern p have length m.
Assume:
※① we can compute the fingerprint F(p) of the pattern p in O(m) time
※② if F(p) is not equal to F(t[s..s+m-1]), then p cannot be equal to t[s..s+m-1]
※③ we can compare two fingerprints in O(1) time
※④ we can compute F(t[s+1..s+m]) from F(t[s..s+m-1]) in O(1) time
The fingerprint can be viewed as a decimal number whose digits are the characters of the string. The key to the algorithm is whether F(t[s+1..s+m]) can be computed from F(t[s..s+m-1]) in O(1) time.
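As a minimal sketch of this idea, the Python snippet below treats a window of decimal digits as one integer and shifts the window by one position with a constant number of arithmetic operations. The names `fingerprint`, `slide`, and `high` are illustrative and do not come from the original text.

```python
def fingerprint(digits):
    """Read a sequence of decimal digits as one integer, in O(len(digits)) steps."""
    value = 0
    for d in digits:
        value = 10 * value + d
    return value


def slide(value, outgoing, incoming, high):
    """Turn F(t[s..s+m-1]) into F(t[s+1..s+m]).

    `high` is 10^(m-1), the weight of the leading digit; it is passed in
    so that it is computed only once, not on every shift.
    """
    return (value - outgoing * high) * 10 + incoming


t = [3, 1, 4, 1, 5, 9]
m = 3
high = 10 ** (m - 1)

f = fingerprint(t[0:m])               # window 3,1,4
f_next = slide(f, t[0], t[m], high)   # window 1,4,1
```

Here the leading digit 3 is removed from 314, the remainder is shifted one decimal place, and the new digit 1 is appended. The numbers grow with m, so each arithmetic step is only truly constant-time while the values fit in a machine word.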
If the fingerprint would become very large, it can be kept small by hashing: reduce it modulo a large prime number q.
i.e. ft = ((ft - t[s]*(10^(m-1) mod q))*10 + t[s+m]) mod q, which can be computed in O(1) time,
where 10^(m-1) mod q is computed once during preprocessing.
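The update rule can be checked with a short Python sketch that rolls the fingerprint across a digit sequence and compares each value with a direct recomputation. The prime q = 13 and the sample digits are assumptions chosen only to keep the numbers small, and the helper name `roll` is illustrative.

```python
def roll(ft, outgoing, incoming, c, q):
    """One O(1) window shift of the fingerprint modulo q.

    Python's % always returns a value in [0, q) for positive q. In C or
    Java the difference can be negative, so q must be added before the
    final reduction.
    """
    return ((ft - outgoing * c) * 10 + incoming) % q


q = 13                       # assumed small prime, for illustration only
t = [2, 3, 5, 9, 0, 2, 3, 1, 4, 1, 5, 2, 6, 7, 3, 9, 9, 2, 1]
m = 5
c = pow(10, m - 1, q)        # 10^(m-1) mod q, computed once in preprocessing

# Fingerprint of the first window, built digit by digit in O(m).
ft = 0
for d in t[:m]:
    ft = (10 * ft + d) % q

# Roll the window across the text, one O(1) update per shift.
rolled = [ft]
for s in range(len(t) - m):
    ft = roll(ft, t[s], t[s + m], c, q)
    rolled.append(ft)

# Reference values: recompute every window from scratch in O(m) each.
direct = [int("".join(map(str, t[s:s + m]))) % q
          for s in range(len(t) - m + 1)]
```

Both lists hold the same fingerprints, but `rolled` needs only one update per shift while `direct` reprocesses all m digits of every window.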
Pseudocode:
Rabin-karp-search (t, p) {
    /** q is a prime number larger than m */
    /** c is the precomputed value 10^(m-1) mod q */
    int fp = 0, ft = 0;
    for (int i = 0; i < m; i++) {
        fp = (10*fp + p[i]) % q;
        ft = (10*ft + t[i]) % q;
    }
    for (int s = 0; s <= n-m; s++) {
        if (fp == ft) {
            /** fingerprints match: compare p with t[s..s+m-1] character by character; if they are really equal, return s */
        }
        if (s < n-m) /** no next window after the last shift; adding q keeps the value non-negative */
            ft = (((ft - t[s]*c % q + q) % q) * 10 + t[s+m]) % q;
    }
    return -1; /** search failed */
}
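A runnable version of the pseudocode might look like the following Python sketch. It is one possible rendering under stated assumptions, not a reference implementation: the function name `rabin_karp_search`, the default prime 1000000007, the use of character codes via `ord`, and the handling of an empty pattern are all choices made here. The base 10 is kept from the text; with character codes larger than 9 the fingerprint is still a valid hash, only no longer a literal decimal number.

```python
def rabin_karp_search(t, p, q=1_000_000_007):
    """Return the first index s with t[s:s+m] == p, or -1 if there is none."""
    n, m = len(t), len(p)
    if m == 0:
        return 0             # assumption: the empty pattern matches at index 0
    if m > n:
        return -1

    c = pow(10, m - 1, q)    # 10^(m-1) mod q, computed once

    # Fingerprints of the pattern and of the first text window, O(m).
    fp = ft = 0
    for i in range(m):
        fp = (10 * fp + ord(p[i])) % q
        ft = (10 * ft + ord(t[i])) % q

    for s in range(n - m + 1):
        # Equal fingerprints only suggest a match; confirm it directly.
        if fp == ft and t[s:s + m] == p:
            return s
        if s < n - m:        # no next window after the last shift
            ft = ((ft - ord(t[s]) * c) * 10 + ord(t[s + m])) % q
    return -1                # search failed


index = rabin_karp_search("2359023141526739921", "31415")
index_small_q = rabin_karp_search("2359023141526739921", "31415", q=13)
```

With a small prime such as q = 13, unrelated windows often share a fingerprint, which is why the direct comparison after a fingerprint match is needed; the result is the same for either choice of q.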