Classic KMP algorithm: C++ and Java implementation code

Source: Internet
Author: User

Introduction:

The KMP algorithm is a string matching algorithm named after Knuth, Morris, and Pratt, who discovered it. Its key idea is to use the information gained from a failed match to reduce the number of comparisons between the pattern string and the main (text) string, and so match quickly. The common approach is to implement a next() function that encodes the partial-match information of the pattern string. Because the next() function is not easy to understand, this article takes a different route: it still trades space for time, but uses a different implementation that will hopefully be easier for the reader to follow.

Test data

Aseeesatba   esatas330kdwejjl_8   jjl_faw4etoesting TIOAABACB Abac

Test results

49-10

(Note: on a successful match, the start index of the matching substring in the text is returned; otherwise -1 is returned.)

1. Brute-force search, implementation one

// Brute-force substring search, version one: O(M*N)
private static int search0(String text, String pat) {
    int i, j, N = text.length(), M = pat.length();
    // Try every possible start position i in the text
    for (i = 0; i <= N - M; i++) {
        // Compare the pattern against text[i .. i+M-1]
        for (j = 0; j < M; j++) {
            if (text.charAt(i + j) != pat.charAt(j))
                break;
        }
        if (M == j)
            return i;   // the whole pattern matched
    }
    return -1;          // no match
}

The function takes the text and the pattern string pat. In the loops, i marks the start of the candidate substring in the text, and i+j marks the text character currently being compared. If text contains a substring matching pat, the function returns the start index of that substring; otherwise it returns -1. Time complexity: O(M*N).
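As a quick check of the first implementation, the sketch below wraps the same routine in a hypothetical class named BruteForceDemo and runs it on a few inputs. The class name and the sample strings are illustrative assumptions, not part of the original article.

```java
class BruteForceDemo {
    // Same logic as search0 above: try every start position i in the text.
    static int search0(String text, String pat) {
        int i, j, N = text.length(), M = pat.length();
        for (i = 0; i <= N - M; i++) {
            for (j = 0; j < M; j++) {
                if (text.charAt(i + j) != pat.charAt(j))
                    break;
            }
            if (j == M)
                return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Illustrative inputs (assumed for this sketch)
        System.out.println(search0("AABAABACB", "AABACB")); // prints 3
        System.out.println(search0("AABAAB", "AABACB"));    // prints -1
        System.out.println(search0("ABC", "ABCD"));         // prints -1 (pattern longer than text)
    }
}
```

When the pattern is longer than the text, N - M is negative, so the outer loop never runs and the function returns -1.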

2. Brute-force search, implementation two

// Brute-force substring search, version two: O(M*N)
public static int search(String text, String pat) {
    int i, j;
    int N = text.length(), M = pat.length();
    for (i = 0, j = 0; i < N && j < M; i++) {
        if (text.charAt(i) == pat.charAt(j))
            j++;        // characters match: advance in the pattern
        else {
            i -= j;     // mismatch: back up i to the start of this attempt
            j = 0;      // and restart the pattern
        }
    }
    return (j == M) ? (i - M) : -1;
}

This is the same brute-force search, except that it uses a single loop and explicitly backs up the text index i (i -= j) on every mismatch. If text contains a substring matching pat, the function returns the start index of that substring; otherwise it returns -1. Time complexity: O(M*N).
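To make the cost of backing up i visible, the following sketch adds a comparison counter to the second implementation. The class name BacktrackDemo, the counter field, and the sample input are assumptions made for illustration only.

```java
class BacktrackDemo {
    static int comparisons;  // character comparisons made by the last call (added for this sketch)

    static int search(String text, String pat) {
        int i, j;
        int N = text.length(), M = pat.length();
        comparisons = 0;
        for (i = 0, j = 0; i < N && j < M; i++) {
            comparisons++;
            if (text.charAt(i) == pat.charAt(j))
                j++;
            else {
                i -= j;   // back up the text index to the start of this attempt
                j = 0;
            }
        }
        return (j == M) ? (i - M) : -1;
    }

    public static void main(String[] args) {
        int index = search("AAAAAAAAAB", "AAAB");
        // prints: index = 6, comparisons = 28
        System.out.println("index = " + index + ", comparisons = " + comparisons);
    }
}
```

The text here has only 10 characters, yet 28 comparisons are made, because each of the 7 attempted start positions re-reads up to 4 text characters. This repeated re-reading is what the KMP algorithm avoids.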

3. KMP algorithm (trading space for time)

To improve the algorithm's time complexity, we store some precomputed information in an additional array, dfa[][].

We can take inspiration from the second brute-force implementation above: by recording the right value of j, we can ensure that i only ever moves to the right and never has to back up. Here, dfa[c][j] is indexed by a character c and a pattern position j. When the current text character charAt(i) is read at pattern position j, dfa[text.charAt(i)][j] gives the pattern position (a value from 0 to j+1) against which the next text character charAt(i+1) should be compared.

Here we use a deterministic finite automaton (DFA) to fill in the values of dfa[][]. Taking the pattern string "AABACB" as an example, the DFA state diagram for pat is as follows:

[Figure: DFA state diagram for the pattern "AABACB"]
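The transitions for this pattern can also be listed as a table. The sketch below builds dfa[][] for "AABACB" and prints the rows for the characters A, B, and C. It uses the standard construction from Algorithms (4th edition), with j starting at 1. The class name DfaTableDemo, the helper row(), and the alphabet size of 256 are assumptions made for this illustration.

```java
class DfaTableDemo {
    // Build the KMP transition table for pat over an alphabet of size R.
    static int[][] buildDfa(String pat, int R) {
        int M = pat.length();
        int[][] dfa = new int[R][M];
        dfa[pat.charAt(0)][0] = 1;
        for (int x = 0, j = 1; j < M; j++) {
            for (int c = 0; c < R; c++)
                dfa[c][j] = dfa[c][x];        // mismatch: behave like restart state x
            dfa[pat.charAt(j)][j] = j + 1;    // match: advance to the next state
            x = dfa[pat.charAt(j)][x];        // update the restart state
        }
        return dfa;
    }

    // Format one row of the table: the next state for character c in each state j.
    static String row(int[][] dfa, char c) {
        StringBuilder sb = new StringBuilder().append(c).append(':');
        for (int j = 0; j < dfa[c].length; j++)
            sb.append(' ').append(dfa[c][j]);
        return sb.toString();
    }

    public static void main(String[] args) {
        int[][] dfa = buildDfa("AABACB", 256);  // 256 is an assumed alphabet size
        for (char c : new char[] {'A', 'B', 'C'})
            System.out.println(row(dfa, c));
        // prints:
        // A: 1 2 2 4 2 1
        // B: 0 0 3 0 0 6
        // C: 0 0 0 0 5 0
    }
}
```

Each column is a state j (0 to 5). For example, in state 4 (after matching "AABA"), reading C advances to state 5, while reading A falls back to state 2, because the text then ends in "AA", which is itself a prefix of the pattern.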

The corresponding code is as follows:

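// Build dfa[][]
dfa[pat.charAt(0)][0] = 1;
// j starts at 1: column 0 is already complete, and x must lag one step behind j
for (int x = 0, j = 1; j < M; j++) {
    for (int c = 0; c < R; c++) {
        dfa[c][j] = dfa[c][x];       // mismatch: copy the transitions of restart state x
    }
    dfa[pat.charAt(j)][j] = j + 1;   // match: advance to the next state
    x = dfa[pat.charAt(j)][x];       // update the restart state
}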

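Here x is the restart state: the state the DFA would be in after reading pat.charAt(1) through pat.charAt(j-1), which is the state to fall back to on a mismatch at position j. The time complexity of constructing dfa[][] is O(M*R), where M is the pattern length and R is the alphabet size.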
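Once the table is built, the search is a single left-to-right pass with one table lookup per text character. The sketch below records the state after each character so the fallback behavior can be seen directly. The class name KmpTraceDemo, the method states(), the alphabet size of 256, and the sample text are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class KmpTraceDemo {
    static final int R = 256;  // assumed alphabet size (extended ASCII)

    static int[][] buildDfa(String pat) {
        int M = pat.length();
        int[][] dfa = new int[R][M];
        dfa[pat.charAt(0)][0] = 1;
        for (int x = 0, j = 1; j < M; j++) {
            for (int c = 0; c < R; c++)
                dfa[c][j] = dfa[c][x];
            dfa[pat.charAt(j)][j] = j + 1;
            x = dfa[pat.charAt(j)][x];
        }
        return dfa;
    }

    // Returns the DFA state reached after each text character that is read.
    static List<Integer> states(String text, String pat) {
        int[][] dfa = buildDfa(pat);
        List<Integer> visited = new ArrayList<>();
        int M = pat.length();
        for (int i = 0, j = 0; i < text.length() && j < M; i++) {
            j = dfa[text.charAt(i)][j];   // i only moves right; j is looked up, never recomputed
            visited.add(j);
        }
        return visited;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3, 4, 2, 3, 4, 5, 6]
        System.out.println(states("AABAABACB", "AABACB"));
    }
}
```

At i = 4 the text has A where the pattern expects C, so the state drops from 4 to 2 instead of restarting at 0, and i keeps moving forward. The final state 6 equals the pattern length, so the match ends at i = 8 and starts at index 3.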

------------------------------------------------

Complete Java code

package ch05.string.substring;

import java.io.File;
import java.util.Scanner;

public class KMP {

    private int R = 256;    // alphabet size: extended ASCII, char values 0..255
    private String pat;
    private int[][] dfa;

    public KMP(String pat) {
        this.pat = pat;
        int M = pat.length();
        dfa = new int[R][M];

        // Build dfa[][]
        dfa[pat.charAt(0)][0] = 1;
        for (int x = 0, j = 1; j < M; j++) {
            for (int c = 0; c < R; c++) {
                dfa[c][j] = dfa[c][x];       // mismatch: copy from restart state x
            }
            dfa[pat.charAt(j)][j] = j + 1;   // match: advance to the next state
            x = dfa[pat.charAt(j)][x];       // update the restart state
        }
    }

    public int search(String text) {
        int i, j;
        int N = text.length(), M = pat.length();
        for (i = 0, j = 0; i < N && j < M; i++) {
            j = dfa[text.charAt(i)][j];
        }
        return j == M ? (i - M) : -1;
    }

    public static void main(String[] args) throws Exception {
        // Read data from a file
        Scanner input = new Scanner(new File("datain.txt"));
        while (input.hasNext()) {
            String text = input.next();
            KMP kmp = new KMP(input.next());
            int ans = kmp.search(text);
            // Output the answer
            System.out.println(ans);
        }
    }
}

------------------------------------------------

Complete C++ code

#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>
using namespace std;
const int maxn = 1e4 + 10;
const int R = 256;   // alphabet size; the original value was lost, 256 (extended ASCII) is assumed
int dfa[R][maxn];

string text, pat;
void init() {
    int M = pat.length();
    /** Clear column 0, which may hold a transition left over from a previous pattern */
    for (int c = 0; c < R; c++) {
        dfa[c][0] = 0;
    }
    dfa[pat[0]][0] = 1;
    for (int x = 0, j = 1; j < M; j++) {
        /** Copy dfa[][x] into dfa[][j] */
        for (int c = 0; c < R; c++) {
            dfa[c][j] = dfa[c][x];
        }
        /** On a match, continue to the right */
        dfa[pat[j]][j] = j + 1;
        x = dfa[pat[j]][x];
    }
}
int search1() {
    init();
    int i, j, N = text.length(), M = pat.length();
    for (i = 0, j = 0; i < N && j < M; i++) {
        j = dfa[text[i]][j];
    }
    return j == M ? (i - M) : -1;
}
int main() {
    freopen("datain.txt", "r", stdin);
    while (cin >> text >> pat) {
        cout << search1() << endl;
    }
    return 0;
}

References:

[1] Algorithms (4th edition), She Luyun

[2] http://baike.baidu.com/link?url=_WLufLz1lw2e4eMgU6DI8IblUkp838Qf595Nqxfg2JN3aqNED2FFe3U6J9yPmUv_zKfFqAAQJid7Gzho3ork8K
