Generic KMP Algorithm

Source: Internet
Author: User

When we need to find a pattern string within a main string, the KMP algorithm can greatly improve efficiency. KMP is an efficient string-matching algorithm that avoids backtracking the main-string pointer during matching. For more information about the KMP algorithm, see here.

The original KMP algorithm is used for string matching. In fact, a string is just an array of characters, so KMP can search for a subsequence in an array of any element type. For example, we may need to find a specific byte sequence in a byte[], and KMP can improve matching performance there too. To this end, I implemented a generic KMP algorithm that can be applied to arrays of any type. The complete implementation follows.

using System;

/// <summary>
/// Generic KMP algorithm.
/// Zhuweisky 2013.06.06
/// </summary>
public static class GenericKMP
{
    /// <summary>
    /// Next function.
    /// </summary>
    /// <param name="pattern">Pattern string; must contain at least one element.</param>
    /// <returns>Backtracking function.</returns>
    public static int[] Next<T>(T[] pattern) where T : IEquatable<T>
    {
        int[] nextFunction = new int[pattern.Length];
        nextFunction[0] = -1;
        if (pattern.Length < 2)
        {
            return nextFunction;
        }

        nextFunction[1] = 0;
        int computingIndex = 2;
        int tempIndex = 0;
        while (computingIndex < pattern.Length)
        {
            if (pattern[computingIndex - 1].Equals(pattern[tempIndex]))
            {
                nextFunction[computingIndex++] = ++tempIndex;
            }
            else
            {
                // Fall back to a shorter prefix; -1 means no prefix is left to try.
                tempIndex = nextFunction[tempIndex];
                if (tempIndex == -1)
                {
                    nextFunction[computingIndex++] = ++tempIndex;
                }
            }
        }
        return nextFunction;
    }

    /// <summary>
    /// KMP search.
    /// </summary>
    /// <param name="source">Main string.</param>
    /// <param name="pattern">Pattern string.</param>
    /// <returns>Index of the first element of the match; -1 indicates no match.</returns>
    public static int ExecuteKMP<T>(T[] source, T[] pattern) where T : IEquatable<T>
    {
        int[] next = Next(pattern);
        return ExecuteKMP(source, 0, source.Length, pattern, next);
    }

    /// <summary>
    /// KMP search.
    /// </summary>
    /// <param name="source">Main string.</param>
    /// <param name="sourceOffset">Start offset in the main string.</param>
    /// <param name="sourceCount">Number of main-string elements to search.</param>
    /// <param name="pattern">Pattern string.</param>
    /// <returns>Index of the first element of the match; -1 indicates no match.</returns>
    public static int ExecuteKMP<T>(T[] source, int sourceOffset, int sourceCount, T[] pattern) where T : IEquatable<T>
    {
        int[] next = Next(pattern);
        return ExecuteKMP(source, sourceOffset, sourceCount, pattern, next);
    }

    /// <summary>
    /// KMP search.
    /// </summary>
    /// <param name="source">Main string.</param>
    /// <param name="pattern">Pattern string.</param>
    /// <param name="next">Backtracking function.</param>
    /// <returns>Index of the first element of the match; -1 indicates no match.</returns>
    public static int ExecuteKMP<T>(T[] source, T[] pattern, int[] next) where T : IEquatable<T>
    {
        return ExecuteKMP(source, 0, source.Length, pattern, next);
    }

    /// <summary>
    /// KMP search.
    /// </summary>
    /// <param name="source">Main string.</param>
    /// <param name="sourceOffset">Start offset in the main string.</param>
    /// <param name="sourceCount">Number of main-string elements to search.</param>
    /// <param name="pattern">Pattern string.</param>
    /// <param name="next">Backtracking function.</param>
    /// <returns>Index of the first element of the match; -1 indicates no match.</returns>
    public static int ExecuteKMP<T>(T[] source, int sourceOffset, int sourceCount, T[] pattern, int[] next) where T : IEquatable<T>
    {
        int sourceIndex = sourceOffset;
        int patternIndex = 0;
        while (patternIndex < pattern.Length && sourceIndex < sourceOffset + sourceCount)
        {
            if (source[sourceIndex].Equals(pattern[patternIndex]))
            {
                sourceIndex++;
                patternIndex++;
            }
            else
            {
                // sourceIndex never moves backwards; only the pattern position is reset.
                patternIndex = next[patternIndex];
                if (patternIndex == -1)
                {
                    sourceIndex++;
                    patternIndex++;
                }
            }
        }
        return patternIndex < pattern.Length ? -1 : sourceIndex - patternIndex;
    }
}
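
To make the backtracking function concrete, here is a small console sketch. It assumes the GenericKMP class above is compiled into the same project; the KmpDemo class, the sample pattern, and the sample bytes are made up for illustration. It prints the table for a short pattern and then runs the byte[] search mentioned earlier.

```csharp
using System;

public static class KmpDemo
{
    public static void Main()
    {
        // Entry i is the pattern index to resume from after a mismatch at
        // pattern[i]; -1 means "advance the source index instead".
        int[] next = GenericKMP.Next("ABABC".ToCharArray());
        Console.WriteLine(string.Join(", ", next)); // -1, 0, 0, 1, 2

        // The same code searches a byte[] without converting it to text.
        byte[] data = { 0x10, 0xAB, 0xAB, 0xCD, 0xEF, 0x00 };
        byte[] marker = { 0xAB, 0xCD };
        Console.WriteLine(GenericKMP.ExecuteKMP(data, marker)); // 2
    }
}
```

In the byte search, the mismatch at data[2] resets only the pattern position (to next[1] = 0); the source index stays at 2, which is where the match is then found.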


Note:

1) Elements must be compared for equality, so the generic type T must implement the IEquatable<T> interface.

2) The Next function is public so that callers can compute the backtracking function once, cache it, and reuse it, because we often search for the same pattern string in many different main strings.

3) To apply GenericKMP to string matching, first convert the strings to character arrays and then call GenericKMP, as shown below:

string source = "..............";
string pattern = "*****";
int index = GenericKMP.ExecuteKMP<char>(source.ToCharArray(), pattern.ToCharArray());
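
Notes 1 and 2 can be combined in one sketch: a custom element type that implements IEquatable<T>, and a backtracking function computed once and reused across several main arrays. It assumes the GenericKMP class above is available in the same project; the Point struct, the CachedSearchDemo class, and the sample data are hypothetical.

```csharp
using System;

// Hypothetical element type; any type works as long as it implements IEquatable<T>.
public readonly struct Point : IEquatable<Point>
{
    public Point(int x, int y) { X = x; Y = y; }
    public int X { get; }
    public int Y { get; }

    public bool Equals(Point other) => X == other.X && Y == other.Y;
    public override bool Equals(object obj) => obj is Point p && Equals(p);
    public override int GetHashCode() => (X * 397) ^ Y;
}

public static class CachedSearchDemo
{
    public static void Main()
    {
        Point[] pattern = { new Point(1, 1), new Point(2, 2) };

        // Build the backtracking function once...
        int[] next = GenericKMP.Next(pattern);

        Point[][] paths =
        {
            new[] { new Point(0, 0), new Point(1, 1), new Point(2, 2) },
            new[] { new Point(1, 1), new Point(3, 3) },
        };

        // ...and pass it to every search for the same pattern.
        foreach (Point[] path in paths)
        {
            Console.WriteLine(GenericKMP.ExecuteKMP(path, pattern, next));
        }
        // prints 1, then -1
    }
}
```

Passing the cached table skips the Next computation on each call, which is most noticeable when the pattern is long or the searches are frequent.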

