String Matching Algorithm: Sunday Algorithm

Source: Internet
Author: User

Among string matching algorithms, the two best known are the KMP algorithm (Knuth-Morris-Pratt) and the BM algorithm (Boyer-Moore). Both have linear search time in the worst case. In practice, however, KMP is not much faster than the plain C library function strstr(), while BM is usually said to be 3-5 times faster than KMP (I have not verified this myself). BM is still not the fastest choice. This article introduces the Sunday algorithm, which is often faster than BM in practice.

The idea of the Sunday algorithm is very similar to the bad-character rule of BM. The difference is that, when a match attempt fails, Sunday does the bad-character lookup with the character of the text just past the current window, that is, the text character at index (current offset + pattern length); call it K. If K occurs in the pattern, the pattern is shifted so that its last occurrence of K lines up with K in the text, and matching restarts from the beginning of the pattern. If K does not occur in the pattern, the pattern is shifted past K so that it starts at the character after K. This is repeated until the end of the text is reached.

I wrote a small example that implements the algorithm. The code contains two matchers: Sunday, and the ordinary approach that moves one position at a time. The main function compares their running times, in nanoseconds. The detailed steps of the algorithm are given in the code comments. A comparison with the BM algorithm is left for a later post, when time permits.

    import java.util.HashMap;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Map;

    /**
     * @author Scott
     * @date December 28, 2013
     * @description
     */
    public class SundySearch {
        String text = null;
        String pattern = null;
        int currentPos = 0;

        /**
         * Start positions of the matched substrings
         */
        List<Integer> matchedPosList = new LinkedList<Integer>();

        /**
         * Map for the pattern: records each char of the pattern and its displacement
         */
        Map<Character, Integer> map = new HashMap<Character, Integer>();

        public SundySearch(String text, String pattern) {
            this.text = text;
            this.pattern = pattern;
            this.initMap();
        }

        /**
         * Used by the Sunday match: stores the last occurrence position of each
         * character in the pattern, scanning from left to right
         */
        private void initMap() {
            for (int i = 0; i < pattern.length(); i++) {
                this.map.put(pattern.charAt(i), i);
            }
        }

        /**
         * Ordinary string matching: after each attempt, move forward one position
         */
        public List<Integer> normalMatch() {
            // Start from a clean state so the two matchers do not affect each other
            currentPos = 0;
            matchedPosList = new LinkedList<Integer>();

            while ((text.length() - currentPos) >= pattern.length()) {
                if (matchFromSpecialPos(currentPos)) {
                    // Match successful, record the position
                    matchedPosList.add(currentPos);
                }
                currentPos += 1;
            }
            return matchedPosList;
        }

        /**
         * Sunday match. K is the text character just past the current window,
         * at index currentPos + pattern.length() (0-based)
         */
        public List<Integer> sundayMatch() {
            currentPos = 0;
            matchedPosList = new LinkedList<Integer>();

            while ((text.length() - currentPos) >= pattern.length()) {
                if (matchFromSpecialPos(currentPos)) {
                    // Match successful, record it and continue from the next position
                    matchedPosList.add(currentPos);
                    currentPos += 1;
                    continue;
                }

                int kPos = currentPos + pattern.length();
                if (kPos >= text.length()) {
                    // No character after the window, matching is complete
                    break;
                }

                Integer last = map.get(text.charAt(kPos));
                if (last == null) {
                    // K is not in the pattern: skip the whole window and K itself
                    currentPos += pattern.length() + 1;
                } else {
                    // K is in the pattern: align its last occurrence in the pattern with K
                    currentPos += pattern.length() - last;
                }
            }
            return matchedPosList;
        }

        /**
         * Check whether the substring of text starting at the given offset matches the pattern
         */
        public boolean matchFromSpecialPos(int pos) {
            if ((text.length() - pos) < pattern.length()) {
                return false;
            }

            for (int i = 0; i < pattern.length(); i++) {
                if (text.charAt(pos + i) == pattern.charAt(i)) {
                    if (i == (pattern.length() - 1)) {
                        return true;
                    }
                } else {
                    break;
                }
            }
            return false;
        }

        public static void main(String[] args) {
            SundySearch sundySearch = new SundySearch("hello, adfsadfklf adf234masdfsdfdsfdsfdsffwerwrewrerwerwersdf2666sdflsdfk", "adf");

            long begin = System.nanoTime();
            System.out.println("NormalMatch:" + sundySearch.normalMatch());
            System.out.println("NormalMatch:" + (System.nanoTime() - begin));

            begin = System.nanoTime();
            System.out.println("SundayMatch:" + sundySearch.sundayMatch());
            System.out.println("SundayMatch:" + (System.nanoTime() - begin));
        }
    }

Running result, as reported for the original run (timings are in nanoseconds and vary from run to run):

    NormalMatch: [13, 17, 24]
    NormalMatch: 313423
    SundayMatch: [13, 17, 24]
    SundayMatch: 36251

With the text literal exactly as printed above, the matches fall at [7, 11, 18]. The reported indices are 6 higher, so the literal used in the original run was presumably a few characters longer. A single nanoTime measurement like this also includes JIT warm-up, so the timings are only a rough indication.
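The shift rule described above can also be written as a short loop. The following is a minimal sketch, not the author's implementation: the class name SundaySketch and the methods lastIndex and search are hypothetical names chosen for this example. It assumes 0-based indices, so the character K sits at index pos + m, and it applies the same shift after a successful match, which still finds overlapping occurrences.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the original listing.
final class SundaySketch {

    // Last index of each pattern character (later occurrences overwrite earlier ones).
    static Map<Character, Integer> lastIndex(String pattern) {
        Map<Character, Integer> last = new HashMap<>();
        for (int i = 0; i < pattern.length(); i++) {
            last.put(pattern.charAt(i), i);
        }
        return last;
    }

    // Returns the start index of every match, including overlapping ones.
    static List<Integer> search(String text, String pattern) {
        List<Integer> matches = new ArrayList<>();
        int n = text.length();
        int m = pattern.length();
        if (m == 0 || m > n) {
            return matches; // assumption: an empty pattern yields no matches
        }
        Map<Character, Integer> last = lastIndex(pattern);
        int pos = 0;
        while (pos + m <= n) {
            if (text.startsWith(pattern, pos)) {
                matches.add(pos);
            }
            if (pos + m >= n) {
                break; // there is no character K after the window
            }
            // K is the text character just past the current window.
            Integer idx = last.get(text.charAt(pos + m));
            // K absent: jump past K. K present: align its last occurrence with K.
            pos += (idx == null) ? m + 1 : m - idx;
        }
        return matches;
    }

    public static void main(String[] args) {
        // Windows tried: 0 ("substr"), 7 ("ng sea"), 10 ("search")
        System.out.println(search("substring searching", "search")); // prints [10]
    }
}
```

In this trace the first mismatch looks at 'i' (index 6), which is not in the pattern, so the window jumps 7 positions. The second looks at 'r' (index 13), whose last index in "search" is 3, so the window moves 6 - 3 = 3 positions and lands on the match.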
