And knowledge in array cyclic shift

Source: Internet
Author: User

String processing problems come up frequently in both ACM contests and project development, and they often involve interesting classic algorithms. I personally think that string processing skills reveal the coding ability, algorithmic foundation, and thinking ability of a graduating college student. Recently I ran into two string processing problems: (1) string shift containment and (2) string similarity calculation.

1. String shift containment problem

Problem definition: Given two strings str1 and str2, we say that str1 shift-contains str2 if str2 is a substring of some string obtained by cyclically shifting str1. What strings can shifting str1 produce? For example, shifting ABCD by one position gives BCDA; this kind of shift is a cyclic left shift. If str2 = "DA", then str1 shift-contains str2, because DA is a substring of BCDA.

Analyze the problem: We could enumerate every string str_tmp produced by cyclically shifting str1 and check whether str2 is a substring of str_tmp. This works, but it is not efficient: the total cost includes both the cost of the cyclic shifts and the cost of each substring test (for details about the cyclic shift algorithm, refer to: Knowledge in array cyclic shift). This is obviously not the solution we want. In fact, no matter how many positions str1 is shifted, the resulting str_tmp is always a substring of two copies of str1 joined together. For example, cyclically shifting ABCD yields BCDA, CDAB, DABC, and ABCD, and all four are substrings of str1str1 = "ABCDABCD". So we can skip the cyclic shifts entirely and reduce the time complexity of the whole algorithm.
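The doubling idea above can be sketched in a few lines of C++. This is only a minimal illustration: the function name shiftContains is made up for this example, and the length check is an added assumption (a shifted string is only as long as str1, so a longer str2 can never be contained in it).

```cpp
#include <string>

// Hypothetical helper: returns true if some cyclic shift of str1
// contains str2 as a substring.
bool shiftContains(const std::string& str1, const std::string& str2) {
    // A shifted string has the same length as str1, so a longer str2
    // cannot be a substring of it (assumption added for this sketch).
    if (str2.size() > str1.size()) {
        return false;
    }
    // Every cyclic shift of str1 is a substring of str1 + str1,
    // so a single substring search replaces all the shifts.
    const std::string doubled = str1 + str1;
    return doubled.find(str2) != std::string::npos;
}
```

For instance, shiftContains("ABCD", "DA") is true, while shiftContains("ABCD", "AC") is false.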

Now we can focus on deciding whether str2 is a substring of str1str1. In fact, the header file <string.h> already provides the function strstr(str1, str2), which does exactly what we need. A simple implementation of it looks like this:


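    #include <string.h>

    /* A simple strstr-style search, named simple_strstr here so that it
       does not clash with the library function. */
    char *simple_strstr(const char *s1, const char *s2)
    {
        size_t len2 = strlen(s2);

        if (len2 == 0)              /* an empty pattern matches at the start */
            return (char *)s1;
        for (; *s1; ++s1)
        {
            /* check the first character before paying for strncmp */
            if (*s1 == *s2 && strncmp(s1, s2, len2) == 0)
                return (char *)s1;
        }
        return NULL;                /* no match found */
    }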

As you can see, the method used by this function is not very efficient: in the worst case its running time is proportional to the product of the two string lengths. If you need to test a large number of str2 candidates, or if str1 and str2 are long, this algorithm performs poorly and we need to think about other algorithms. A natural choice is the fast pattern matching algorithm (KMP), which is introduced in almost every data structures book. For details, refer to Liu Da's data structure book (2nd edition) or another article of mine: Fast pattern matching algorithm (KMP). I believe that writing the complete code from these materials is not a problem, so I will not write it here.
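For readers who want something concrete, here is one possible sketch of KMP in C++. It is not taken from the referenced article; the names buildFailure and kmpSearch are invented for this example, and the function returns an index (or -1) rather than a pointer as strstr does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// fail[i] = length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
std::vector<int> buildFailure(const std::string& pattern) {
    std::vector<int> fail(pattern.size(), 0);
    int k = 0;  // length of the currently matched prefix
    for (std::size_t i = 1; i < pattern.size(); ++i) {
        while (k > 0 && pattern[i] != pattern[k]) {
            k = fail[k - 1];  // fall back to a shorter border
        }
        if (pattern[i] == pattern[k]) {
            ++k;
        }
        fail[i] = k;
    }
    return fail;
}

// Returns the index of the first occurrence of pattern in text, or -1.
int kmpSearch(const std::string& text, const std::string& pattern) {
    if (pattern.empty()) {
        return 0;  // same convention as strstr: empty pattern matches at 0
    }
    const std::vector<int> fail = buildFailure(pattern);
    int k = 0;  // number of pattern characters matched so far
    for (std::size_t i = 0; i < text.size(); ++i) {
        while (k > 0 && text[i] != pattern[k]) {
            k = fail[k - 1];  // reuse the partial match, never re-read text
        }
        if (text[i] == pattern[k]) {
            ++k;
        }
        if (k == static_cast<int>(pattern.size())) {
            return static_cast<int>(i) - k + 1;  // start of the match
        }
    }
    return -1;
}
```

Because the text index i never moves backwards, the search runs in time proportional to the sum of the two lengths. For the shift problem, kmpSearch("ABCDABCD", "DA") returns 3.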

However, you may notice that actually concatenating str1str1 takes O(n) auxiliary space, where n is the length of str1. We do not have to perform the concatenation for real; it is only an idea. In practice, when we need to access str1str1[n + 3], we can simply access str1[3] instead. In this way, the O(n) auxiliary space is not needed.
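A minimal sketch of this index trick in C++ follows. The name shiftContainsInPlace is hypothetical, and the sketch uses plain character-by-character matching to keep the focus on the indexing; the same index mapping could be plugged into a KMP search instead.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: checks shift containment without building str1 + str1.
bool shiftContainsInPlace(const std::string& str1, const std::string& str2) {
    const std::size_t n = str1.size();
    const std::size_t m = str2.size();
    if (m == 0) {
        return true;   // the empty string is contained in anything
    }
    if (m > n) {
        return false;  // a shifted string is only n characters long
    }
    for (std::size_t start = 0; start < n; ++start) {
        std::size_t j = 0;
        // Position start + j of the virtual string str1str1
        // is the same character as str1[(start + j) % n].
        while (j < m && str1[(start + j) % n] == str2[j]) {
            ++j;
        }
        if (j == m) {
            return true;
        }
    }
    return false;
}
```

Only start positions 0 to n - 1 need to be tried, since a match that starts in the second copy also starts at the same offset in the first copy.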

 

2. Calculating string similarity

Problem definition: There are two strings str1 and str2. To make the two strings identical, you can perform the following operations: 1. modify a character; 2. add a character; 3. delete a character. For example, if str1 = ABC and str2 = AB, you can add a C to str2 or delete the C from str1 to make them the same. Likewise, if str1 = ABC and str2 = ABD, you can modify C -> D or D -> C to make the two strings the same. In both of these simple examples only one operation is needed, so the distance is 1. The similarity is the reciprocal of (distance + 1), that is, similarity = 1 / (distance + 1). So how can we calculate the similarity of two given strings?

Analyze the problem: If the first characters of str1 and str2 are the same, we only need to compare str1[1 ~ len1-1] and str2[1 ~ len2-1]; the distance is not affected. If the first characters of str1 and str2 are different, one of the following six operations is involved:

1. Delete the first character of str1, then continue with the distance between str1[1 ~ len1-1] and str2[0 ~ len2-1].

2. Delete the first character of str2, then continue with the distance between str1[0 ~ len1-1] and str2[1 ~ len2-1].

3. Modify the first character of str1 to equal the first character of str2, then continue with the distance between str1[1 ~ len1-1] and str2[1 ~ len2-1].

4. Modify the first character of str2 to equal the first character of str1, then continue with the distance between str1[1 ~ len1-1] and str2[1 ~ len2-1].

5. Add a character at the beginning of str1 that equals the first character of str2, then continue with the distance between str1[0 ~ len1-1] and str2[1 ~ len2-1].

6. Add a character at the beginning of str2 that equals the first character of str1, then continue with the distance between str1[1 ~ len1-1] and str2[0 ~ len2-1].

No matter which of the operations above is used, only three cases remain for further calculation: (1) the distance D1 between str1[1 ~ len1-1] and str2[1 ~ len2-1]; (2) the distance D2 between str1[1 ~ len1-1] and str2[0 ~ len2-1]; (3) the distance D3 between str1[0 ~ len1-1] and str2[1 ~ len2-1]. The distance between str1[0 ~ len1-1] and str2[0 ~ len2-1] is then min(D1, D2, D3) + 1, where the 1 accounts for the operation that was just performed. Obviously, this leads to a recursive program.

Code:


    #include <iostream>
    #include <string>

    using namespace std;

    // Returns the smallest of three values.
    int minvalue(int d1, int d2, int d3)
    {
        int min = d1;
        if (d2 < min)
        {
            min = d2;
        }
        if (d3 < min)
        {
            min = d3;
        }
        return min;
    }

    // Distance between stra[pabegin ~ paend] and strb[pbbegin ~ pbend] (inclusive indices).
    int calstringdistance(const string &stra, int pabegin, int paend,
                          const string &strb, int pbbegin, int pbend)
    {
        if (pabegin > paend)
        {
            // stra is used up: the rest of strb must be added (0 if strb is used up too)
            if (pbbegin > pbend)
            {
                return 0;
            }
            else
            {
                return pbend - pbbegin + 1;
            }
        }
        if (pbbegin > pbend)
        {
            // strb is used up but stra is not: the rest of stra must be deleted
            return paend - pabegin + 1;
        }
        if (stra[pabegin] == strb[pbbegin]) // the current letters are the same, so the distance does not increase
        {
            return calstringdistance(stra, pabegin + 1, paend, strb, pbbegin + 1, pbend);
        }
        else
        {
            // d1, d2, and d3 store the distances obtained from the three recursive cases
            int d1 = calstringdistance(stra, pabegin + 1, paend, strb, pbbegin, pbend);
            int d2 = calstringdistance(stra, pabegin, paend, strb, pbbegin + 1, pbend);
            int d3 = calstringdistance(stra, pabegin + 1, paend, strb, pbbegin + 1, pbend);
            return minvalue(d1, d2, d3) + 1;
        }
    }

    int main()
    {
        string stra = "abe";
        string strb = "abcd";
        // the end indices cover each string in full
        cout << calstringdistance(stra, 0, static_cast<int>(stra.size()) - 1,
                                  strb, 0, static_cast<int>(strb.size()) - 1) << endl;
        return 0;
    }

Summary: The recursive process actually performs a lot of repeated calculations. For example, to evaluate calstringdistance(stra, 1, 2, strb, 1, 2) when the current characters differ, we must evaluate calstringdistance(stra, 2, 2, strb, 1, 2), calstringdistance(stra, 1, 2, strb, 2, 2), and calstringdistance(stra, 2, 2, strb, 2, 2). But evaluating calstringdistance(stra, 2, 2, strb, 1, 2) may in turn call calstringdistance(stra, 2, 2, strb, 2, 2) again. As stra and strb get longer, the amount of repetition grows rapidly, so the algorithm becomes very slow. How can we avoid this kind of repeated computation?

1. Use memoization. We can use a two-dimensional array int flag[][] to record whether calstringdistance(stra, pabegin, paend, strb, pbbegin, pbend) has already been computed for the current arguments. First, initialize every element of the array to -1. The first time calstringdistance(stra, 2, 2, strb, 2, 2) is called, store the computed result in flag[2][2], where the first index is pabegin and the second index is pbbegin. The next time calstringdistance(stra, 2, 2, strb, 2, 2) is called, we find that flag[2][2] is not equal to -1, so we know the value has already been computed and can read the result directly from flag[2][2]. The modified code is very similar to the original. This method does introduce O(len1 * len2) extra space, but as stra and strb get longer, efficiency suffers badly without it.
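A possible memoized version is sketched below. It is a simplified variant rather than the exact modification described: the names distanceMemo and calStringDistanceMemo are invented here, the end indices are dropped because they never change during the recursion, and a std::vector plays the role of the flag array.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// memo[i][j] caches the distance between a[i..] and b[j..]; -1 means "not computed yet".
int distanceMemo(const std::string& a, std::size_t i,
                 const std::string& b, std::size_t j,
                 std::vector<std::vector<int>>& memo) {
    if (i == a.size()) {
        return static_cast<int>(b.size() - j);  // add the rest of b
    }
    if (j == b.size()) {
        return static_cast<int>(a.size() - i);  // delete the rest of a
    }
    int& cached = memo[i][j];
    if (cached != -1) {
        return cached;  // already computed, no recursion needed
    }
    if (a[i] == b[j]) {
        cached = distanceMemo(a, i + 1, b, j + 1, memo);
    } else {
        const int d1 = distanceMemo(a, i + 1, b, j, memo);
        const int d2 = distanceMemo(a, i, b, j + 1, memo);
        const int d3 = distanceMemo(a, i + 1, b, j + 1, memo);
        cached = std::min({d1, d2, d3}) + 1;
    }
    return cached;
}

// Convenience wrapper that allocates the table and starts at (0, 0).
int calStringDistanceMemo(const std::string& a, const std::string& b) {
    std::vector<std::vector<int>> memo(a.size(), std::vector<int>(b.size(), -1));
    return distanceMemo(a, 0, b, 0, memo);
}
```

Each pair (i, j) is computed at most once, so the number of recursive evaluations is bounded by the size of the table instead of growing exponentially.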

2. Use an iterative (bottom-up) method. Since we know the recurrence, we can fill in a table directly instead of recursing. The subscripts of the two-dimensional array cal again correspond to pabegin and pbbegin. First, initialize cal[2][2], cal[1][2], and cal[2][1]. Then calstringdistance(stra, 1, 2, strb, 1, 2), that is, cal[1][1], can be obtained from cal[2][2], cal[1][2], and cal[2][1]. Repeating this step produces the final result. This method avoids the overhead inherent in recursive programs.
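The bottom-up idea might look like the following sketch. The names calStringDistanceTable and similarity are made up for this example, and the table has one extra row and column (an assumption of this sketch) so that the "one string is used up" cases live in the table too. The similarity helper applies the formula similarity = 1 / (distance + 1) from the problem definition.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// cal[i][j] = distance between a[i..n-1] and b[j..m-1], filled from the ends backwards.
int calStringDistanceTable(const std::string& a, const std::string& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    std::vector<std::vector<int>> cal(n + 1, std::vector<int>(m + 1, 0));
    // Boundary cases: one string is used up, so the rest of the other
    // string must be added or deleted character by character.
    for (std::size_t i = 0; i <= n; ++i) {
        cal[i][m] = static_cast<int>(n - i);
    }
    for (std::size_t j = 0; j <= m; ++j) {
        cal[n][j] = static_cast<int>(m - j);
    }
    // Each cell depends only on cells with larger indices,
    // so iterate from the bottom-right corner towards cal[0][0].
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = m; j-- > 0;) {
            if (a[i] == b[j]) {
                cal[i][j] = cal[i + 1][j + 1];
            } else {
                cal[i][j] = std::min({cal[i + 1][j], cal[i][j + 1], cal[i + 1][j + 1]}) + 1;
            }
        }
    }
    return cal[0][0];
}

// similarity = 1 / (distance + 1)
double similarity(const std::string& a, const std::string& b) {
    return 1.0 / (calStringDistanceTable(a, b) + 1);
}
```

For example, calStringDistanceTable("ABC", "AB") is 1, so similarity("ABC", "AB") is 0.5.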

In fact, the two methods above are typical dynamic programming ideas. Dynamic programming is really great and worth learning!

 


 
