Longest Palindromic Substring, Part II

Source: Internet
Author: User

[Translated and adapted] Longest Palindromic Substring, Part II

Problem: Given a string S, find the longest palindromic substring of S.

In the previous article we gave 4 algorithms, including one that runs in O(n^2) time and O(1) space (the center detection method), which is already quite good. This article discusses an O(n) time, O(n) space algorithm known as Manacher's algorithm, and explains in detail why its time complexity is O(n).

Hint

First, think about how to improve the center detection method.

Consider the worst case. ★

The worst case is when palindromes overlap one another, as in "aaaaaaaaaa" and "cabcbabcbabcba".

Why is overlap the worst case? Because it causes repeated computation. ★ (In other words, when there is no overlap, every character has to be examined one at a time anyway, so there is no room for improvement.)

Spend some space to avoid the repeated computation. ★

Use the properties of palindromes to avoid the repeated computation. ★
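As a baseline for these hints, the center detection method can be sketched roughly as follows. This is a minimal Java sketch, not the code from the previous article; the class name CenterExpansionSketch and the helper expand are made up for illustration.

```java
final class CenterExpansionSketch {
    // Grow a palindrome outward from positions lo and hi and return its length in s.
    static int expand(String s, int lo, int hi) {
        while (lo >= 0 && hi < s.length() && s.charAt(lo) == s.charAt(hi)) {
            lo--;
            hi++;
        }
        return hi - lo - 1;
    }

    // O(n^2) time, O(1) extra space: try every character and every gap as a center.
    static String longestPalindrome(String s) {
        int start = 0, best = 0;
        for (int i = 0; i < s.length(); i++) {
            int odd = expand(s, i, i);       // center on character i
            int even = expand(s, i, i + 1);  // center between i and i + 1
            int len = Math.max(odd, even);
            if (len > best) {
                best = len;
                start = i - (len - 1) / 2;
            }
        }
        return s.substring(start, start + best);
    }

    public static void main(String[] args) {
        System.out.println(longestPalindrome("abaaba"));
    }
}
```

On an input such as "aaaaaaaaaa" every call to expand walks over characters that earlier calls have already compared, which is exactly the repeated computation the hints point at.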

An O(n) algorithm (Manacher)

First we transform the string S into T by inserting a "#" between every pair of adjacent characters of S and at both ends of S. You will soon see why.

For example, if S = "abaaba", then T = "#a#b#a#a#b#a#".
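The transformation can be sketched in a few lines of Java. This is only an illustration; the class name TransformSketch and the method transform are assumptions, not names from the original article.

```java
final class TransformSketch {
    // Build T by putting '#' in front of every character of s and once more at the end.
    static String transform(String s) {
        StringBuilder t = new StringBuilder(2 * s.length() + 1);
        for (int i = 0; i < s.length(); i++) {
            t.append('#').append(s.charAt(i));
        }
        return t.append('#').toString();
    }

    public static void main(String[] args) {
        System.out.println(transform("abaaba")); // prints #a#b#a#a#b#a#
    }
}
```

The length of T is always 2n + 1, so every palindrome in T has odd length and a single center position.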

Think about it: to determine the length d of the palindrome centered at T[i], you have to expand around T[i]. (In other words, this step is unavoidable.)

To improve the worst case, we store the palindrome radius of each T[i] in an array P, where P[i] is the number of characters the palindrome centered at T[i] extends to each side. This equals the length of the corresponding palindrome in S. Once all P[i] are known, the longest palindromic substring is found by taking the maximum.

For the example above, let's write out all of P directly.

i = 0 1 2 3 4 5 6 7 8 9 A B C

T = # a # b # a # a # b # a #

P = 0 1 0 3 0 1 6 1 0 3 0 1 0


Clearly the longest palindrome is "abaaba", centered at i = 6, where P[6] = 6.
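The values of P for this example can be reproduced by brute force, expanding around every position of T. The following is a minimal Java sketch under that assumption; RadiusTableSketch and radii are illustrative names, and this is the slow O(n^2) way of filling P, not Manacher's algorithm itself.

```java
import java.util.Arrays;

final class RadiusTableSketch {
    // P[i] = number of matching character pairs around center i of t.
    static int[] radii(String t) {
        int[] p = new int[t.length()];
        for (int i = 0; i < t.length(); i++) {
            int r = 0;
            while (i - r - 1 >= 0 && i + r + 1 < t.length()
                    && t.charAt(i - r - 1) == t.charAt(i + r + 1)) {
                r++;
            }
            p[i] = r;
        }
        return p;
    }

    public static void main(String[] args) {
        // prints [0, 1, 0, 3, 0, 1, 6, 1, 0, 3, 0, 1, 0]
        System.out.println(Arrays.toString(radii("#a#b#a#a#b#a#")));
    }
}
```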

Have you noticed that after inserting "#", palindromes of odd and even length are handled uniformly? That is what the "#" is for.

Now imagine drawing a vertical line through the center of "abaaba". Do you notice that the values in P are symmetric about this line? Try the center of "aba" as well: the values of P around it are symmetric too. This is no coincidence; it is a rule that must hold under certain conditions. We will use this rule to avoid recomputing some elements of P.

Let's look at a more typical example with more overlap: S = "babcbabcbaccba".

The figure shows T after the transformation of S, that is, T = "#b#a#b#c#b#a#b#c#b#a#c#c#b#a#". Suppose you have already computed part of P. The solid vertical line marks the center C of the palindrome "abcbabcba", and the two dashed vertical lines mark its left and right boundaries L and R. Your next step is to compute P[i], where i' is the mirror point of i around C. Is there an efficient way to compute P[i]?

Let's look at i = 13, whose mirror point around C = 11 is i' = 9.

Clearly P[i] = P[i'] = 1, because i and i' are symmetric around C. By the same reasoning, P[12] = P[10] = 0 and P[14] = P[8] = 0.

Now look at i = 15. Is P[15] = P[7] = 7? No. If you check character by character, you will find that P[15] should be 5.

Why does the rule change here?

As the figure shows, the ranges marked by the two solid green lines must be symmetric, and the ranges marked by the two dashed green lines must be symmetric as well. Note, however, that P[i'] = 7 extends past the left boundary L, and the part beyond L has no symmetric counterpart. At this point we only know that P[i] >= 5; whether P[i] can grow further can only be determined by comparing characters one by one.

In this example T[21] ≠ T[9], so P[i] = P[15] = 5.

Summarizing the analysis above gives the key part of the algorithm:

if P[i'] < R - i,

then P[i] ← P[i']

else P[i] ≥ R - i. (In this case we expand past R character by character to determine P[i].)

(Note: the original author's wording here was logically wrong, so I corrected it.)
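The rule can be tried on the example string directly. The sketch below is an illustration only: the class name MirrorRuleSketch and the method radiusAt are hypothetical, T is written without sentinels, and the values C = 11 and R = 20 are what the palindrome "abcbabcba" gives in that T.

```java
final class MirrorRuleSketch {
    // T for S = "babcbabcbaccba" with '#' separators and no sentinels.
    static final String T = "#b#a#b#c#b#a#b#c#b#a#c#c#b#a#";

    // Apply the rule for one index i, given the center c, the right boundary r
    // and the already known radius pMirror = P[i'] of the mirror index i' = 2*c - i.
    // Assumes c < i <= r.
    static int radiusAt(String t, int c, int r, int i, int pMirror) {
        if (pMirror < r - i) {
            return pMirror;          // mirror palindrome stays inside [L, R]: copy it
        }
        int p = r - i;               // only this much is guaranteed by symmetry
        while (i - p - 1 >= 0 && i + p + 1 < t.length()
                && t.charAt(i - p - 1) == t.charAt(i + p + 1)) {
            p++;                     // expand past R character by character
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(radiusAt(T, 11, 20, 13, 1)); // i' = 9,  P[9] = 1
        System.out.println(radiusAt(T, 11, 20, 15, 7)); // i' = 7,  P[7] = 7
    }
}
```

For i = 13 the first branch applies and the value is copied. For i = 15 the guaranteed part is R - i = 5, and the very first comparison past R, T[21] against T[9], fails, so the radius stays at 5.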

Isn't that elegant? If you understand this, you have mastered the hardest and most essential part of the algorithm.

Obviously the position of C also has to move, which is easy:

if the palindrome centered at i extends past R, set C = i and update L and R accordingly.


Each time we compute P[i] there are two possibilities. If P[i'] < R - i, we set P[i] = P[i']. Otherwise, we expand from R character by character to determine P[i], and then update C and its R. Extending R (the character-by-character expansion) takes at most n steps in total, and visiting each center C also takes n steps in total. So the running time is at most 2*n steps, i.e. O(n).

(Note: I did not understand the original author's sentence about the time complexity, so I worked out an explanation of my own.)

In the figure, i is the index, T is the string after adding "#", "^" and "$", P[i] is the value computed by the algorithm, and calc[i] is the number of comparisons the algorithm needs to find P[i].

A "V" marks the character in that column being compared with a character to its left, and an "X" marks the corresponding character on the left. Green means the two compared characters are equal (the comparison succeeds); red means they differ (the comparison fails).

Clearly the number of "X" marks equals the number of "V" marks.

As you can see, the number of successful comparisons (green "V", which grow horizontally) does not exceed N, and the number of failed comparisons (red "V", which grow vertically) does not exceed N either, so the algorithm performs at most 2N comparisons, i.e. O(N) time.
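The bound can also be checked empirically by counting comparisons. The sketch below is an assumption-laden illustration, not the code analysed in the figure: it uses the common compact form P[i] = min(P[i'], R - i), explicit bounds checks instead of sentinels, and the made-up names ComparisonCountSketch and countComparisons.

```java
final class ComparisonCountSketch {
    // Returns {successful comparisons, failed comparisons} made while computing P for t.
    static long[] countComparisons(String t) {
        int n = t.length();
        int[] p = new int[n];
        int c = 0, r = 0;
        long ok = 0, failed = 0;
        for (int i = 0; i < n; i++) {
            if (i < r) {
                p[i] = Math.min(p[2 * c - i], r - i); // reuse the mirror value
            }
            while (i - p[i] - 1 >= 0 && i + p[i] + 1 < n) {
                if (t.charAt(i - p[i] - 1) == t.charAt(i + p[i] + 1)) {
                    ok++;       // every success reaches one position past the old R
                    p[i]++;
                } else {
                    failed++;   // at most one failure per center
                    break;
                }
            }
            if (i + p[i] > r) {
                c = i;
                r = i + p[i];
            }
        }
        return new long[] {ok, failed};
    }

    public static void main(String[] args) {
        String t = "#a#a#a#a#a#a#a#a#a#a#";
        long[] counts = countComparisons(t);
        System.out.println(counts[0] + " successes, " + counts[1]
                + " failures, length " + t.length());
    }
}
```

In this variant a success always compares a character beyond the current R, so successes are bounded by the length of T, and each center stops at its first failure, so failures are bounded by the length of T as well.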

The original author's program is not easy to follow, so here is my own code.

    public class Solution {
        // Transform S into T.
        // For example, S = "abba", T = "^#a#b#b#a#$".
        // ^ and $ are sentinels added at the two ends to avoid bounds checking.
        // Note: this relies on s containing neither '^' nor '$'.
        String preProcess(String s) {
            int n = s.length();
            if (n == 0) return "^$";

            String ret = "^";
            for (int i = 0; i < n; i++) {
                ret += "#" + s.substring(i, i + 1);
            }

            ret += "#$";
            return ret;
        }

        public String longestPalindrome(String s) {
            String T = preProcess(s);
            int length = T.length();
            int[] p = new int[length];
            int C = 0, R = 0;

            for (int i = 1; i < length - 1; i++) {
                int i_mirror = C - (i - C);
                int diff = R - i;
                if (diff >= 0) {
                    // i lies between C and R, so the symmetry of the palindrome can be used
                    if (p[i_mirror] < diff) {
                        // the palindrome at the mirror of i lies inside the big palindrome of C
                        p[i] = p[i_mirror];
                    } else {
                        p[i] = diff;
                        // the palindrome at i may extend beyond the big palindrome of C
                        while (T.charAt(i + p[i] + 1) == T.charAt(i - p[i] - 1)) {
                            p[i]++;
                        }
                        C = i;
                        R = i + p[i];
                    }
                } else {
                    // i lies to the right of R: expand from scratch
                    p[i] = 0;
                    while (T.charAt(i + p[i] + 1) == T.charAt(i - p[i] - 1)) {
                        p[i]++;
                    }
                    C = i;
                    R = i + p[i];
                }
            }

            // Find the center with the largest radius.
            int maxLen = 0;
            int centerIndex = 0;
            for (int i = 1; i < length - 1; i++) {
                if (p[i] > maxLen) {
                    maxLen = p[i];
                    centerIndex = i;
                }
            }
            return s.substring((centerIndex - 1 - maxLen) / 2, (centerIndex - 1 - maxLen) / 2 + maxLen);
        }
    }
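The index arithmetic in the final return statement maps a center in T back to a start position in S. As a small worked example, assuming T has the sentinel form "^#...#$" so that S[k] sits at T[2k + 2], the mapping can be isolated like this (IndexMappingSketch and startInS are illustrative names only):

```java
final class IndexMappingSketch {
    // Start index in s of the palindrome whose center in T is centerIndex and
    // whose radius is maxLen. The palindrome in T begins with the '#' at
    // centerIndex - maxLen, so its first real character is at
    // centerIndex - maxLen + 1 = 2k + 2, which gives k = (centerIndex - 1 - maxLen) / 2.
    static int startInS(int centerIndex, int maxLen) {
        return (centerIndex - 1 - maxLen) / 2;
    }

    public static void main(String[] args) {
        // S = "abba", T = "^#a#b#b#a#$": the center is the '#' at index 5, radius 4.
        String s = "abba";
        int start = startInS(5, 4);
        System.out.println(s.substring(start, start + 4)); // prints abba
    }
}
```

The radius in T is also the length of the palindrome in S, which is why maxLen is used as the substring length.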

Note

This algorithm is non-trivial, and no interviewer will expect you to come up with something this formidable during an interview. Still, if you have read this far and understood it, you deserve a big reward!

Going further

There is actually a sixth approach: suffix trees. However, it runs in O(n log n) time, building the suffix tree takes extra effort, and it is more complicated to implement than this algorithm. It does have its advantages, though: it can solve many similar problems. We will leave that for another time.

You might also consider: how would you find the longest palindromic subsequence?
