Manacher's algorithm finds the longest palindromic substring of a string in O(n) time and O(n) space. A palindrome is a string that is symmetric about its center, such as 'ABCBA' or 'ABCCBA'. The longest palindromic substring is, by definition, the longest substring of a string that is itself a palindrome. The algorithm is implemented in Python at the end of this article, and to make things easier to follow, the formulas in the article also use Python notation.
After solving this problem on LeetCode with an algorithm of O(n**2) time complexity and O(1) space complexity, I searched for an O(n) algorithm. Unfortunately, the description on the English Wikipedia is too abstract, and I could not find a Chinese introduction that explains it clearly, so I decided to write a reasonably clear one in Chinese. I figured the algorithm out through an article on LeetCode, the first external link in the Wikipedia entry: http://articles.leetcode.com/2011/11/longest-palindromic-substring-part-ii.html. It is illustrated and very easy to understand (I did not read the whole article; I mainly looked at the figures and their captions). If you still find reading English difficult, read my article.
The ultimate goal of Manacher's algorithm is to build, from the original string, a new sequence P that records, for each position taken as a center, the length of the longest palindrome centered there. To handle odd and even lengths uniformly (such as ABA and ABBA, which a conventional algorithm has to treat as two separate cases), the first step is to build an auxiliary string T by inserting the same character at the beginning, at the end, and between every two adjacent characters. For example, the string ababa becomes #a#b#a#b#a#. The next step is to build the sequence P, which records the longest symmetric length centered at each position of T:
S1 = ababa
T1 = # a # b # a # b # a #
P1 = 0 1 0 3 0 5 0 3 0 1 0
S2 = abaaba
T2 = # a # b # a # a # b # a #
P2 = 0 1 0 3 0 1 6 1 0 3 0 1 0
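The construction of T and the meaning of P can be sketched in a few lines of Python. This is only an illustration: the helper names `transform` and `brute_force_radii` are my own, the separator is assumed not to occur in the input, and P is computed here by plain expansion around every position rather than by Manacher's algorithm.

```python
def transform(s, sep="#"):
    """Put sep before, after and between the characters of s (sep is assumed absent from s)."""
    return sep + sep.join(s) + sep


def brute_force_radii(t):
    """For each index of t, count how far a palindrome extends on each side of it."""
    radii = []
    for i in range(len(t)):
        n = 0
        while i - n - 1 >= 0 and i + n + 1 < len(t) and t[i - n - 1] == t[i + n + 1]:
            n += 1
        radii.append(n)
    return radii


t1 = transform("ababa")
print(t1)                     # #a#b#a#b#a#
print(brute_force_radii(t1))  # [0, 1, 0, 3, 0, 5, 0, 3, 0, 1, 0]
```

Each value in the result equals the length of the corresponding palindrome in the original string, which is why the separator trick removes the odd/even distinction.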
So how do we construct the sequence P? First, once the corner case of a string of length less than 1 is excluded, the first two elements of P are always known:
T = # ? # ...
P = 0 1 ? ...
Then, from the elements of P we already know and the elements of T, we can work out the following values one by one. The problem thus reduces to: given P[:i], find P[i].
Next, we use the properties of palindromes to build P. The example below is the one from the LeetCode article: s = 'babcbabcbaccba' (len(s) == 14), T = '#b#a#b#c#b#a#b#c#b#a#c#c#b#a#' (len(T) == len(s)*2+1 == 29).
1. The core algorithm
For example, suppose we know T and P[:9] and want to find P[9].
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
T =[# b # a # b # c # b # a # b #]c # b # a # c # c # b # a #
P = 0 1 0 3 0 1 0 7 0 ? ...
                  C
Look at the part enclosed in square brackets: it is the palindrome of length 7 centered at T[7] (marked C). Intuitively, because a palindrome is symmetric, the values in P should also be symmetric about that center, so P[9] = P[5] = 1, which is indeed the correct result. Keep filling in values this way, though, and you will soon find that this conclusion does not always hold.
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
T =[# b{# a # b # c # b # a # b #]c # b # a #}c # c # b # a #
P = 0 1 0 3 0 1 0 7 0 1 0 ? ...
i = 11
center = 7
right = 14
mirror = 3
When we come to fill in P[11], the part in square brackets, T[0] to T[14], is the palindrome of length 7 centered at T[7], while the part in curly braces, T[2] to T[20], is the palindrome of length 9 centered at T[11]. According to the conclusion of the previous paragraph we should fill in P[7-(11-7)] = P[3] = 3, but the correct value is 9. What is going on here?
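The two observations above (mirroring works for i = 9 but not for i = 11) can be checked numerically. The snippet below is a small sketch, not part of the algorithm itself: `transform` and `radius` are hypothetical helper names, and P is obtained by brute-force expansion so that the naive "copy the mirror value" rule can be compared against the true values.

```python
def transform(s, sep="#"):
    # Same auxiliary string as in the article: sep around and between all characters.
    return sep + sep.join(s) + sep


def radius(t, i):
    # Brute-force palindrome radius at index i of t.
    n = 0
    while i - n - 1 >= 0 and i + n + 1 < len(t) and t[i - n - 1] == t[i + n + 1]:
        n += 1
    return n


T = transform("babcbabcbaccba")
P = [radius(T, i) for i in range(len(T))]

center = 7
for i in (9, 11):
    mirror = center - (i - center)
    print(i, mirror, P[mirror], P[i])
# 9 5 1 1    (copying the mirror value is correct)
# 11 3 3 9   (copying the mirror value is wrong)
```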
To explain this, first define some variables: center is the center of a known palindrome, right is the right end of that palindrome, i is the index in P currently being computed, and mirror is the index symmetric to i about center. When finding P[9] and P[11] above, center == 7 and right == 14. Now let us scan on and look at another situation: given center == 11 and right == 20, find P[15].
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
T ={# b[# a # b # c # b # a # b #}c # b # a #]c # c # b # a #
P = 0 1 0 3 0 1 0 7 0 1 0 9 0 1 0 ? ...
i = 15
center = 11
right = 20
mirror = 7
Comparing these two cases, the cause is not hard to find: the palindrome centered at mirror (the one between { } in the last example) reaches, or extends beyond, the left end of the palindrome centered at center (the one between [ ]). In every case where P[i] == P[mirror] holds, the palindrome centered at mirror stays within the palindrome centered at center. Specifically, its left end does not go beyond the left end of the palindrome centered at center.
So how do we check this? The distance from mirror to the left end of the palindrome centered at center is right - i. If this distance is not less than P[mirror], the palindrome centered at mirror lies within the palindrome centered at center. If that is not obvious, here is a step-by-step derivation:
PalindromeLeftOf(mirror) = mirror - P[mirror]
PalindromeLeftOf(center) = center - (right - center)
mirror = center - (i - center)
If PalindromeLeftOf(mirror) lies inside the palindrome centered at center:
PalindromeLeftOf(center) <= PalindromeLeftOf(mirror)
=> center - (right - center) <= mirror - P[mirror]
=> center - (right - center) <= center - (i - center) - P[mirror]
=> -right <= -i - P[mirror]
=> P[mirror] <= right - i
Note the boundary case P[mirror] == right - i: here the left end of the palindrome centered at mirror coincides with the left end of the palindrome centered at center, which means the palindrome centered at i reaches right, and P[i] may turn out larger than P[mirror]. In this case P[i] has to be extended.
At this point, the process of computing P[i] in O(1) from the known P[:i] (when P[mirror] < right - i) is complete. This is the core of Manacher's algorithm and the hardest part to understand; the rest is not complicated.
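The comparison between P[mirror] and right - i can be written as a small decision function. This is a sketch for illustration only: the function name `classify` and its return labels are assumptions of mine, and the list P holds just the first 15 values from the running example.

```python
def classify(P, center, right, i):
    """Decide how P[i] can be obtained from P[:i], center and right."""
    if i > right:
        return "no mirror"   # i is outside the known palindrome: expand from scratch
    mirror = center - (i - center)
    if P[mirror] < right - i:
        return "copy"        # P[i] == P[mirror], no scanning needed
    if P[mirror] == right - i:
        return "touch"       # palindrome at i reaches right and may grow further
    return "exceed"          # palindrome at i reaches right and may grow further


# P[0..14] for T = '#b#a#b#c#b#a#b#c#b#a#c#c#b#a#'
P = [0, 1, 0, 3, 0, 1, 0, 7, 0, 1, 0, 9, 0, 1, 0]
print(classify(P, 7, 14, 9))    # copy
print(classify(P, 7, 14, 11))   # touch
print(classify(P, 11, 20, 15))  # exceed
```

Only the "copy" outcome gives P[i] in O(1); the other outcomes require comparing characters of T.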
2. The process of extending P
Now go back to the starting point: as soon as we have the input data, we immediately know the first two items of P.
T = # b # a # b # c # b # a # b # c # b # a # c # c # b # a #
P = 0 1 ? ...
center = 1
right = 2
Start the calculation: i = 2, mirror = 0, and P[mirror] == right - i. According to the conclusion of the previous section, P[i] has to be extended, that is, T[i+n] == T[i-n] is tested for n = 1, 2, ... in turn. Unfortunately the very first comparison fails, so P[2] = 0 and we move on to the next step:
T = # b # a # b # c # b # a # b # c # b # a # c # c # b # a #
P = 0 1 0 ? ...
i = 3
center = ?
right = 2
mirror = ?
Whether the center is taken to be 1 or 2, right is still 2, and now i > right, so whichever center we choose, i has no mirror point. In that case we simply extend from scratch:
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
T = # b # a # b # c # b # a # b # c # b # a # c # c # b # a #
P = 0 1 0 3 ? ...
i = 4
center = 3
right = 6
mirror = 2
This time we obtain a new center and a new right, and the new right is larger than the old one, so the new values replace the old ones.
So far the cases i > right and P[mirror] <= right - i have been discussed, which leaves one last case.
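The extension step for i = 2 and i = 3 can be sketched as follows. The helper name `expand` and its `start` parameter are my own choices for illustration; as in the walkthrough, center and right are simply overwritten after each extension.

```python
def expand(t, i, start=1):
    """Largest n with t[i-k] == t[i+k] for all start <= k <= n.

    Assumes the equality is already known to hold for every k < start.
    """
    limit = min(i, len(t) - i - 1)   # n cannot run past either end of t
    n = start
    while n <= limit and t[i - n] == t[i + n]:
        n += 1
    return n - 1


s = "babcbabcbaccba"
T = "#" + "#".join(s) + "#"

center, right = 1, 2
for i in (2, 3):
    count = expand(T, i)
    center, right = i, i + count
    print(i, count, center, right)
# 2 0 2 2
# 3 3 3 6
```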
3. The final step
What remains is the case i <= right and P[mirror] > right - i. Since P[i] has to be extended anyway, starting from scratch would not be wrong. But first consider the following situation:
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
T = # b # a # b # c # b # a # b # c # b # a # c # c # b # a #
P = 0 1 0 3 0 1 0 7 0 1 0 9 0 1 0 ? ...
i = 15
center = 11
right = 20
mirror = 7
We have seen this situation once before: P[mirror] > right - i. We now need to extend P[15], but do we really have to start at n = 0 when testing T[15+n] == T[15-n]? Since P[mirror] = 7 > right - i = 5, we have T[mirror-n] == T[mirror+n] for 0 <= n <= 7, and by symmetry about center, T[mirror-n] == T[i+n] and T[mirror+n] == T[i-n] for 0 <= n <= 5. It follows that T[i+n] == T[i-n] for 0 <= n <= 5. In other words, because the palindrome centered at mirror extends beyond the palindrome centered at center, the palindrome centered at i is guaranteed to reach at least as far as right. So we do not need to start testing from 0; we can start directly from T[right+1].
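To see what the shortcut saves, the sketch below extends P[15] twice, once from n = 1 and once from n = right - i + 1, and counts the character comparisons in each run. The function `expand` and its comparison counter are illustrative assumptions, not part of the original code.

```python
def expand(t, i, start=1):
    """Return (radius, comparisons), testing t[i-n] == t[i+n] from n = start upward."""
    limit = min(i, len(t) - i - 1)
    n, comparisons = start, 0
    while n <= limit:
        comparisons += 1
        if t[i - n] != t[i + n]:
            break
        n += 1
    return n - 1, comparisons


s = "babcbabcbaccba"
T = "#" + "#".join(s) + "#"
center, right, i = 11, 20, 15

print(expand(T, i, 1))              # (5, 6): from scratch, 6 comparisons
print(expand(T, i, right - i + 1))  # (5, 1): starting at T[right+1], 1 comparison
```

Both runs give the same radius, but the second one skips the five comparisons whose outcome is already implied by symmetry.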
Code
Before the code, let us regroup and merge the cases above.
1. P[i] is obtained directly with an O(1) operation: P does not need to be extended, T does not need to be scanned, and the values of center and right do not change.
2. P[i] needs to be extended: scanning starts from T[right+1], and the values of center and right need to be updated.
Case 1 costs O(1). A single occurrence of case 2 can cost O(n), but the algorithm scans T from left to right, every scan starts from right+1, and right never moves back to the left, so each element of T is accessed at most twice. Therefore the overall complexity of Manacher's algorithm is O(n).
Because the code follows my own classification of the cases, this Python implementation may look different from the standard implementations you have seen.
    # @param {string} s
    # @return {string}
    def longestPalindrome(s):
        if len(s) <= 1:
            return s

        # The separator is an int, so it can never equal a character of s.
        THE_ANSWER = 42
        T = [THE_ANSWER]
        for c in s:
            T.append(c)
            T.append(THE_ANSWER)

        c, r, size = 1, 2, len(T)
        P = [0, 1] + [None] * (size - 2)
        maxIndex, maxCount = 0, 1
        for i in range(2, size):
            m = c * 2 - i  # mirror = center - (i - center)
            if r > i and P[m] < r - i:
                # case 1, just set P[i] <- P[m]
                P[i] = P[m]
                continue

            # case 2, expand P
            count = min(i, size - i - 1)  # n's limit
            # scan from T[i+1] if r <= i, else from T[right+1]
            for n in range((1 if r <= i else r + 1 - i), count + 1):
                if T[i + n] != T[i - n]:
                    count = n - 1
                    break

            # update center and right, save P[i], compare with the max
            c = i
            r = i + count
            P[i] = count
            if count > maxCount:
                maxCount = count
                maxIndex = i - count

        # convert the start index in T to the start index in s
        maxIndex = maxIndex // 2
        return s[maxIndex:maxIndex + maxCount]
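One way to gain confidence in an implementation like the one above is to compare its output with a straightforward reference. The sketch below is an expand-around-center search with O(n**2) time and O(1) extra space, the kind of solution mentioned at the start of the article. The function name `longest_palindrome_bruteforce` is hypothetical, and this is not necessarily the exact code the author submitted.

```python
def longest_palindrome_bruteforce(s):
    """Try all 2*len(s)-1 centers (characters and gaps) and expand around each."""
    best_start, best_len = 0, min(len(s), 1)
    for center in range(2 * len(s) - 1):
        left = center // 2
        right = left + center % 2      # same index for odd length, next index for even
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        length = right - left - 1      # the loop overshoots by one on each side
        if length > best_len:
            best_start, best_len = left + 1, length
    return s[best_start:best_start + best_len]


print(longest_palindrome_bruteforce("babcbabcbaccba"))  # abcbabcba
print(longest_palindrome_bruteforce("abaaba"))          # abaaba
```

Running both functions over a set of random short strings and comparing the results is a cheap sanity check, since both keep the first palindrome of maximal length.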
Manacher's algorithm: the longest palindromic substring algorithm