Finding the length of the longest palindromic substring in Python

Given a string, find the length of its longest palindromic substring. For example, if the input string is '35534321', its longest palindromic substring is '3553', so 4 is returned.

The most obvious approach is to enumerate all the substrings, check one by one whether each is a palindrome, and return the length of the longest palindromic one. Needless to say, the running time of this enumeration is intolerable. Is there a more efficient way to search for palindromic substrings? Yes: the center expansion method. Pick an element as the center, then search outward for the longest palindromic substring centered on that element. However, a new problem arises. The length of the substring may be odd or even, and a substring of even length has no central element. Is there a way to treat odd-length and even-length substrings as a single case, so that center expansion can be applied uniformly? There is, and it is the trick used by the Manacher algorithm: insert a special character between the characters of the original string. For example, after inserting '#', the original string becomes '#3#5#5#3#4#3#2#1#'. Now we can apply center expansion to the new string, and the radius obtained for a center equals the length of the corresponding palindromic substring in the original string.

With the idea clear, the implementation first converts the string '35534321' ----> '#3#5#5#3#4#3#2#1#', then obtains the length of the longest palindromic substring centered on each element. The Python implementation is as follows:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

def max_substr(string):
    s_list = [s for s in string]
    # Insert '#' around every character so that all palindromes have a center
    string = '#' + '#'.join(s_list) + '#'
    max_length = 0
    length = len(string)
    for index in range(0, length):
        r_length = get_length(string, index)
        if max_length < r_length:
            max_length = r_length
    return max_length

def get_length(string, index):
    # Expand step by step to get the longest palindrome centered on index
    length = 0
    r_ = len(string)
    for i in range(1, index + 1):
        if index + i < r_ and string[index - i] == string[index + i]:
            length += 1
        else:
            break
    return length

if __name__ == "__main__":
    result = max_substr("35534321")
    print(result)

The function is implemented, and testing turned up no bugs. But let's step back and ask whether the current solution leaves room for optimization. As it stands, we compute the longest palindromic substring for every center in turn. By the time we reach '4', max_length is already 4 (from '3553'), while the longest palindromic substring centered on '4' is '343', of length 3. That is smaller than max_length, so max_length is not updated. In other words, computing the longest palindrome centered on '4' was wasted work, and this is what we want to optimize. If the longest palindrome centered on an element cannot exceed max_length, there is no need to compute it. So when we reach a new element, we first check whether the palindrome centered on it can surpass max_length; if it cannot, we move on to the next element. The optimized implementation is as follows:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

def max_substr(string):
    s_list = [s for s in string]
    string = '#' + '#'.join(s_list) + '#'
    max_length = 0
    length = len(string)
    for index in range(0, length):
        r_length = get_leng2(string, index, max_length)
        if max_length < r_length:
            max_length = r_length
    return max_length

def get_leng2(string, index, max_length):
    # Obtain the longest palindrome at index, given the longest one known so far
    # 1. Center + maximum radius goes beyond the string range: return
    r_ = len(string)
    if index + max_length > r_:
        return max_length
    # 2. Cannot exceed the current maximum radius: return
    l_string = string[index - max_length + 1: index + 1]
    r_string = string[index: index + max_length]
    if l_string != r_string[::-1]:
        return max_length
    # 3. Calculate the new maximum radius
    result = max_length
    for i in range(max_length, r_):
        if index - i >= 0 and index + i < r_ and string[index - i] == string[index + i]:
            result += 1
        else:
            break
    return result - 1

if __name__ == "__main__":
    result = max_substr("35534321")
    print(result)
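The pre-check in steps 1 and 2 can be looked at in isolation. The sketch below uses a hypothetical helper, `passes_precheck`, that compares a window of max_length characters ending at the center with the reversed window starting at the center. The explicit guard against a negative slice start is an addition made here for clarity and is an assumption, not part of the article's code.

```python
def passes_precheck(t, index, max_length):
    # Hypothetical helper: True only if a palindrome of radius
    # max_length - 1 is centered at index, so expansion is worth trying.
    if index + max_length > len(t) or index - max_length + 1 < 0:
        return False  # not enough room on one side of the center
    left = t[index - max_length + 1: index + 1]
    right = t[index: index + max_length]
    return left == right[::-1]


t = '#3#5#5#3#4#3#2#1#'
print(passes_precheck(t, 9, 4))   # True: '#3#4' mirrors '4#3#'
print(passes_precheck(t, 11, 4))  # False: '#4#3' does not mirror '3#2#'
print(passes_precheck(t, 15, 4))  # False: the window runs past the end
```

Note that the center '4' (index 9) passes this window test, because its radius of 3 equals max_length - 1; it is then rejected on the first expansion step beyond that. Centers such as index 11 are rejected by the slice comparison alone, without any character-by-character loop.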

So how much faster is it? Taking a string of 1000 '1' characters as an example, the execution time before optimization is 0.239018201828 and after optimization 0.0180191993713, a speedup of roughly 13 times.

/usr/bin/python /Users/hakuippei/PycharmProjects/untitled/the_method_of_programming.py

0.239018201828

0.0180191993713
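A comparison of this kind can be reproduced with the standard `timeit` module. The sketch below is a compact re-implementation written for this purpose, not the article's exact functions: `longest_plain` and `longest_pruned` are illustrative names, and the pruning condition is a simplified variant of the same idea. Absolute timings depend on the machine and the Python version, so they will not match the figures above.

```python
import timeit


def transform(s):
    return '#' + '#'.join(s) + '#'


def longest_plain(s):
    # Center expansion at every position, with no pruning.
    t = transform(s)
    n = len(t)
    best = 0
    for c in range(n):
        r = 0
        while c - r - 1 >= 0 and c + r + 1 < n and t[c - r - 1] == t[c + r + 1]:
            r += 1
        best = max(best, r)
    return best


def longest_pruned(s):
    # Skip any center that cannot beat the best radius found so far.
    t = transform(s)
    n = len(t)
    best = 0
    for c in range(n):
        # No room for a radius of best + 1 around this center.
        if c - best - 1 < 0 or c + best + 1 >= n:
            continue
        # A radius of at least best must already hold (fast slice compare).
        if t[c - best: c] != t[c + 1: c + best + 1][::-1]:
            continue
        r = best
        while c - r - 1 >= 0 and c + r + 1 < n and t[c - r - 1] == t[c + r + 1]:
            r += 1
        best = r
    return best


sample = '1' * 1000
t_plain = timeit.timeit(lambda: longest_plain(sample), number=1)
t_pruned = timeit.timeit(lambda: longest_pruned(sample), number=1)
print(longest_plain(sample), longest_pruned(sample))  # 1000 1000
print(t_plain, t_pruned)  # elapsed seconds; values vary by machine
```

A string of identical characters is the worst case for the unpruned version, since every center expands all the way to the edge, which is why the gap between the two is so visible on this input.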