Longest Common Subsequence and Python Implementation

Source: Internet
Author: User

1. Longest Common Subsequence Problem Description

A subsequence of a given sequence is the sequence obtained by deleting zero or more elements from it. To be exact, given a sequence X = {x1, x2, ..., xm}, another sequence Z = {z1, z2, ..., zk} is a subsequence of X if there is a strictly increasing sequence of indices {i1, i2, ..., ik} of X such that the element of X at index ij equals zj for every j = 1, 2, ..., k. Given two sequences X and Y = {y1, y2, ..., yn}, a sequence Z that is a subsequence of both X and Y is a common subsequence of X and Y. The longest common subsequence problem is: given two sequences X and Y, find a longest one among all their common subsequences (its elements need not be consecutive in X or Y).
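
To make the definition concrete, here is a minimal sketch of a subsequence test. The helper names is_subsequence and is_common_subsequence are illustrative assumptions and do not appear in the original article.

```python
def is_subsequence(z, x):
    """Return True if z can be obtained from x by deleting zero or more elements."""
    it = iter(x)
    # `ch in it` consumes the iterator up to the first match,
    # so the elements of z must appear in x in the same order.
    return all(ch in it for ch in z)


def is_common_subsequence(z, x, y):
    """Return True if z is a subsequence of both x and y."""
    return is_subsequence(z, x) and is_subsequence(z, y)


print(is_common_subsequence("BCBA", "ABCBDAB", "BDCABA"))  # True
print(is_subsequence("BAC", "ABCBDAB"))                    # False
```

The second call returns False because no C occurs after the only A that follows a B in the first sequence, which shows that order matters while adjacency does not.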

2. Structure of the longest common subsequence

The most straightforward way to solve the longest common subsequence problem is exhaustive search: for every subsequence of X, check whether it is also a subsequence of Y (and hence a common subsequence of X and Y), and keep the longest common subsequence found during the check. Once all subsequences of X have been checked, the longest common subsequence of X and Y is known. Each subsequence of X corresponds to a subset of the index set {1, 2, ..., m}, so X has 2 ** m subsequences and the exhaustive method requires exponential time.
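
The exhaustive method can be sketched as follows. This is only an illustration for short inputs. The function names are assumptions, and the longest-first enumeration order is a convenience chosen here, not something the original specifies.

```python
from itertools import combinations


def is_subsequence(z, x):
    """Return True if z can be obtained from x by deleting zero or more elements."""
    it = iter(x)
    return all(ch in it for ch in z)


def lcs_brute_force(x, y):
    """Enumerate subsequences of x from longest to shortest and return
    the first one that is also a subsequence of y (exponential time)."""
    for k in range(len(x), -1, -1):
        # Every k-element subset of the index set picks one subsequence of x.
        for indices in combinations(range(len(x)), k):
            candidate = "".join(x[i] for i in indices)
            if is_subsequence(candidate, y):
                return candidate
    return ""


print(len(lcs_brute_force("ABCBDAB", "BDCABA")))  # 4
```

Because up to 2 ** m candidates may be examined, this approach is only practical for very short sequences.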

In fact, the longest common subsequence problem has the optimal substructure property:

Let Z = {z1, z2, ..., zk} be a longest common subsequence of the sequences X = {x1, x2, ..., xm} and Y = {y1, y2, ..., yn}, and let Xm-1, Yn-1 and Zk-1 denote X, Y and Z with their last element removed. Then:

(1) If xm = yn, then zk = xm = yn and Zk-1 is a longest common subsequence of Xm-1 and Yn-1.

(2) If xm ≠ yn and zk ≠ xm, then Z is a longest common subsequence of Xm-1 and Y.

(3) If xm ≠ yn and zk ≠ yn, then Z is a longest common subsequence of X and Yn-1.

Proof omitted

3. Recursive structure of subproblems

By the optimal substructure property, a longest common subsequence of X = {x1, x2, ..., xm} and Y = {y1, y2, ..., yn} can be found recursively as follows. When xm = yn, find a longest common subsequence of Xm-1 and Yn-1 and append xm (= yn) to it to obtain a longest common subsequence of X and Y. When xm ≠ yn, two subproblems must be solved: find a longest common subsequence of Xm-1 and Y, and a longest common subsequence of X and Yn-1. The longer of the two is a longest common subsequence of X and Y. Use c[i][j] to record the length of the longest common subsequence of Xi and Yj, where Xi = {x1, x2, ..., xi} and Yj = {y1, y2, ..., yj}. When i = 0 or j = 0, the empty sequence is the longest common subsequence of Xi and Yj, so c[i][j] = 0. The optimal substructure then gives the following recurrence:

c[i][j] = 0                              if i = 0 or j = 0
c[i][j] = c[i-1][j-1] + 1                if i, j > 0 and xi = yj
c[i][j] = max(c[i][j-1], c[i-1][j])      if i, j > 0 and xi ≠ yj
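
The recursive relationship described above can be transcribed almost literally into code. The following is a minimal sketch; the function name lcs_length_recursive is an assumption made for illustration. It returns the right length but recomputes the same subproblems many times, so it is suitable only for short inputs.

```python
def lcs_length_recursive(x, y, i=None, j=None):
    """Length of a longest common subsequence of the prefixes x[:i] and y[:j]."""
    if i is None:
        i, j = len(x), len(y)
    # Base case: an empty prefix has only the empty common subsequence.
    if i == 0 or j == 0:
        return 0
    # Last elements match: extend the LCS of the two shorter prefixes.
    if x[i - 1] == y[j - 1]:
        return lcs_length_recursive(x, y, i - 1, j - 1) + 1
    # Otherwise take the better of the two subproblems.
    return max(lcs_length_recursive(x, y, i - 1, j),
               lcs_length_recursive(x, y, i, j - 1))


print(lcs_length_recursive("ABCBDAB", "BDCABA"))  # 4
```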

4. Calculating the optimal value

A recursive algorithm for c[i][j] is easy to write directly from the recurrence, but its running time grows exponentially with the input length. The subproblem space, however, contains only Θ(mn) distinct subproblems, so computing the optimal value bottom-up with dynamic programming makes the algorithm far more efficient. In the code below, m and n are the table dimensions len(x) + 1 and len(y) + 1, c[i][j] stores the length of the longest common subsequence of Xi and Yj, and b[i][j] records which subproblem that value came from.

    def LCSLength(m, n, x=None, y=None, c=None, b=None):
        # c and b are (len(x) + 1) by (len(y) + 1) tables initialised to 0.
        for i in range(1, m):
            for j in range(1, n):
                if x[i-1] == y[j-1]:
                    c[i][j] = c[i-1][j-1] + 1
                    b[i][j] = 'equals'
                elif c[i-1][j] >= c[i][j-1]:
                    c[i][j] = c[i-1][j]
                    b[i][j] = 'up'
                else:
                    c[i][j] = c[i][j-1]
                    b[i][j] = 'left'
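
To illustrate why the number of distinct subproblems is bounded by the table size, the sketch below uses top-down memoization instead of the bottom-up table. This is a different technique from the one in the article, and the name lcs_length_memo is an assumption. It relies on functools.lru_cache from the standard library and reports how many distinct (i, j) subproblems were actually solved.

```python
from functools import lru_cache


def lcs_length_memo(x, y):
    """Return (LCS length, number of distinct subproblems solved)."""

    @lru_cache(maxsize=None)
    def c(i, j):
        # Same recurrence as the table version, but each (i, j) is computed once.
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))

    length = c(len(x), len(y))
    return length, c.cache_info().currsize


length, solved = lcs_length_memo("ABCBDAB", "BDCABA")
print(length, solved <= (7 + 1) * (6 + 1))  # 4 True
```

The recursion depth grows with len(x) + len(y), so for long inputs the bottom-up table is the safer choice in Python.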

5. Construct the longest common subsequence

The table b computed by the LCSLength algorithm can be used to quickly construct a longest common subsequence of X = {x1, x2, ..., xm} and Y = {y1, y2, ..., yn}. Start at b[m][n] and follow the direction recorded in each b[i][j]. 'equals' means that the longest common subsequence of Xi and Yj is the longest common subsequence of Xi-1 and Yj-1 followed by xi. 'up' means that the longest common subsequence of Xi and Yj is the same as that of Xi-1 and Yj. 'left' means that it is the same as that of Xi and Yj-1. The following GetLCS algorithm carries out this process.

    def GetLCS(i, j, x=None, b=None):
        # Stop as soon as either prefix is empty: row 0 and column 0 of b hold no direction.
        if i == 0 or j == 0:
            return
        if b[i][j] == 'equals':
            GetLCS(i-1, j-1, x, b)
            print(x[i-1], end=' ')
        elif b[i][j] == 'left':
            GetLCS(i, j-1, x, b)
        else:
            GetLCS(i-1, j, x, b)
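
As a variation on the traceback, the following sketch returns the subsequence as a string instead of printing it, and walks the length table c directly rather than a separate direction table. The function name lcs_string and the choice to drop the b table are assumptions made for this illustration. The tie-breaking rule (prefer moving up) matches the one used by LCSLength.

```python
def lcs_string(x, y):
    """Return one longest common subsequence of x and y as a string."""
    m, n = len(x), len(y)
    # c[i][j] = LCS length of the prefixes x[:i] and y[:j].
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Walk back from c[m][n], collecting matched elements in reverse order.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1  # same move as 'up'
        else:
            j -= 1  # same move as 'left'
    return "".join(reversed(out))


print(lcs_string("ABCBDAB", "BDCABA"))  # BCBA
```

Using a loop avoids Python's recursion limit on long inputs, and returning a string makes the result easier to reuse or test.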

6. Example analysis

Take the two sequences X = {A, B, C, B, D, A, B} and Y = {B, D, C, A, B, A}. Applying the algorithms LCSLength and GetLCS to them yields a longest common subsequence of length 4, namely {B, C, B, A}.

7. Complete runnable code

    # -*- coding: utf8 -*-
    """
    Author  : xianghonglee
    Email   : xianghongleeking@gmail.com
    Created on '14-5-26'
    """


    def LCSLength(m, n, x=None, y=None, c=None, b=None):
        # m and n are the table dimensions: len(x) + 1 rows and len(y) + 1 columns.
        for i in range(1, m):
            for j in range(1, n):
                if x[i-1] == y[j-1]:
                    c[i][j] = c[i-1][j-1] + 1
                    b[i][j] = 'equals'
                elif c[i-1][j] >= c[i][j-1]:
                    c[i][j] = c[i-1][j]
                    b[i][j] = 'up'
                else:
                    c[i][j] = c[i][j-1]
                    b[i][j] = 'left'


    def GetLCS(i, j, x=None, b=None):
        # Stop as soon as either prefix is empty.
        if i == 0 or j == 0:
            return
        if b[i][j] == 'equals':
            GetLCS(i-1, j-1, x, b)
            print(x[i-1], end=' ')
        elif b[i][j] == 'left':
            GetLCS(i, j-1, x, b)
        else:
            GetLCS(i-1, j, x, b)


    if __name__ == '__main__':
        x = "ABCBDAB"
        y = "BDCABA"
        m = len(x) + 1
        n = len(y) + 1
        b = [[0 for i in range(n)] for j in range(m)]
        c = [[0 for i in range(n)] for j in range(m)]
        LCSLength(m, n, x, y, c, b)
        for row in c:
            print(row)
        for row in b:
            print(row)
        GetLCS(m-1, n-1, x, b)
        print()

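Running the program prints the rows of table c, then the rows of table b, and finally the elements of the longest common subsequence: B C B A.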


