Longest Common Subsequence

Source: Internet
Author: User

Problem description:
Given two sequences (arrays or strings in C, lists in Python), find their longest common subsequence (LCS). A subsequence keeps the relative order of the elements but does not have to be contiguous. For example, both "abc" and "ace" are subsequences of "abcdef". It follows from basic combinatorics that a sequence of length n has 2^n subsequences, one for each subset of its positions (counting the empty subsequence).
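
As a small illustration of the definition, the sketch below enumerates every subsequence of a short sequence and checks whether one sequence is a subsequence of another. The helper names subsequences and is_subsequence are made up for this example and do not come from the original article.

```python
from itertools import combinations

def subsequences(s):
    """Yield every subsequence of s as a tuple, including the empty one."""
    for k in range(len(s) + 1):
        # Each increasing choice of k positions gives one subsequence.
        for idx in combinations(range(len(s)), k):
            yield tuple(s[i] for i in idx)

def is_subsequence(sub, seq):
    """Return True if sub can be obtained from seq by deleting elements."""
    it = iter(seq)
    # "ch in it" advances the iterator, so matches must occur in order.
    return all(ch in it for ch in sub)

if __name__ == "__main__":
    print(len(list(subsequences("abc"))))   # one subsequence per subset of positions
    print(is_subsequence("ace", "abcdef"))
    print(is_subsequence("aec", "abcdef"))
```

Enumerating all subsequences like this is only practical for very short inputs, which is why the rest of the article looks for something better.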

Recursive solution:
A problem with exponential complexity usually cannot be solved in one step (brute force is unacceptable), so we should look for a way to reduce it to smaller subproblems. For two sequences x and y of lengths n and m, the LCS of x and y can be obtained from one of three subproblems:
1. LCS(x[1..n-1], y[1..m])
2. LCS(x[1..n], y[1..m-1])
3. LCS(x[1..n-1], y[1..m-1]) + the common last element
Case 3 applies when the last elements of x and y are equal; otherwise the answer is the longer of cases 1 and 2. If either sequence is empty, the LCS is empty.

Python code:

    def lcs_len(x, y):
        """This function returns length of longest common subsequence of x and y."""
        if len(x) == 0 or len(y) == 0:
            return 0

        xx = x[:-1]  # xx = sequence x without its last element
        yy = y[:-1]  # yy = sequence y without its last element

        if x[-1] == y[-1]:  # if last elements of x and y are equal
            return lcs_len(xx, yy) + 1
        else:
            return max(lcs_len(xx, y), lcs_len(x, yy))
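
To make the cost of the plain recursion visible, here is a minimal sketch that counts how many calls it makes, next to a memoized variant that solves each subproblem only once. Both functions work on index pairs (i, j) instead of slices. The names count_naive_calls and lcs_len_memo are invented for this illustration, and using functools.lru_cache is just one convenient way to memoize.

```python
from functools import lru_cache

def count_naive_calls(x, y):
    """Count the calls made by the plain recursion on x and y."""
    calls = 0

    def go(i, j):
        # Same recurrence as lcs_len, on the prefixes x[:i] and y[:j].
        nonlocal calls
        calls += 1
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return go(i - 1, j - 1) + 1
        return max(go(i - 1, j), go(i, j - 1))

    go(len(x), len(y))
    return calls

def lcs_len_memo(x, y):
    """Same recurrence, but each (i, j) pair is computed at most once."""
    @lru_cache(maxsize=None)
    def go(i, j):
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return go(i - 1, j - 1) + 1
        return max(go(i - 1, j), go(i, j - 1))

    # Still recursive, so very long inputs can hit Python's recursion limit.
    return go(len(x), len(y))

if __name__ == "__main__":
    # Two strings with no common element force the worst case.
    print(count_naive_calls("abcd", "efgh"))
    print(lcs_len_memo("abcbdab", "bdcaba"))
```

There are only (n + 1) * (m + 1) distinct (i, j) pairs, so any call count above that number is repeated work.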


Dynamic programming solution, O(n * m):
The recursive solution clearly repeats a lot of computation, and dynamic programming removes that repetition. One English explanation puts it very well: whenever the results of subproblems are needed, they have already been computed, and can simply be looked up in a table. In other words, every subproblem is solved once and its result is stored in a table, which gives O(n * m) time for sequences of lengths n and m (O(n^2) when they have the same length). Let's take a look at the code:

 

    def lcs(x, y):
        n = len(x)
        m = len(y)
        table = dict()  # a hashtable, but we'll use it as a 2D array here

        for i in range(n + 1):      # i = 0, 1, ..., n
            for j in range(m + 1):  # j = 0, 1, ..., m
                if i == 0 or j == 0:
                    table[i, j] = 0
                elif x[i-1] == y[j-1]:
                    table[i, j] = table[i-1, j-1] + 1
                else:
                    table[i, j] = max(table[i-1, j], table[i, j-1])

        # Now, table[n, m] is the length of LCS of x and y.

        # Let's go one step further and reconstruct
        # the actual sequence from DP table:

        def recon(i, j):
            if i == 0 or j == 0:
                return ""
            elif x[i-1] == y[j-1]:
                return recon(i-1, j-1) + str(x[i-1])
            elif table[i-1, j] > table[i, j-1]:
                return recon(i-1, j)
            else:
                return recon(i, j-1)

        return recon(n, m)

The code uses a 2D table in which table[i, j] holds the LCS length of subproblem (i, j), that is, of the first i elements of x and the first j elements of y. Its value depends only on table[i-1, j-1], table[i, j-1] and table[i-1, j], so when the table is filled from top to bottom and left to right, every entry that table[i, j] needs has already been assigned. Of course, the LCS length is rarely the final goal, especially in applications. We usually need the LCS itself, and it can be recovered from the table (see recon in the code).
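
To make the fill order concrete, the following sketch builds the same table as a list of lists and then walks back from the bottom-right corner with a loop instead of recursion. It is a variation for illustration, not the article's code: the names lcs_table and backtrack are hypothetical, and the iterative walk is a substitution that avoids deep recursion on long inputs while following the same tie-breaking rule as recon.

```python
def lcs_table(x, y):
    """Fill the table row by row; table[i][j] is the LCS length of x[:i] and y[:j]."""
    n, m = len(x), len(y)
    table = [[0] * (m + 1) for _ in range(n + 1)]  # row 0 and column 0 stay 0
    for i in range(1, n + 1):          # top to bottom
        for j in range(1, m + 1):      # left to right
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table

def backtrack(table, x, y):
    """Walk from table[n][m] back to the border and return one LCS as a list."""
    i, j = len(x), len(y)
    out = []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])       # this element is part of the LCS
            i, j = i - 1, j - 1
        elif table[i - 1][j] > table[i][j - 1]:
            i -= 1                     # the longer LCS lies in the row above
        else:
            j -= 1                     # otherwise move one column left
    out.reverse()                      # elements were collected back to front
    return out

if __name__ == "__main__":
    x, y = "abcbdab", "bdcaba"
    t = lcs_table(x, y)
    for row in t:
        print(row)
    print("".join(backtrack(t, x, y)))
```

Printing the rows shows that each entry is never smaller than the entry above it or to its left, which is what lets the backward walk follow the larger neighbour.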

Dynamic programming solution, optimized to O(n log n):


I looked into this and it seems that an O(n log n) solution exists, but only when the input meets certain conditions. Because it is not general, I don't think it is worth studying in depth here; we can analyze it another time.
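
The text above does not say which conditions are meant. One well-known special case, offered here only as an assumption about what is intended, is when no element occurs twice in y. Each element of x can then be replaced by its position in y, and the LCS length equals the length of the longest strictly increasing subsequence of those positions, which binary search finds quickly. The function name lcs_len_distinct is made up for this sketch.

```python
from bisect import bisect_left

def lcs_len_distinct(x, y):
    """LCS length of x and y, assuming the elements of y are all distinct."""
    pos = {v: j for j, v in enumerate(y)}  # element -> its only position in y
    if len(pos) != len(y):
        raise ValueError("this sketch assumes the elements of y are distinct")

    # tails[k] is the smallest last position of any strictly increasing
    # run of length k + 1 found so far; tails is always sorted.
    tails = []
    for v in x:
        if v not in pos:
            continue                   # v cannot be part of a common subsequence
        p = pos[v]
        k = bisect_left(tails, p)      # first slot whose value is >= p
        if k == len(tails):
            tails.append(p)            # p extends the longest run
        else:
            tails[k] = p               # p gives a smaller tail for length k + 1
    return len(tails)

if __name__ == "__main__":
    print(lcs_len_distinct("aebdc", "abcde"))
```

Each element of x costs one dictionary lookup and one binary search, so the work grows roughly as n log n rather than n * m. The function returns only the length; when y contains repeated elements this reduction does not apply as written.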

 

 
