LCS (Longest Common Subsequence) and DP (Dynamic Programming)

Source: Internet
Author: User

Reference: V_july_v

Definition of the longest common subsequence:

Note the difference between the longest common substring and the longest common subsequence (LCS). A substring is a contiguous part of a string. A subsequence is a new sequence obtained by removing zero or more elements from a sequence without changing the order of the remaining elements. Put simply, the characters of a substring must occupy consecutive positions in the original string, while the characters of a subsequence need not. For example, the longest common substring of the strings ACDFG and AKDFC is DF, and their longest common subsequence is ADF. The LCS problem can be solved with dynamic programming.
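The distinction can be checked directly on the example strings. The following is a minimal sketch; the class name SubsequenceDemo and the helper isSubsequence are hypothetical names introduced only for illustration.

```java
class SubsequenceDemo {
    // Returns true if `candidate` can be obtained from `text` by deleting
    // zero or more characters without reordering the remaining ones.
    static boolean isSubsequence(String candidate, String text) {
        int matched = 0;
        for (int i = 0; i < text.length() && matched < candidate.length(); i++) {
            if (text.charAt(i) == candidate.charAt(matched)) {
                matched++;
            }
        }
        return matched == candidate.length();
    }

    public static void main(String[] args) {
        // "DF" is contiguous in both strings, so it is a common substring.
        System.out.println("ACDFG".contains("DF") && "AKDFC".contains("DF")); // true
        // "ADF" is a subsequence of both strings...
        System.out.println(isSubsequence("ADF", "ACDFG") && isSubsequence("ADF", "AKDFC")); // true
        // ...but not a substring of "ACDFG", because C sits between A and D.
        System.out.println("ACDFG".contains("ADF")); // false
    }
}
```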

Approach to solving the LCS problem (DP algorithm):

1. The longest common subsequence problem has the optimal substructure property.

Notation:

Xi = <x1, ..., xi> is the prefix made up of the first i characters of the sequence X (1 ≤ i ≤ m)

Yj = <y1, ..., yj> is the prefix made up of the first j characters of the sequence Y (1 ≤ j ≤ n)

Assume Z = <z1, ..., zk> ∈ LCS(X, Y).

    • If xm = yn (the last characters are the same), it is not hard to prove by contradiction that this character must be the last character of every longest common subsequence Z (of length k) of X and Y, that is, zk = xm = yn. It follows that the prefix Zk-1 of Z satisfies Zk-1 ∈ LCS(Xm-1, Yn-1), i.e. Zk-1 is a longest common subsequence of Xm-1 and Yn-1. The problem thus reduces to finding an LCS of Xm-1 and Yn-1: the length of LCS(X, Y) equals the length of LCS(Xm-1, Yn-1) plus 1.

    • If xm ≠ yn, it is likewise not hard to prove by contradiction that either Z ∈ LCS(Xm-1, Y) or Z ∈ LCS(X, Yn-1). Since xm ≠ yn, at least one of zk ≠ xm and zk ≠ yn must hold: if zk ≠ xm, then Z ∈ LCS(Xm-1, Y); similarly, if zk ≠ yn, then Z ∈ LCS(X, Yn-1). The problem thus reduces to finding an LCS of Xm-1 and Y and an LCS of X and Yn-1. The length of LCS(X, Y) is max{length of LCS(Xm-1, Y), length of LCS(X, Yn-1)}.

When xm ≠ yn, the subproblems of finding the length of LCS(Xm-1, Y) and the length of LCS(X, Yn-1) are not independent of each other: both need the length of LCS(Xm-1, Yn-1). In addition, an LCS of two sequences contains an LCS of prefixes of those sequences, so the problem has the optimal substructure property and dynamic programming is a natural fit.

In other words, solving the LCS problem comes down to three quantities: 1. LCS(Xm-1, Yn-1) + 1; 2. LCS(Xm-1, Y) and LCS(X, Yn-1); 3. max{LCS(Xm-1, Y), LCS(X, Yn-1)}.
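The two cases above can be collected into a single recurrence. The notation c[i, j] for the length of an LCS of the prefixes Xi and Yj is an assumption introduced here for compactness; it does not appear in the original text.

```latex
c[i,j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0,\\
c[i-1,\,j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j,\\
\max\bigl(c[i-1,\,j],\; c[i,\,j-1]\bigr) & \text{if } i, j > 0 \text{ and } x_i \neq y_j.
\end{cases}
```

The length of LCS(X, Y) is then c[m, n].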

2. The structure of a longest common subsequence can be stated as follows:

Let the sequences be X = <x1, x2, ..., xm> and Y = <y1, y2, ..., yn>, and let Z = <z1, z2, ..., zk> be one of their longest common subsequences. Then:

    1. If xm = yn, then zk = xm = yn and Zk-1 is a longest common subsequence of Xm-1 and Yn-1;
    2. If xm ≠ yn and zk ≠ xm, then Z is a longest common subsequence of Xm-1 and Y;
    3. If xm ≠ yn and zk ≠ yn, then Z is a longest common subsequence of X and Yn-1.

Here Xm-1 = <x1, x2, ..., xm-1>, Yn-1 = <y1, y2, ..., yn-1>, and Zk-1 = <z1, z2, ..., zk-1>.
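The three structural cases translate almost literally into a recursive function. The following is a sketch of a top-down (memoized) variant, given only as an illustration; the class name LcsMemo and its methods are hypothetical and not part of the original article.

```java
import java.util.Arrays;

class LcsMemo {
    // Length of an LCS of x and y, computed top-down with memoization.
    static int lcsLength(String x, String y) {
        // memo[i][j] caches the LCS length of the prefixes Xi and Yj; -1 means "not computed yet".
        int[][] memo = new int[x.length() + 1][y.length() + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1);
        }
        return solve(x, y, x.length(), y.length(), memo);
    }

    private static int solve(String x, String y, int i, int j, int[][] memo) {
        if (i == 0 || j == 0) {
            return 0; // an empty prefix has no common subsequence with anything
        }
        if (memo[i][j] != -1) {
            return memo[i][j];
        }
        int result;
        if (x.charAt(i - 1) == y.charAt(j - 1)) {
            // Case 1: the last characters match and end the LCS.
            result = solve(x, y, i - 1, j - 1, memo) + 1;
        } else {
            // Cases 2 and 3: drop the last character of one sequence or the other.
            result = Math.max(solve(x, y, i - 1, j, memo), solve(x, y, i, j - 1, memo));
        }
        memo[i][j] = result;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("ACDFG", "AKDFC")); // 3, the length of ADF
    }
}
```

The memo table is what keeps the shared subproblem LCS(Xm-1, Yn-1) from being solved more than once.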

3. For example, take the two sequences X = <A, B, C, B, D, A, B> and Y = <B, D, C, A, B, A>, as shown in the following figure:


Once you understand this figure, you have essentially understood the LCS algorithm:

    if (str1.charAt(i - 1) == str2.charAt(j - 1)) {
        dp[i][j] = dp[i - 1][j - 1] + 1;
    } else {
        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
    }
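To see the recurrence at work on the two example sequences, the table can be filled in bottom-up and printed. This is a minimal sketch rather than the article's own listing; the class name LcsTable and the method buildTable are hypothetical.

```java
import java.util.Arrays;

class LcsTable {
    // dp[i][j] holds the LCS length of the first i characters of x
    // and the first j characters of y; row 0 and column 0 stay 0.
    static int[][] buildTable(String x, String y) {
        int m = x.length();
        int n = y.length();
        int[][] dp = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (x.charAt(i - 1) == y.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp;
    }

    public static void main(String[] args) {
        String x = "ABCBDAB";
        String y = "BDCABA";
        int[][] dp = buildTable(x, y);
        // Print one row per prefix of x, starting with the empty prefix.
        for (int[] row : dp) {
            System.out.println(Arrays.toString(row));
        }
        System.out.println(dp[x.length()][y.length()]); // 4
    }
}
```

The bottom-right cell of the table is the LCS length of the full sequences.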

Code implementation:

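    import java.util.Random;

    public class LCS {
        public static void main(String[] args) {
            // Set the string lengths; the specific sizes can be chosen freely
            int substringLength1 = 20;
            int substringLength2 = 20;

            // Randomly generate the two strings
            String x = getRandomStrings(substringLength1);
            String y = getRandomStrings(substringLength2);

            long startTime = System.nanoTime();

            // Two-dimensional array: opt[i][j] records the LCS length of the
            // suffixes x[i..] and y[j..]; the extra last row and column stay 0
            int[][] opt = new int[substringLength1 + 1][substringLength2 + 1];

            // Dynamic programming: solve all subproblems, from the ends of the strings backwards
            for (int i = substringLength1 - 1; i >= 0; i--) {
                for (int j = substringLength2 - 1; j >= 0; j--) {
                    if (x.charAt(i) == y.charAt(j)) {
                        // Same formula as above, applied to suffixes instead of prefixes
                        opt[i][j] = opt[i + 1][j + 1] + 1;
                    } else {
                        opt[i][j] = Math.max(opt[i + 1][j], opt[i][j + 1]);
                    }
                }
            }

            long elapsedNanos = System.nanoTime() - startTime;

            // With suffix indexing the answer for the full strings is opt[0][0];
            // opt[20][20] is the empty-suffix cell and is always 0
            System.out.println(opt[0][0]);
            System.out.println(elapsedNanos + " ns");
        }

        // The original listing does not show getRandomStrings. This is an assumed
        // stand-in that returns a random string of uppercase letters.
        private static String getRandomStrings(int length) {
            Random random = new Random();
            StringBuilder sb = new StringBuilder(length);
            for (int i = 0; i < length; i++) {
                sb.append((char) ('A' + random.nextInt(26)));
            }
            return sb.toString();
        }
    }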
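The listing above reports only the length of the LCS. If the subsequence itself is wanted, one common approach is to walk back through the filled table. The sketch below assumes the prefix-indexed table from the recurrence shown earlier; the class name LcsReconstruct and the method lcs are hypothetical, and when several longest common subsequences exist it returns just one of them.

```java
class LcsReconstruct {
    // Returns one longest common subsequence of x and y.
    static String lcs(String x, String y) {
        int m = x.length();
        int n = y.length();
        int[][] dp = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (x.charAt(i - 1) == y.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }

        // Walk back from dp[m][n], collecting matched characters in reverse order.
        StringBuilder reversed = new StringBuilder();
        int i = m;
        int j = n;
        while (i > 0 && j > 0) {
            if (x.charAt(i - 1) == y.charAt(j - 1)) {
                reversed.append(x.charAt(i - 1));
                i--;
                j--;
            } else if (dp[i - 1][j] >= dp[i][j - 1]) {
                i--; // the value came from dropping the last character of x
            } else {
                j--; // the value came from dropping the last character of y
            }
        }
        return reversed.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(lcs("ACDFG", "AKDFC")); // ADF
    }
}
```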










