Longest common sub-sequence (LCS)

Source: Internet
Author: User

A subsequence of a string S is what remains after deleting zero or more elements from S. The longest common subsequence (LCS) problem asks for the longest subsequence shared by two given sequences: its elements appear in the same order in both sequences, but not necessarily contiguously.

For example, take the sequences X = ABCBDAB and Y = BDCABA. The sequence BCA is a common subsequence of X and Y, but it is not a longest common subsequence. The sequence BCBA is an LCS of X and Y, and so is the sequence BDAB.

One way to find an LCS is to enumerate all subsequences of X, check whether each one is also a subsequence of Y, and keep track of the longest one found so far. If X has m elements, it has 2^m subsequences, so this approach takes exponential time and is impractical for long sequences.
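The enumeration described above can be sketched as follows. This is a minimal illustration, not the article's own code: the class name BruteForceLcs and its method names are hypothetical, each subsequence of X is encoded as a bitmask, and the input is assumed to be short (at most 30 characters) so that the mask fits in an int.

```java
class BruteForceLcs {
    // Returns true if s can be obtained from t by deleting zero or more characters.
    static boolean isSubsequence(String s, String t) {
        int i = 0;
        for (int j = 0; j < t.length() && i < s.length(); j++) {
            if (s.charAt(i) == t.charAt(j)) {
                i++;
            }
        }
        return i == s.length();
    }

    // Tries all 2^m subsequences of x and keeps the longest one that also occurs in y.
    static String longestCommonSubsequence(String x, String y) {
        int m = x.length();
        if (m > 30) {
            throw new IllegalArgumentException("brute force is only feasible for short inputs");
        }
        String best = "";
        for (int mask = 0; mask < (1 << m); mask++) {
            // Bit k of mask decides whether x.charAt(k) is kept.
            StringBuilder candidate = new StringBuilder();
            for (int k = 0; k < m; k++) {
                if ((mask & (1 << k)) != 0) {
                    candidate.append(x.charAt(k));
                }
            }
            if (candidate.length() > best.length()
                    && isSubsequence(candidate.toString(), y)) {
                best = candidate.toString();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String lcs = longestCommonSubsequence("ABCBDAB", "BDCABA");
        System.out.println(lcs + " (length " + lcs.length() + ")");
    }
}
```

Every extra character in X doubles the number of masks, which is why this approach is only usable on very short inputs.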

To solve this problem with dynamic programming, we first look for the optimal substructure. Let X = <x1, x2, ..., xm> and Y = <y1, y2, ..., yn> be two sequences, let Xm-1 denote X without its last element xm (and likewise Yn-1 for Y), and let LCS(X, Y) denote one of the longest common subsequences of X and Y. Then:

  1. If xm = yn, then LCS(X, Y) = LCS(Xm-1, Yn-1) + xm.

  2. If xm != yn, then LCS(X, Y) = max{LCS(Xm-1, Y), LCS(X, Yn-1)}, that is, the longer of the two.
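The two cases above translate almost directly into a recursive function. The sketch below is an illustration only (the class name RecursiveLcs is hypothetical); it returns the length of an LCS rather than the subsequence itself, and takes the prefix lengths m and n as explicit parameters.

```java
class RecursiveLcs {
    // Length of an LCS of the first m characters of x and the first n characters of y.
    static int lcsLength(String x, String y, int m, int n) {
        if (m == 0 || n == 0) {
            return 0; // an empty prefix has only the empty subsequence
        }
        if (x.charAt(m - 1) == y.charAt(n - 1)) {
            // Case 1: the last elements match and extend an LCS of the shorter prefixes.
            return lcsLength(x, y, m - 1, n - 1) + 1;
        }
        // Case 2: drop the last element of one sequence or the other, keep the better result.
        return Math.max(lcsLength(x, y, m - 1, n), lcsLength(x, y, m, n - 1));
    }

    public static void main(String[] args) {
        String x = "ABCBDAB";
        String y = "BDCABA";
        System.out.println(lcsLength(x, y, x.length(), y.length())); // prints 4
    }
}
```

Without memoization this recursion solves the same prefix pairs repeatedly, so its running time is exponential in the worst case.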

The LCS problem also has the overlapping-subproblems property: to find an LCS of X and Y, we may need to find an LCS of X and Yn-1 and an LCS of Xm-1 and Y. But both of these subproblems contain the subproblem of finding an LCS of Xm-1 and Yn-1, and so on.

Dynamic programming ultimately works on a numeric optimum: once the optimal value is known, the optimal solution can be recovered from it. To find an LCS, we therefore define dp[i][j] as the length of an LCS of the first i elements of X and the first j elements of Y. The base states are those in which either prefix has length 0; the common subsequence is then empty, so dp[i][j] = 0. With i and j denoting the prefix lengths of X and Y respectively, the state transition equation is

    1. dp[i][j] = 0                              if i = 0 or j = 0
    2. dp[i][j] = dp[i-1][j-1] + 1               if x[i-1] = y[j-1]
    3. dp[i][j] = max{dp[i-1][j], dp[i][j-1]}    if x[i-1] != y[j-1]
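As a worked example, the sketch below fills the table for the sequences X = ABCBDAB and Y = BDCABA used earlier and prints it row by row. The class name LcsTable and the method buildTable are hypothetical names chosen for this illustration; the comparison uses x[i-1] against y[j-1] because the strings are indexed from 0 while the table is indexed by prefix length.

```java
import java.util.Arrays;

class LcsTable {
    // Fills dp so that dp[i][j] is the LCS length of the first i characters of x
    // and the first j characters of y.
    static int[][] buildTable(String x, String y) {
        int m = x.length();
        int n = y.length();
        int[][] dp = new int[m + 1][n + 1]; // row 0 and column 0 are the base case (all 0)
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (x.charAt(i - 1) == y.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp;
    }

    public static void main(String[] args) {
        String x = "ABCBDAB";
        String y = "BDCABA";
        int[][] dp = buildTable(x, y);
        for (int[] row : dp) {
            System.out.println(Arrays.toString(row));
        }
        System.out.println("LCS length: " + dp[x.length()][y.length()]); // LCS length: 4
    }
}
```

Each of the (m+1) x (n+1) cells is computed once from at most three neighbours, so the table takes O(m*n) time and space.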

After the length of the longest common subsequence has been found, printing an LCS means recovering the optimal solution from the DP table. This can be done with an additional matrix that stores the path, or by backtracking directly through the state transition matrix. The code is as follows:


    import java.util.Random;

    /*
     * Longest common subsequence
     */
    public class LCS {
        char[][] b; // b[i][j]: step taken at (i, j): 'm' = match, 't' = up, 'r' = left
        int[][] c;  // c[i][j]: LCS length of the first i chars of x and the first j chars of y

        public static void main(String[] args) {
            // Set the string lengths; the specific sizes can be changed freely
            int substringLength1 = 10;
            int substringLength2 = 15;
            // Randomly generated strings
            String x = getRandomStrings(substringLength1);
            String y = getRandomStrings(substringLength2);
            System.out.println("substring1: " + x);
            System.out.println("substring2: " + y);
            LCS lcs = new LCS();
            lcs.lcsLength(x, y);
            lcs.printLcs(x, x.length(), y.length());
            System.out.println();
        }

        // Builds the two-dimensional arrays that record the subproblem results
        public void lcsLength(String x, String y) {
            int m = x.length();
            int n = y.length();
            b = new char[m + 1][n + 1];
            c = new int[m + 1][n + 1]; // row 0 and column 0 stay 0 (Java zero-initializes arrays)
            for (int i = 1; i <= m; i++) {
                for (int j = 1; j <= n; j++) {
                    if (x.charAt(i - 1) == y.charAt(j - 1)) {
                        c[i][j] = c[i - 1][j - 1] + 1;
                        b[i][j] = 'm';
                    } else if (c[i - 1][j] >= c[i][j - 1]) {
                        c[i][j] = c[i - 1][j];
                        b[i][j] = 't';
                    } else {
                        c[i][j] = c[i][j - 1];
                        b[i][j] = 'r';
                    }
                }
            }
        }

        // Follows the path stored in b and prints one LCS of the first i and j characters
        public void printLcs(String x, int i, int j) {
            if (i == 0 || j == 0) {
                return;
            }
            if (b[i][j] == 'm') {
                printLcs(x, i - 1, j - 1);
                System.out.print(x.charAt(i - 1));
            } else if (b[i][j] == 't') {
                printLcs(x, i - 1, j);
            } else {
                printLcs(x, i, j - 1);
            }
        }

        // Get a fixed-length random string
        public static String getRandomStrings(int length) {
            String alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
            StringBuilder sb = new StringBuilder();
            Random r = new Random();
            int range = alphabet.length();
            for (int i = 0; i < length; i++) {
                sb.append(alphabet.charAt(r.nextInt(range)));
            }
            return sb.toString();
        }
    }
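The listing above stores the path in the extra matrix b. The other option mentioned earlier, backtracking through the length table alone, might look like the sketch below. The class name LcsBacktrack and the method lcs are hypothetical; when the two neighbouring cells tie, the sketch moves up, which is an arbitrary choice that determines which of several equally long subsequences is returned.

```java
class LcsBacktrack {
    // Returns one LCS of x and y, reconstructed from the length table without a path matrix.
    static String lcs(String x, String y) {
        int m = x.length();
        int n = y.length();
        int[][] dp = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (x.charAt(i - 1) == y.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }

        // Walk back from dp[m][n], re-deriving which transition produced each cell.
        StringBuilder reversed = new StringBuilder();
        int i = m;
        int j = n;
        while (i > 0 && j > 0) {
            if (x.charAt(i - 1) == y.charAt(j - 1)) {
                reversed.append(x.charAt(i - 1)); // matched element belongs to the LCS
                i--;
                j--;
            } else if (dp[i - 1][j] >= dp[i][j - 1]) {
                i--; // the value came from the cell above
            } else {
                j--; // the value came from the cell to the left
            }
        }
        return reversed.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(lcs("AGGTAB", "GXTXAYB")); // prints GTAB
        System.out.println(lcs("ABCBDAB", "BDCABA").length()); // prints 4
    }
}
```

This saves the memory of the second matrix at the cost of repeating one character comparison per backtracking step.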

