Solving the Longest Common Subsequence Problem with Dynamic Programming

Source: Internet
Author: User

Given s1 = "mzjawxu" and s2 = "xmjyauz", careful inspection shows that the longest common subsequence (LCS) is "mjau". Inspection does not scale to longer inputs, so we need a rigorous method. This is a dynamic programming (DP) problem: the solution of the current problem depends on the solution of a smaller subproblem, which in turn depends on a still smaller one, and the recursion continues until it reaches a trivial base case (here, an empty prefix).

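Notation:

Xi = <x1, ..., xi> is the prefix made up of the first i characters of sequence X (1 ≤ i ≤ m).

Yj = <y1, ..., yj> is the prefix made up of the first j characters of sequence Y (1 ≤ j ≤ n).

Assume Z = <z1, ..., zk> ∈ LCS(X, Y), that is, Z is one longest common subsequence of X and Y.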

    • If xm = yn (the last characters are the same), it is not hard to prove by contradiction that this character must be the last character of every longest common subsequence Z (of length k) of X and Y, that is, zk = xm = yn. The prefix Zk-1 of Z then satisfies Zk-1 ∈ LCS(Xm-1, Yn-1): Zk-1 is a longest common subsequence of Xm-1 and Yn-1. The problem therefore reduces to finding the LCS of Xm-1 and Yn-1 (the length of LCS(X, Y) equals the length of LCS(Xm-1, Yn-1) plus 1).

    • If xm ≠ yn, it is likewise not hard to prove by contradiction that either Z ∈ LCS(Xm-1, Y) or Z ∈ LCS(X, Yn-1). At least one of zk ≠ xm and zk ≠ yn must hold: if zk ≠ xm, then Z ∈ LCS(Xm-1, Y); similarly, if zk ≠ yn, then Z ∈ LCS(X, Yn-1). The problem therefore reduces to finding the LCS of Xm-1 and Y and the LCS of X and Yn-1. The length of LCS(X, Y) is max{length of LCS(Xm-1, Y), length of LCS(X, Yn-1)}.
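
The two cases above translate almost word for word into a recursive function. The following is a minimal sketch rather than an efficient implementation: the function name lcs_len_naive and its parameters are chosen here for illustration, and the running time is exponential because the same prefixes are examined repeatedly.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a longest common subsequence of the prefixes x[0..m-1]
   and y[0..n-1]. Direct transcription of the two cases. */
size_t lcs_len_naive(const char *x, size_t m, const char *y, size_t n)
{
    assert(x != NULL && y != NULL);
    if (m == 0 || n == 0)
        return 0;                       /* an empty prefix shares nothing */
    if (x[m - 1] == y[n - 1])           /* case 1: last characters match */
        return lcs_len_naive(x, m - 1, y, n - 1) + 1;
    /* case 2: drop the last character of one sequence, keep the better result */
    size_t drop_x = lcs_len_naive(x, m - 1, y, n);
    size_t drop_y = lcs_len_naive(x, m, y, n - 1);
    return drop_x > drop_y ? drop_x : drop_y;
}
```

For the strings used in this article, lcs_len_naive("mzjawxu", 7, "xmjyauz", 7) returns 4, the length of "mjau".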

Comments

As an example,

X = m z j a w x u

Y = x m j y a u z

Because x7 ≠ y7 (u ≠ z), the longest common subsequence of X and Y must come from one of the following two subproblems:

X = m z j a w x u and Y' = x m j y a u, that is, LCS(X, Yn-1)

X' = m z j a w x and Y = x m j y a u z, that is, LCS(Xm-1, Y)

When xm ≠ yn, the lengths of LCS(Xm-1, Y) and LCS(X, Yn-1) are not independent of each other: computing either one requires the length of LCS(Xm-1, Yn-1), so the subproblems overlap. In addition, an LCS of two sequences contains an LCS of their prefixes, so the problem has the optimal substructure property. Together these two facts make dynamic programming a natural fit.

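In other words, solving the LCS problem involves three quantities: (1) the length of LCS(Xm-1, Yn-1) plus 1, used when xm = yn; (2) the lengths of LCS(Xm-1, Y) and LCS(X, Yn-1); and (3) max{length of LCS(Xm-1, Y), length of LCS(X, Yn-1)}, used when xm ≠ yn.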
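
Because the subproblems overlap, each one only needs to be solved once. One way to exploit this is top-down recursion with a cache, sketched below. The names lcs_len_memo and lcs_memo_rec and the flat cache layout are assumptions made for this example, not part of the original article.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* memo[i * width + j] caches the LCS length of the first i characters
   of x and the first j characters of y; -1 means "not computed yet". */
static int lcs_memo_rec(const char *x, const char *y, size_t i, size_t j,
                        size_t width, int *memo)
{
    if (i == 0 || j == 0)
        return 0;                       /* empty prefix */
    int *slot = &memo[i * width + j];
    if (*slot >= 0)
        return *slot;                   /* subproblem already solved */
    if (x[i - 1] == y[j - 1]) {
        *slot = lcs_memo_rec(x, y, i - 1, j - 1, width, memo) + 1;
    } else {
        int up = lcs_memo_rec(x, y, i - 1, j, width, memo);
        int left = lcs_memo_rec(x, y, i, j - 1, width, memo);
        *slot = up > left ? up : left;
    }
    return *slot;
}

/* Returns the LCS length of x and y, or -1 if memory allocation fails. */
int lcs_len_memo(const char *x, const char *y)
{
    assert(x != NULL && y != NULL);
    size_t m = strlen(x), n = strlen(y), width = n + 1;
    size_t cells = (m + 1) * width;
    int *memo = malloc(cells * sizeof *memo);
    if (memo == NULL)
        return -1;
    for (size_t c = 0; c < cells; c++)
        memo[c] = -1;
    int len = lcs_memo_rec(x, y, m, n, width, memo);
    free(memo);
    return len;
}
```

Each of the (m+1)·(n+1) cache slots is filled at most once, so the work is proportional to m·n instead of growing exponentially.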

The structure of the longest common subsequence is represented as follows:

Let X = <x1, x2, ..., xm> and Y = <y1, y2, ..., yn> be two sequences, and let Z = <z1, z2, ..., zk> be one of their longest common subsequences. Then:

    1. If xm = yn, then zk = xm = yn and Zk-1 is a longest common subsequence of Xm-1 and Yn-1;
    2. If xm ≠ yn and zk ≠ xm, then Z is a longest common subsequence of Xm-1 and Y;
    3. If xm ≠ yn and zk ≠ yn, then Z is a longest common subsequence of X and Yn-1.

Here Xm-1 = <x1, x2, ..., xm-1>, Yn-1 = <y1, y2, ..., yn-1>, and Zk-1 = <z1, z2, ..., zk-1>.
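
The three cases can be condensed into a single recurrence for the length of the LCS. Here c[i, j] denotes the length of a longest common subsequence of Xi and Yj; the symbol c is introduced only for this illustration.

```latex
c[i,j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0,\\
c[i-1,j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j,\\
\max\bigl(c[i-1,j],\; c[i,j-1]\bigr) & \text{if } i, j > 0 \text{ and } x_i \neq y_j.
\end{cases}
```

The length of the LCS of the full sequences X and Y is then c[m, n].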

In the two-dimensional array (figure not shown), a black arrow marks a cell where a common element was found. That element belongs to a common subsequence: appended to the longest common subsequence of the preceding elements, it forms the whole longest common subsequence. The value of each cell is the length of the longest common subsequence of the corresponding prefixes. For the cell at position (xi, yj): if the row and column characters are equal, its value is the value at (xi-1, yj-1) plus 1; if they are not equal, its value is max((xi-1, yj), (xi, yj-1)). The array therefore maps the recursive formula directly. It records the matches between all pairs of prefixes, starting from the beginnings of x and y and extending along the strings. Finally, the largest number in the array, found in the bottom-right cell, is the length of the longest common subsequence.
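
To see the array for any pair of strings, it can be filled and printed with a short sketch like the one below. The function names lcs_fill_table and lcs_print_table and the row-major layout are assumptions made for this example; the caller is assumed to supply a buffer of (strlen(x)+1) * (strlen(y)+1) ints.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fills table so that table[i * (n + 1) + j] is the LCS length of the
   first i characters of x and the first j characters of y. */
void lcs_fill_table(const char *x, const char *y, int *table)
{
    assert(x != NULL && y != NULL && table != NULL);
    size_t m = strlen(x), n = strlen(y), w = n + 1;
    for (size_t i = 0; i <= m; i++) {
        for (size_t j = 0; j <= n; j++) {
            if (i == 0 || j == 0) {
                table[i * w + j] = 0;                 /* empty prefix */
            } else if (x[i - 1] == y[j - 1]) {
                table[i * w + j] = table[(i - 1) * w + (j - 1)] + 1;
            } else {
                int up = table[(i - 1) * w + j];
                int left = table[i * w + (j - 1)];
                table[i * w + j] = up > left ? up : left;
            }
        }
    }
}

/* Prints the table with y across the top and x down the left side. */
void lcs_print_table(const char *x, const char *y, const int *table)
{
    size_t m = strlen(x), n = strlen(y), w = n + 1;
    printf("    ");                                   /* skip label and column 0 */
    for (size_t j = 0; j < n; j++)
        printf(" %c", y[j]);
    printf("\n");
    for (size_t i = 0; i <= m; i++) {
        printf("%c ", i == 0 ? ' ' : x[i - 1]);
        for (size_t j = 0; j <= n; j++)
            printf(" %d", table[i * w + j]);
        printf("\n");
    }
}
```

For "mzjawxu" and "xmjyauz" the buffer needs 8 * 8 = 64 ints, and the bottom-right entry, table[7 * 8 + 7], holds 4.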

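#include <stdio.h>
#include <string.h>

int data[8][8] = {0};              // initialized to all zeros; row 0 and column 0 are the empty-prefix base case
char *s1 = "mzjawxu";
char *s2 = "xmjyauz";
void print(int i, int j);

int main() {
    int m = (int)strlen(s1);       // both strings have 7 characters,
    int n = (int)strlen(s2);       // so indices 1..7 fit in data[8][8]
    int i, j;
    for (i = 1; i <= m; i++) {
        for (j = 1; j <= n; j++) {
            if (*(s1+i-1) == *(s2+j-1)) {   // equal characters: value of the diagonal cell plus one
                data[i][j] = data[i-1][j-1] + 1;
            } else {                        // different characters: maximum of the left and upper cells,
                                            // i.e. the longer of the LCS of (Xi, Yj-1) and (Xi-1, Yj)
                data[i][j] = data[i][j-1] > data[i-1][j] ? data[i][j-1] : data[i-1][j];
            }
        }
    }
    print(m, n);                   // start the traceback from the bottom-right cell
    printf("\n");
    return 0;
}

// recursively print the longest common subsequence
void print(int i, int j) {
    if (i == 0 || j == 0)
        return;
    if (*(s1+i-1) == *(s2+j-1)) {
        print(i-1, j-1);
        printf("%c", *(s1+i-1));
    } else if (data[i][j-1] > data[i-1][j]) {
        print(i, j-1);
    } else {
        print(i-1, j);
    }
}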
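
The program above is tied to two fixed 7-character strings and prints the result directly. As a variation, the same table and traceback can be wrapped in a function that works for strings of any length and writes the subsequence into a caller-supplied buffer. This is only a sketch under assumptions: the name lcs_string, its parameters, and the -1 error convention are invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Writes one LCS of x and y into out (NUL-terminated) and returns its
   length, or -1 if allocation fails or out is too small. */
int lcs_string(const char *x, const char *y, char *out, size_t out_size)
{
    assert(x != NULL && y != NULL && out != NULL);
    size_t m = strlen(x), n = strlen(y), w = n + 1;
    int *t = calloc((m + 1) * w, sizeof *t);   /* row 0 and column 0 stay 0 */
    if (t == NULL)
        return -1;

    for (size_t i = 1; i <= m; i++) {
        for (size_t j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1]) {
                t[i * w + j] = t[(i - 1) * w + (j - 1)] + 1;
            } else {
                int left = t[i * w + (j - 1)], up = t[(i - 1) * w + j];
                t[i * w + j] = left > up ? left : up;
            }
        }
    }

    int len = t[m * w + n];
    if ((size_t)len + 1 > out_size) {          /* need room for the NUL too */
        free(t);
        return -1;
    }

    /* Walk back from the bottom-right cell, filling out from the end. */
    out[len] = '\0';
    size_t i = m, j = n;
    int k = len;
    while (i > 0 && j > 0) {
        if (x[i - 1] == y[j - 1]) {
            out[--k] = x[i - 1];
            i--;
            j--;
        } else if (t[i * w + (j - 1)] > t[(i - 1) * w + j]) {
            j--;
        } else {
            i--;
        }
    }
    free(t);
    return len;
}
```

With a 16-byte buffer buf, lcs_string("mzjawxu", "xmjyauz", buf, sizeof buf) returns 4 and leaves "mjau" in buf. When several longest common subsequences exist, the tie-breaking rule in the traceback decides which one is produced.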
