Longest Common Subsequence (Dynamic Programming)

Source: Internet
Author: User
Tags kinit

Question: If all the characters of string 1 appear in string 2 in the same order as they appear in string 1,

then string 1 is called a subsequence of string 2.

Note that the characters of the subsequence (string 1) do not have to appear consecutively in string 2.

Write a function that takes two strings, computes their longest common subsequence, and prints it.

For example, given the two strings bdcaba and abcbdab, both bcba and bdab are longest common subsequences,

so the output length is 4 and either subsequence may be printed.

 

Analysis: finding the longest common subsequence (LCS) is a classic dynamic programming problem.

The following analysis draws on another blog post.

Step 1. Characterize a longest common subsequence

First, a property of the LCS problem. Let X_m = {x_0, x_1, ..., x_{m-1}} and Y_n = {y_0, y_1, ..., y_{n-1}} be two strings,

and let Z_k = {z_0, z_1, ..., z_{k-1}} be any LCS of X and Y. Here X_{m-1} denotes X_m without its last character x_{m-1}, and likewise for Y_{n-1} and Z_{k-1}. Three properties follow:

1. If x_{m-1} = y_{n-1}, then z_{k-1} = x_{m-1} = y_{n-1}, and Z_{k-1} is an LCS of X_{m-1} and Y_{n-1};

2. If x_{m-1} ≠ y_{n-1} and z_{k-1} ≠ x_{m-1}, then Z is an LCS of X_{m-1} and Y;

3. If x_{m-1} ≠ y_{n-1} and z_{k-1} ≠ y_{n-1}, then Z is an LCS of X and Y_{n-1}.

 

Below is a short proof of each property:

1. Suppose z_{k-1} ≠ x_{m-1}. Then we can append x_{m-1} (which equals y_{n-1}) to Z to get Z', a common subsequence of X and Y of length k + 1.

This contradicts the fact that Z, of length k, is an LCS of X and Y. So we must have z_{k-1} = x_{m-1} = y_{n-1}.

Since z_{k-1} = x_{m-1} = y_{n-1}, deleting this last character from Z, X and Y gives Z_{k-1}, X_{m-1} and Y_{n-1}, and Z_{k-1} is clearly a common subsequence of X_{m-1} and Y_{n-1}. It remains to show that Z_{k-1} is an LCS of X_{m-1} and Y_{n-1}, which is easy to prove by contradiction. Suppose X_{m-1} and Y_{n-1} have a common subsequence W of length greater than k - 1. Appending x_{m-1} to W gives W', a common subsequence of X and Y of length greater than k, which contradicts the assumption that Z is an LCS.

2. Again by contradiction. Since z_{k-1} ≠ x_{m-1}, Z is a common subsequence of X_{m-1} and Y. If Z were not an LCS of X_{m-1} and Y, there would be a common subsequence W of X_{m-1} and Y of length greater than k. W would also be a common subsequence of X and Y, but by assumption the longest common subsequence of X and Y has length k. Contradiction.

3. The proof is the same as for 2.

 

Step 2. A recursive solution

Based on the properties above, we arrive at the following approach.

To find an LCS of the two strings X_m = {x_0, x_1, ..., x_{m-1}} and Y_n = {y_0, y_1, ..., y_{n-1}}:

If x_{m-1} = y_{n-1}, find an LCS of X_{m-1} and Y_{n-1} and append x_{m-1} (= y_{n-1}) to it (property 1 above);

If x_{m-1} ≠ y_{n-1}, find an LCS of X_{m-1} and Y and an LCS of X and Y_{n-1}; the longer of the two is an LCS of X and Y (properties 2 and 3 above).

 

From these conclusions we obtain the following formula.

Let c[i, j] denote the length of the LCS of the prefixes x_0 ... x_i and y_0 ... y_j. Then c[i, j] can be computed recursively:

          / 0                               if i < 0 or j < 0

c[i, j] = | c[i-1, j-1] + 1                 if i, j >= 0 and x_i = y_j

          \ max(c[i, j-1], c[i-1, j])       if i, j >= 0 and x_i ≠ y_j

 

This formula is easy to implement with a recursive function. However, as the analysis of computing the n-th Fibonacci number showed (question 19 of the "Microsoft and others" 100-question series, v0.1),

direct recursion performs a great deal of repeated computation, so a bottom-up iterative solution is more efficient.

 

To make an iterative solution possible, we use a matrix (see lcs_length in the code at the end of this article) to store each computed c[i, j].

When a later computation needs one of these values, it is read directly from the matrix.

 

In addition, c[i, j] is computed from c[i-1, j-1], c[i, j-1] or c[i-1, j],

which corresponds to moving one step in the matrix lcs_length from c[i-1, j-1], c[i, j-1] or c[i-1, j] to c[i, j].

So there are three possible moves in the matrix: from the left, from above, and from the top left. Only a move from the top left indicates that a character of the LCS has been found.

We therefore use a second matrix (see lcs_direction in the code at the end of this article) to record the direction of each move.

The C++ implementation is given below:

// Dynamic programming: longest common subsequence (LCS).
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Direction of the move that produced each cell of the length matrix.
enum DecreaseDir { kInit = 0, kLeft, kUp, kLeftUp };

typedef vector<vector<int> > Matrix;

void lcs_print(const Matrix& lcs_direction, const string& pstr1, const string& pstr2, int row, int col);

// Returns the LCS length of pstr1 and pstr2 and prints one LCS.
int LCS(const string& pstr1, const string& pstr2)
{
    int length1 = pstr1.length();
    int length2 = pstr2.length();
    if (!length1 || !length2)
        return 0;

    // lcs_length[i][j]: LCS length of pstr1[0..i] and pstr2[0..j].
    Matrix lcs_length(length1, vector<int>(length2, 0));
    // lcs_direction[i][j]: the move that led to lcs_length[i][j].
    Matrix lcs_direction(length1, vector<int>(length2, kInit));

    for (int i = 0; i < length1; ++i)
    {
        for (int j = 0; j < length2; ++j)
        {
            // Neighbours outside the matrix (i < 0 or j < 0) count as 0.
            int up = i > 0 ? lcs_length[i - 1][j] : 0;
            int left = j > 0 ? lcs_length[i][j - 1] : 0;
            int leftUp = (i > 0 && j > 0) ? lcs_length[i - 1][j - 1] : 0;

            if (pstr1[i] == pstr2[j])
            {
                lcs_length[i][j] = leftUp + 1;
                lcs_direction[i][j] = kLeftUp;
            }
            else if (up > left)
            {
                lcs_length[i][j] = up;
                lcs_direction[i][j] = kUp;
            }
            else
            {
                lcs_length[i][j] = left;
                lcs_direction[i][j] = kLeft;
            }
        }
    }

    lcs_print(lcs_direction, pstr1, pstr2, length1 - 1, length2 - 1);
    return lcs_length[length1 - 1][length2 - 1];
}

// Follows the recorded directions back from (row, col) and prints the
// LCS characters in order.
void lcs_print(const Matrix& lcs_direction, const string& pstr1, const string& pstr2, int row, int col)
{
    int length1 = pstr1.length();
    int length2 = pstr2.length();
    if (length1 == 0 || length2 == 0 || !(row < length1 && col < length2))
        return;

    if (lcs_direction[row][col] == kLeftUp)
    {
        // A top-left move means pstr1[row] belongs to the LCS.
        if (row > 0 && col > 0)
            lcs_print(lcs_direction, pstr1, pstr2, row - 1, col - 1);
        cout << pstr1[row];
    }
    else if (lcs_direction[row][col] == kLeft)
    {
        if (col > 0)
            lcs_print(lcs_direction, pstr1, pstr2, row, col - 1);
    }
    else if (lcs_direction[row][col] == kUp)
    {
        if (row > 0)
            lcs_print(lcs_direction, pstr1, pstr2, row - 1, col);
    }
}

int main()
{
    string str1 = "bdcaba";
    string str2 = "abcbdab";
    cout << "A longest common subsequence is: ";
    int length = LCS(str1, str2);
    cout << endl << "Its length is: " << length << endl;
    return 0;
}

