Common Subsequence
Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others)
Total Submission(s): 17390    Accepted Submission(s): 7290
Problem Description
A subsequence of a given sequence is the given sequence with some elements (possibly none) left out. Given a sequence X = <x1, x2, ..., xm>, another sequence Z = <z1, z2, ..., zk> is a subsequence of X if there exists a strictly increasing sequence <i1, i2, ..., ik> of indices of X such that for all j = 1, 2, ..., k, x_ij = z_j. For example, Z = <a, b, f, c> is a subsequence of X = <a, b, c, f, b, c> with index sequence <1, 2, 4, 6>. Given two sequences X and Y, the problem is to find the length of the maximum-length common subsequence of X and Y.
The program input is from a text file. Each data set in the file contains two strings representing the given sequences. The sequences are separated by any number of white spaces. The input data are correct. For each set of data the program prints on the standard output the length of the maximum-length common subsequence from the beginning of a separate line.
Sample Input
abcfbc abfcab
programming contest
abcd mnp
Sample Output
4
2
0
Source: Southeastern Europe 2003
Recommend: Ignatius
Solution: This problem asks for the length of the longest common subsequence of two strings.
We first introduce the method for the longest common substring, since the subsequence case is handled in a similar way.
Longest Common Substring (LCS)
Find the longest common substring of two strings; the substring must be contiguous in both original strings. This is a sequential decision problem, which can be solved with dynamic programming. We use a two-dimensional matrix to record the intermediate results: a cell is 1 if the character of its row equals the character of its column, and 0 otherwise. For example, take "BAB" and "CABA" (of course, we can see at a glance that the longest common substring is "BA" or "AB"):
  B A B
C 0 0 0
A 0 1 0
B 1 0 1
A 0 1 0
The longest diagonal run of 1s in the matrix corresponds to the longest common substring.
However, searching the matrix for the longest diagonal run of 1s is itself time-consuming. We improve on this as follows: whenever a cell would be filled with 1, set it instead to the element in its upper-left corner plus 1.
B A B
C 0 0 0
A 0 1 0
B 1 0 2
A 0 2 0
In this way, the maximum element in the matrix is the length of the longest common substring.
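The improved matrix can be sketched in C++ roughly as follows. This is only an illustration of the rule "upper-left element plus 1"; the function name longestCommonSubstring and the use of std::vector are assumptions of this sketch, not part of the original write-up.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Length of the longest common substring of s and t.
// cell[i][j] is the length of the common run that ends at s[i-1] and t[j-1];
// row 0 and column 0 are padding so the upper-left cell always exists.
int longestCommonSubstring(const std::string& s, const std::string& t) {
    std::vector<std::vector<int>> cell(s.size() + 1,
                                       std::vector<int>(t.size() + 1, 0));
    int best = 0;
    for (std::size_t i = 1; i <= s.size(); ++i) {
        for (std::size_t j = 1; j <= t.size(); ++j) {
            if (s[i - 1] == t[j - 1]) {
                cell[i][j] = cell[i - 1][j - 1] + 1;  // upper-left element plus 1
                best = std::max(best, cell[i][j]);    // track the maximum element
            }
        }
    }
    return best;
}
```

Called with "CABA" and "BAB", the cells it fills match the matrix above and the maximum element is 2.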
While constructing the matrix, each row depends only on the row directly above it, so once a row has been computed the row above it is no longer needed. In the program, a one-dimensional array can therefore replace the matrix.
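One possible way to use a one-dimensional array is sketched below. The function name longestCommonSubstringRolling is hypothetical, and the right-to-left inner loop is a choice made for this sketch: it keeps the value of the row above available in row[j - 1] until it has been read.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Longest common substring length using a single row instead of a matrix.
// Before row[j] is overwritten it holds the value of the row above;
// row[0] is padding and always stays 0.
int longestCommonSubstringRolling(const std::string& s, const std::string& t) {
    std::vector<int> row(t.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= s.size(); ++i) {
        // Right to left, so row[j - 1] is still the upper-left value.
        for (std::size_t j = t.size(); j >= 1; --j) {
            if (s[i - 1] == t[j - 1]) {
                row[j] = row[j - 1] + 1;
                best = std::max(best, row[j]);
            } else {
                row[j] = 0;  // a mismatch breaks the diagonal run
            }
        }
    }
    return best;
}
```

The memory use drops from one cell per pair of characters to one cell per character of t, while the answer stays the same.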
Similarly, the longest common subsequence is processed.
Longest Common Subsequence
The difference between the longest common subsequence and the longest common substring is that the subsequence does not need to be contiguous in the original strings. For example, the longest common subsequence of ADE and ABCDE is ADE.
We again use dynamic programming to solve this problem. First, find the state transition equation.
Notation: C1 is the rightmost character of S1, C2 is the rightmost character of S2, S1' is S1 with C1 removed, and S2' is S2 with C2 removed.
LCS(S1, S2) is the longest of the following three items:
(1) LCS(S1, S2')
(2) LCS(S1', S2)
(3) LCS(S1', S2') if C1 is not equal to C2; LCS(S1', S2') + C1 if C1 is equal to C2
Boundary condition: if S1 or S2 is the empty string, the result is also the empty string.
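The three cases can be transcribed almost literally into a recursive function. The sketch below is for illustration only: the name lcsRecursive is hypothetical, it recomputes the same subproblems many times (exponential running time), and it stops as soon as either string is empty so that S1' and S2' are always well defined.

```cpp
#include <string>

// Returns one longest common subsequence of s1 and s2 by direct recursion.
std::string lcsRecursive(const std::string& s1, const std::string& s2) {
    if (s1.empty() || s2.empty()) return "";           // boundary condition
    const char c1 = s1.back();
    const char c2 = s2.back();
    const std::string s1p = s1.substr(0, s1.size() - 1);  // S1'
    const std::string s2p = s2.substr(0, s2.size() - 1);  // S2'

    std::string best = lcsRecursive(s1, s2p);          // case (1)
    std::string cand = lcsRecursive(s1p, s2);          // case (2)
    if (cand.size() > best.size()) best = cand;
    cand = lcsRecursive(s1p, s2p);                     // case (3)
    if (c1 == c2) cand += c1;                          // append C1 when C1 == C2
    if (cand.size() > best.size()) best = cand;
    return best;
}
```

This is usable only on short strings such as ADE and ABCDE; the matrix approach avoids the repeated work.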
Next we again build a matrix to store the solutions to the subproblems of the dynamic programming process. Each number in the matrix is the length of the LCS of the two prefixes that end at that row and that column. Following the state transition equation above, the number in each cell is the maximum of the following three items:
(1) the number in the cell above
(2) the number in the cell to the left
(3) the number in the upper-left cell (if C1 is not equal to C2); the number in the upper-left cell plus 1 (if C1 is equal to C2)
For example:
    G C T A
  0 0 0 0 0
G 0 1 1 1 1
B 0 1 1 1 1
T 0 1 1 2 2
A 0 1 1 2 3
When filling in the last number, the three candidates are:
(1) the number 2 above
(2) the number 2 on the left
(3) the number 2 in the upper-left corner plus 1, which gives 3, because C1 = C2
Therefore, the final result is 3.
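A minimal sketch of the filling rule, assuming the row string is GBTA and the column string is GCTA as in the example; the function name lcsTable is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Builds the LCS length matrix. t[i][j] is the LCS length of the first i
// characters of rows and the first j characters of cols; row 0 and
// column 0 are the all-zero border.
std::vector<std::vector<int>> lcsTable(const std::string& rows,
                                       const std::string& cols) {
    std::vector<std::vector<int>> t(rows.size() + 1,
                                    std::vector<int>(cols.size() + 1, 0));
    for (std::size_t i = 1; i <= rows.size(); ++i) {
        for (std::size_t j = 1; j <= cols.size(); ++j) {
            // Upper-left cell, plus 1 when the two characters are equal.
            const int diag = t[i - 1][j - 1] + (rows[i - 1] == cols[j - 1] ? 1 : 0);
            // Maximum of the cell above, the cell to the left, and diag.
            t[i][j] = std::max({t[i - 1][j], t[i][j - 1], diag});
        }
    }
    return t;
}
```

For lcsTable("GBTA", "GCTA") the bottom-right cell t[4][4] holds the final result.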
During the filling process, we record which cell the number in the current cell came from, so that we can trace back and recover the longest common subsequence. Sometimes several of the three candidates (upper-left, left, and top) reach the maximum at the same time; in that case take any one of them, but follow a fixed priority throughout the process. In my code, the priority is upper-left > top.
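The recording and trace-back step might look like the sketch below. It is not the author's code: the name lcsBacktrack and the Move enum are hypothetical, and the tie-breaking order used here (upper-left, then top, then left) is an assumption chosen to be consistent with the stated priority.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns one longest common subsequence of a and b by filling the matrix,
// recording where each cell's value came from, and tracing back.
std::string lcsBacktrack(const std::string& a, const std::string& b) {
    enum Move : char { UpLeft, Up, Left };
    const std::size_t n = a.size(), m = b.size();
    std::vector<std::vector<int>> len(n + 1, std::vector<int>(m + 1, 0));
    std::vector<std::vector<Move>> from(n + 1, std::vector<Move>(m + 1, UpLeft));

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int diag = len[i - 1][j - 1] + (a[i - 1] == b[j - 1] ? 1 : 0);
            const int up = len[i - 1][j];
            const int left = len[i][j - 1];
            // Fixed priority on ties (assumed): upper-left, then top, then left.
            if (diag >= up && diag >= left) { len[i][j] = diag; from[i][j] = UpLeft; }
            else if (up >= left)            { len[i][j] = up;   from[i][j] = Up; }
            else                            { len[i][j] = left; from[i][j] = Left; }
        }
    }

    // Walk back from the bottom-right cell, collecting matched characters.
    std::string out;
    std::size_t i = n, j = m;
    while (i > 0 && j > 0) {
        switch (from[i][j]) {
            case UpLeft:
                if (a[i - 1] == b[j - 1]) out += a[i - 1];
                --i; --j;
                break;
            case Up:   --i; break;
            case Left: --j; break;
        }
    }
    std::reverse(out.begin(), out.end());  // characters were collected backwards
    return out;
}
```

A different fixed priority may return a different subsequence when several exist, but always one of the same maximum length.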
This problem only asks for the length of the LCS, so no backtracking is needed. The code is as follows:
#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

// num[i][j] is the LCS length of the first i characters of a
// and the first j characters of b.
int num[1002][1002];

int main() {
    int i, j, k;
    char a[1002], b[1002];
    while (scanf("%s", a) != EOF) {
        scanf("%s", b);
        int stra = strlen(a);
        int strb = strlen(b);
        memset(num, 0, sizeof(num));
        for (i = 1; i <= stra; i++) {
            for (j = 1; j <= strb; j++) {
                k = num[i - 1][j - 1];            // upper-left cell
                if (a[i - 1] == b[j - 1]) k++;    // characters match: upper-left + 1
                num[i][j] = max(max(num[i][j - 1], num[i - 1][j]), k);  // state equation
            }
        }
        printf("%d\n", num[stra][strb]);
    }
    return 0;
}