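LCS here stands for the longest common substring: the longest contiguous run of characters that appears in both of two strings. (It should not be confused with the longest common subsequence, which does not require the characters to be adjacent.)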
Link: http://blog.csdn.net/zztfj/article/details/6157429
For example:
String str1 = new String("adbccadebbca");
String str2 = new String("edabccadece");
The longest common substring of str1 and str2 is bccade.
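As a baseline for checking results like the one above, here is a minimal brute-force sketch. It is not the method this article develops; the class name NaiveCommonSubstring and the method longest are made up for illustration, and the approach is deliberately simple rather than fast.

```java
final class NaiveCommonSubstring {
    // Returns the first longest substring of s that also occurs in t.
    static String longest(String s, String t) {
        String best = "";
        for (int start = 0; start < s.length(); start++) {
            // Only try candidates that are longer than the current best.
            for (int end = start + best.length() + 1; end <= s.length(); end++) {
                String candidate = s.substring(start, end);
                if (!t.contains(candidate)) {
                    break; // a longer candidate from this start cannot match either
                }
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longest("adbccadebbca", "edabccadece")); // prints bccade
    }
}
```

Every candidate is checked with String.contains, so this does far more work than the matrix method described next, but it is handy for verifying small examples.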
The solution is to use a matrix that records, for every pair of positions in the two strings, whether the two characters match: 1 if they match, 0 otherwise. The longest run of 1s along a diagonal then gives the longest common substring, and its position in the matrix gives the substring's position in each string.
The following is the matching matrix for the strings 21232523311324 and 312123223445. The former runs in the X direction (one column per character) and the latter in the Y direction (one row per character). The longest diagonal run of 1s goes from row 3, column 1 to row 7, column 5, so the longest common substring is 21232.
0 0 0 1 0 0 0 1 1 0 0 1 0 0
0 1 0 0 0 0 0 0 0 1 1 0 0 0
1 0 1 0 1 0 1 0 0 0 0 0 1 0
0 1 0 0 0 0 0 0 0 1 1 0 0 0
1 0 1 0 1 0 1 0 0 0 0 0 1 0
0 0 0 1 0 0 0 1 1 0 0 1 0 0
1 0 1 0 1 0 1 0 0 0 0 0 1 0
1 0 1 0 1 0 1 0 0 0 0 0 1 0
0 0 0 1 0 0 0 1 1 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0 0 0 0 0
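The 0/1 matrix and the search for the longest diagonal run can be sketched as follows. This is an illustrative sketch only: the names MatchMatrixDemo, matchMatrix and longestDiagonalRun are assumptions, not part of the original article.

```java
final class MatchMatrixDemo {
    // m[i][j] is 1 when y.charAt(i) == x.charAt(j), otherwise 0.
    static int[][] matchMatrix(String x, String y) {
        int[][] m = new int[y.length()][x.length()];
        for (int i = 0; i < y.length(); i++) {
            for (int j = 0; j < x.length(); j++) {
                m[i][j] = (y.charAt(i) == x.charAt(j)) ? 1 : 0;
            }
        }
        return m;
    }

    // Follows every diagonal run of 1s and returns the part of x under the longest one.
    static String longestDiagonalRun(String x, String y) {
        int[][] m = matchMatrix(x, y);
        int bestLen = 0;
        int bestStartCol = 0;
        for (int i = 0; i < y.length(); i++) {
            for (int j = 0; j < x.length(); j++) {
                // Start counting only at the head of a run.
                boolean runStart = m[i][j] == 1
                        && (i == 0 || j == 0 || m[i - 1][j - 1] == 0);
                if (!runStart) {
                    continue;
                }
                int len = 0;
                while (i + len < y.length() && j + len < x.length()
                        && m[i + len][j + len] == 1) {
                    len++;
                }
                if (len > bestLen) {
                    bestLen = len;
                    bestStartCol = j;
                }
            }
        }
        return x.substring(bestStartCol, bestStartCol + bestLen);
    }

    public static void main(String[] args) {
        // X is the column string, Y is the row string, as in the matrix above.
        System.out.println(longestDiagonalRun("21232523311324", "312123223445")); // prints 21232
    }
}
```

Note that the matrix is built first and scanned afterwards, which is exactly the second pass that the improved method removes.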
However, finding the longest diagonal run of 1s in this 0/1 matrix takes a separate pass over the matrix. By changing how the matrix is generated and keeping a few marker variables, that extra pass can be avoided. The new matrix is generated as follows:
0 0 0 1 0 0 0 1 1 0 0 1 0 0
0 1 0 0 0 0 0 0 0 2 1 0 0 0
1 0 2 0 1 0 1 0 0 0 0 0 1 0
0 2 0 0 0 0 0 0 0 1 1 0 0 0
1 0 3 0 1 0 1 0 0 0 0 0 1 0
0 0 0 4 0 0 0 2 1 0 0 1 0 0
1 0 1 0 5 0 1 0 0 0 0 0 2 0
1 0 1 0 1 0 1 0 0 0 0 0 1 0
0 0 0 2 0 0 0 2 1 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0 0 0 0 0
When two characters match, we do not simply assign 1 to the corresponding element; we assign the value of the element to its upper left plus one. Marker variables record the largest value in the matrix and its position. As each element is generated, we check whether it is the largest so far and, if so, update the markers. By the time the matrix is complete, the position and length of the longest common substring are already known. The algorithm is as follows:
#include <cassert>
#include <iostream>
#include <vector>

void LCS(const char a[], const char b[], int len1, int len2) {
    assert(len1 > 0 && len2 > 0);
    int max = 0;           // length of the longest common substring so far
    int row = 0, col = 0;  // row and column of the cell where it ends
    // (len2 + 1) x (len1 + 1) matrix initialized to 0; row 0 and column 0 act as a border
    std::vector<std::vector<int>> c(len2 + 1, std::vector<int>(len1 + 1, 0));
    for (int i = 1; i < len2 + 1; ++i)
        for (int j = 1; j < len1 + 1; ++j)
            if (b[i - 1] == a[j - 1]) {
                c[i][j] = c[i - 1][j - 1] + 1;
                if (c[i][j] > max) {
                    max = c[i][j];
                    row = i;
                    col = j;
                }
            }
    // The substring ends at b[row - 1] (equivalently a[col - 1]) and has length max.
    for (int k = row - max; k <= row - 1; k++)
        std::cout << b[k];
}
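To check the improved matrix for 21232523311324 and 312123223445 against the upper-left-plus-one rule, the table can be regenerated with a short Java sketch. The names CountMatrixDemo and countMatrix are hypothetical, and unlike the listing above this version has no border row or column, so the first row and column are handled with an explicit test.

```java
final class CountMatrixDemo {
    // c[i][j] = length of the common run that ends at y.charAt(i) and x.charAt(j).
    static int[][] countMatrix(String x, String y) {
        int[][] c = new int[y.length()][x.length()];
        for (int i = 0; i < y.length(); i++) {
            for (int j = 0; j < x.length(); j++) {
                if (y.charAt(i) == x.charAt(j)) {
                    // Upper-left neighbour plus one; cells in the first row
                    // or first column have no such neighbour and start at 1.
                    c[i][j] = (i == 0 || j == 0) ? 1 : c[i - 1][j - 1] + 1;
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        String x = "21232523311324"; // X direction (columns)
        String y = "312123223445";   // Y direction (rows)
        for (int[] rowValues : countMatrix(x, y)) {
            StringBuilder line = new StringBuilder();
            for (int v : rowValues) {
                line.append(v).append(' ');
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

The largest entry, 5, sits in row 7, column 5 (counting from 1), which is where the common substring 21232 ends in each string.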
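This is faster, but it uses a lot of space. Notice that in the improved method each row depends only on the row immediately before it, so earlier rows are no longer needed once a row has been generated. A one-dimensional array is therefore enough, provided each row is filled from right to left so that c[j-1] still holds the previous row's value when c[j] is computed. The final code is as follows: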
public class LCString2 {
    public static void getLCString(char[] str1, char[] str2)
    {
        int i, j;
        int len1, len2;
        len1 = str1.length;
        len2 = str2.length;
        int maxLen = len1 > len2 ? len1 : len2;
        int[] max = new int[maxLen];      // max[k]: length of the k-th longest substring found
        int[] maxIndex = new int[maxLen]; // maxIndex[k]: index in str1 where that substring ends
        int[] c = new int[maxLen];        // the current row of the matrix
        for (i = 0; i < len2; i++)
        {
            // Right to left, so that c[j - 1] still holds the previous row's value.
            for (j = len1 - 1; j >= 0; j--)
            {
                if (str2[i] == str1[j])
                {
                    if ((i == 0) || (j == 0))
                        c[j] = 1;
                    else
                        c[j] = c[j - 1] + 1;
                }
                else
                {
                    c[j] = 0;
                }
                if (c[j] > max[0])
                {   // A new longest substring: it is the only one for now, so clear the later entries.
                    max[0] = c[j];
                    maxIndex[0] = j;
                    for (int k = 1; k < maxLen; k++)
                    {
                        max[k] = 0;
                        maxIndex[k] = 0;
                    }
                }
                else if (c[j] == max[0])
                {   // There are several substrings of the same length.
                    for (int k = 1; k < maxLen; k++)
                    {
                        if (max[k] == 0)
                        {
                            max[k] = c[j];
                            maxIndex[k] = j;
                            break; // Append one entry at the end and leave the loop.
                        }
                    }
                }
            }
        }
        for (j = 0; j < maxLen; j++)
        {
            if (max[j] > 0)
            {
                System.out.println("Common substring " + (j + 1) + ":");
                for (i = maxIndex[j] - max[j] + 1; i <= maxIndex[j]; i++)
                    System.out.print(str1[i]);
                System.out.println("");
            }
        }
    }
    public static void main(String[] args) {
        String str1 = new String("adbba1234");
        String str2 = new String("adbbf1234sa");
        getLCString(str1.toCharArray(), str2.toCharArray());
    }
}
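A more compact variant of the same one-dimensional idea is sketched below. It returns the results instead of printing them and stores ties in a List rather than in fixed-size arrays. The names RollingArrayDemo and longestCommonSubstrings are assumptions for illustration, not part of the listing above.

```java
import java.util.ArrayList;
import java.util.List;

final class RollingArrayDemo {
    // Returns every occurrence of a longest common substring, in the order found.
    static List<String> longestCommonSubstrings(String s1, String s2) {
        int[] c = new int[s1.length()];         // one row of the matrix, reused for every row
        int best = 0;                           // longest run seen so far
        List<Integer> ends = new ArrayList<>(); // end indices in s1 of runs of length best
        for (int i = 0; i < s2.length(); i++) {
            // Right to left: c[j - 1] has not been overwritten yet,
            // so it is still the upper-left value from the previous row.
            for (int j = s1.length() - 1; j >= 0; j--) {
                if (s2.charAt(i) == s1.charAt(j)) {
                    // On the first row c[j - 1] is still its initial 0, so only j == 0 needs a guard.
                    c[j] = (j == 0) ? 1 : c[j - 1] + 1;
                } else {
                    c[j] = 0;
                }
                if (c[j] > best) {
                    best = c[j];
                    ends.clear();
                    ends.add(j);
                } else if (c[j] == best && best > 0) {
                    ends.add(j);
                }
            }
        }
        List<String> result = new ArrayList<>();
        for (int end : ends) {
            result.add(s1.substring(end - best + 1, end + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(longestCommonSubstrings("adbba1234", "adbbf1234sa")); // prints [adbb, 1234]
    }
}
```

If the same text occurs as a longest match at more than one position, it appears in the returned list once per occurrence; removing duplicates is left to the caller.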