Longest common subsequence and longest common substring with dynamic programming in Java

Source: Internet
Author: User

Dynamic Programming

Complex problems often cannot simply be broken down into a handful of independent subproblems; instead they break down into a whole series of subproblems. If we simply decompose the large problem into subproblems and combine their solutions to derive the solution of the large problem, the same subproblems are solved again and again, and the running time grows exponentially with the size of the problem.

To avoid repeatedly solving the same subproblem, an array is introduced: the solution of every subproblem is stored in the array, whether or not it turns out to be needed for the final solution. This is the basic method of dynamic programming.

[Problem] Find the longest common subsequence of two character sequences.

Problem description: A subsequence of a character sequence is a character sequence formed by removing any number of characters (possibly none) from the given sequence; the removed characters need not be consecutive. Given a character sequence X = "x0, x1, ..., xm-1", a sequence Y = "y0, y1, ..., yk-1" is a subsequence of X if there is a strictly increasing sequence of indices of X, <i0, i1, ..., ik-1>, such that x_ij = y_j for all j = 0, 1, ..., k-1. For example, if X = "ABCBDAB", then Y = "BCDB" is a subsequence of X.
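The definition above can be checked mechanically. The following is a minimal sketch, not part of the original article; the class name SubsequenceCheck and the method isSubsequence are illustrative names chosen here. It scans X once and greedily matches the characters of Y in order, which implicitly builds the strictly increasing index sequence.

```java
class SubsequenceCheck {
    // Returns true if y can be obtained from x by deleting zero or more characters.
    // (Illustrative helper; the name is an assumption, not from the article.)
    static boolean isSubsequence(String y, String x) {
        int j = 0; // index of the next character of y that still has to be matched
        for (int i = 0; i < x.length() && j < y.length(); i++) {
            if (x.charAt(i) == y.charAt(j)) {
                j++; // x[i] is taken as the match for y[j]; indices into x only increase
            }
        }
        return j == y.length();
    }

    public static void main(String[] args) {
        System.out.println(isSubsequence("BCDB", "ABCBDAB")); // true
        System.out.println(isSubsequence("BDCC", "ABCBDAB")); // false
    }
}
```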

Consider how to break the longest common subsequence problem down into subproblems. Let A = "a0, a1, ..., am-1", B = "b0, b1, ..., bn-1", and let Z = "z0, z1, ..., zk-1" be a longest common subsequence of A and B. It is not hard to prove that the following properties hold:

(1) If am-1 = bn-1, then zk-1 = am-1 = bn-1, and "z0, z1, ..., zk-2" is a longest common subsequence of "a0, a1, ..., am-2" and "b0, b1, ..., bn-2";

(2) If am-1 != bn-1 and zk-1 != am-1, then "z0, z1, ..., zk-1" is a longest common subsequence of "a0, a1, ..., am-2" and "b0, b1, ..., bn-1";

(3) If am-1 != bn-1 and zk-1 != bn-1, then "z0, z1, ..., zk-1" is a longest common subsequence of "a0, a1, ..., am-1" and "b0, b1, ..., bn-2".

So, when searching for a longest common subsequence of A and B: if am-1 = bn-1, we solve one further subproblem, finding a longest common subsequence of "a0, a1, ..., am-2" and "b0, b1, ..., bn-2". If am-1 != bn-1, we solve two subproblems, finding a longest common subsequence of "a0, a1, ..., am-2" and "b0, b1, ..., bn-1", and a longest common subsequence of "a0, a1, ..., am-1" and "b0, b1, ..., bn-2", and we take the longer of the two as the longest common subsequence of A and B.
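The decomposition above translates almost literally into a recursive function. The sketch below is an illustration rather than the article's own code: the names LcsMemo, lcsLength, solve, and memo are assumptions introduced here. It computes only the length of the longest common subsequence and caches each subproblem in an array so that no subproblem is solved twice.

```java
import java.util.Arrays;

class LcsMemo {
    // memo[m][n] caches the LCS length of the first m characters of a and the
    // first n characters of b; -1 means "not computed yet".
    private static int[][] memo;

    static int lcsLength(String a, String b) {
        memo = new int[a.length() + 1][b.length() + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1);
        }
        return solve(a, b, a.length(), b.length());
    }

    private static int solve(String a, String b, int m, int n) {
        if (m == 0 || n == 0) {
            return 0; // an empty prefix has no common subsequence with anything
        }
        if (memo[m][n] != -1) {
            return memo[m][n]; // subproblem already solved
        }
        int result;
        if (a.charAt(m - 1) == b.charAt(n - 1)) {
            // Property (1): the last characters match, so one subproblem remains.
            result = solve(a, b, m - 1, n - 1) + 1;
        } else {
            // Properties (2) and (3): solve both subproblems and keep the longer.
            result = Math.max(solve(a, b, m - 1, n), solve(a, b, m, n - 1));
        }
        memo[m][n] = result;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("ABCBDAB", "BDCABA")); // 4
    }
}
```

Without the memo array the same recursion would revisit the same prefix pairs many times; with it, each of the (m + 1) * (n + 1) entries is computed at most once.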

Solution:

Introduce a two-dimensional array c[][], where c[i][j] records the length of the longest common subsequence of the first i characters of X and the first j characters of Y, and a second array b[][], where b[i][j] records which subproblem the value c[i][j] was obtained from, so that the search direction can be determined when backtracking.
The table is filled from the bottom up, so before c[i][j] is computed, c[i-1][j-1], c[i-1][j], and c[i][j-1] have already been computed. We can then compute c[i][j] according to whether X[i] == Y[j] or X[i] != Y[j].

The recurrence for the problem is as follows:
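Written out from properties (1) to (3), and assuming that c[i][j] denotes the LCS length of the first i characters of X and the first j characters of Y (so x_i and y_j are the i-th and j-th characters, counted from 1), the recurrence can be stated as:

```latex
c[i][j] =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
c[i-1][j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j \\
\max\bigl(c[i-1][j],\; c[i][j-1]\bigr) & \text{if } i, j > 0 \text{ and } x_i \neq y_j
\end{cases}
```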


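Backtracking to output the longest common subsequence:

Start at b[m][n], the entry for the two full sequences, and follow the directions recorded in b. If b[i][j] records a match, X[i] belongs to the longest common subsequence and the search moves diagonally to (i-1, j-1); otherwise it moves up to (i-1, j) or left to (i, j-1), whichever subproblem c[i][j] was taken from. The search stops when i = 0 or j = 0.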

Algorithm analysis:
Since each call moves at least one step up or to the left (or both at once), i = 0 or j = 0 is reached after at most m + n calls, at which point the calls start to return. The returns retrace the same steps in the opposite direction, so backtracking takes O(m + n) time. Filling the tables c and b visits each of the m * n cells once and dominates the cost, so the time complexity of the whole algorithm is Θ(m * n).
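To make the step count concrete, the backtracking can also be written as a loop. This is a sketch under assumptions, not the article's implementation: the names LcsTraceback and lcs are invented here, and the loop compares characters directly instead of reading a separate direction table. Each iteration decreases i, j, or both, so the loop runs at most m + n times after the Θ(m * n) table fill.

```java
class LcsTraceback {
    // Returns one longest common subsequence of x and y.
    static String lcs(String x, String y) {
        int m = x.length();
        int n = y.length();
        // c[i][j] = LCS length of the first i chars of x and the first j chars of y.
        int[][] c = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (x.charAt(i - 1) == y.charAt(j - 1)) {
                    c[i][j] = c[i - 1][j - 1] + 1;
                } else {
                    c[i][j] = Math.max(c[i - 1][j], c[i][j - 1]);
                }
            }
        }
        // Walk back from (m, n); characters are collected in reverse order.
        StringBuilder reversed = new StringBuilder();
        int i = m;
        int j = n;
        while (i > 0 && j > 0) {
            if (x.charAt(i - 1) == y.charAt(j - 1)) {
                reversed.append(x.charAt(i - 1)); // match: move diagonally
                i--;
                j--;
            } else if (c[i - 1][j] >= c[i][j - 1]) {
                i--; // value came from the cell above
            } else {
                j--; // value came from the cell to the left
            }
        }
        return reversed.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(lcs("ABCBDAB", "BDCABA")); // BCBA
    }
}
```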

Java code implementation:

public class LCSProblem {
    public static void main(String[] args) {
        // The empty string at index 0 is kept so that getLength() can use
        // 1-based indices. It can be omitted, but then getLength() must
        // initialize the first row and column itself.
        String[] x = {"", "A", "B", "C", "B", "D", "A", "B"};
        String[] y = {"", "B", "D", "C", "A", "B", "A"};
        int[][] b = getLength(x, y);
        display(b, x, x.length - 1, y.length - 1);
    }

    /**
     * @param x first sequence (index 0 is a placeholder)
     * @param y second sequence (index 0 is a placeholder)
     * @return the array of records that determine the search direction
     */
    public static int[][] getLength(String[] x, String[] y) {
        int[][] b = new int[x.length][y.length];
        int[][] c = new int[x.length][y.length];
        for (int i = 1; i < x.length; i++) {
            for (int j = 1; j < y.length; j++) {
                if (x[i].equals(y[j])) {
                    // Corresponds to property (1); equals() compares content, not references.
                    c[i][j] = c[i - 1][j - 1] + 1;
                    b[i][j] = 1;
                } else if (c[i - 1][j] >= c[i][j - 1]) {
                    // Corresponds to property (2) or (3): take the value from above.
                    c[i][j] = c[i - 1][j];
                    b[i][j] = 0;
                } else {
                    // Corresponds to property (2) or (3): take the value from the left.
                    c[i][j] = c[i][j - 1];
                    b[i][j] = -1;
                }
            }
        }
        return b;
    }

    // Basic backtracking, implemented recursively.
    public static void display(int[][] b, String[] x, int i, int j) {
        if (i == 0 || j == 0) {
            return;
        }
        if (b[i][j] == 1) {
            display(b, x, i - 1, j - 1);
            System.out.print(x[i] + " ");
        } else if (b[i][j] == 0) {
            display(b, x, i - 1, j);
        } else if (b[i][j] == -1) {
            display(b, x, i, j - 1);
        }
    }
}

Longest common substring: this is similar to the longest common subsequence, but a common substring must be contiguous.
The Java implementation is as follows:

public class StringCompare {
    // The dynamic programming matrix is generated row by row, and once a row
    // has been generated the previous row is no longer needed. A one-dimensional
    // array is therefore enough, instead of the usual two-dimensional array.
    public static void getLCString(char[] str1, char[] str2) {
        int len1 = str1.length;
        int len2 = str2.length;
        int maxLen = len1 > len2 ? len1 : len2;
        int[] max = new int[maxLen];      // lengths of the longest common substrings found
        int[] maxIndex = new int[maxLen]; // index in str1 where each of them ends
        int[] c = new int[maxLen];
        int i, j;
        for (i = 0; i < len2; i++) {
            for (j = len1 - 1; j >= 0; j--) {
                if (str2[i] == str1[j]) {
                    if ((i == 0) || (j == 0)) {
                        c[j] = 1;
                    } else {
                        // c[j - 1] still holds the value from the previous row,
                        // because it has not been reassigned yet.
                        c[j] = c[j - 1] + 1;
                    }
                } else {
                    c[j] = 0;
                }
                if (c[j] > max[0]) {
                    // Strictly longer: this is now the only longest substring,
                    // so the other recorded entries must be cleared.
                    max[0] = c[j];
                    maxIndex[0] = j;
                    for (int k = 1; k < maxLen; k++) {
                        max[k] = 0;
                        maxIndex[k] = 0;
                    }
                } else if (c[j] == max[0]) {
                    // Several substrings share the same maximum length.
                    for (int k = 1; k < maxLen; k++) {
                        if (max[k] == 0) {
                            max[k] = c[j];
                            maxIndex[k] = j;
                            break; // append once at the end, then leave the loop
                        }
                    }
                }
            }
            for (int temp : c) {
                System.out.print(temp);
            }
            System.out.println();
        }
        // Print the longest common substrings.
        for (j = 0; j < maxLen; j++) {
            if (max[j] > 0) {
                System.out.println("Common substring " + (j + 1) + ":");
                for (i = maxIndex[j] - max[j] + 1; i <= maxIndex[j]; i++) {
                    System.out.print(str1[i]);
                }
                System.out.println(" ");
            }
        }
    }

    public static void main(String[] args) {
        String str1 = "binghaven";
        String str2 = "jingseven";
        getLCString(str1.toCharArray(), str2.toCharArray());
    }
}

Output:

000000000
010000000
002000001
000300000
000000000
000000010
000000100
000000020
001000003
Common substring 1:
ing
Common substring 2:
ven
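
Each row of digits printed above is the state of the array c after one character of str2 has been processed, that is, one row of the full table. As a sketch of the underlying recurrence (the symbols s for str2 and t for str1, both indexed from 0, are notation introduced here), c[i][j] is the length of the longest common substring that ends exactly at s_i and t_j:

```latex
c[i][j] =
\begin{cases}
1 & \text{if } s_i = t_j \text{ and } (i = 0 \text{ or } j = 0) \\
c[i-1][j-1] + 1 & \text{if } s_i = t_j \text{ and } i, j > 0 \\
0 & \text{if } s_i \neq t_j
\end{cases}
\qquad
\text{longest common substring length} = \max_{i,j}\, c[i][j]
```

Because row i depends only on row i-1, and the inner loop runs j from high to low, a single one-dimensional array can hold both rows at once: when c[j] is written, c[j-1] still contains the value from the previous row.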
