Reading notes on dynamic programming from Introduction to Algorithms: longest common subsequence & longest common substring (LCS)

Source: Internet
Author: User

from:http://my.oschina.net/leejun2005/blog/117167

1. First, the difference between the longest common subsequence and the longest common substring:

The longest common substring of two strings must be contiguous in both original strings. The longest common subsequence only has to keep its characters in their original order; they need not be contiguous.
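To make the distinction concrete, here is a minimal Java sketch. The class name SubstringVsSubsequence and both helper methods are hypothetical names chosen for illustration; they are not part of the original notes.

```java
final class SubstringVsSubsequence {

    // True if the characters of candidate appear in text in the same order,
    // not necessarily next to each other.
    static boolean isSubsequence(String candidate, String text) {
        int matched = 0;
        for (int i = 0; i < text.length() && matched < candidate.length(); i++) {
            if (text.charAt(i) == candidate.charAt(matched)) {
                matched++;
            }
        }
        return matched == candidate.length();
    }

    // True if candidate appears in text as one contiguous block.
    static boolean isSubstring(String candidate, String text) {
        return text.contains(candidate);
    }

    public static void main(String[] args) {
        System.out.println(isSubstring("ace", "abcde"));   // false: not contiguous
        System.out.println(isSubsequence("ace", "abcde")); // true: order is preserved
        System.out.println(isSubstring("bcd", "abcde"));   // true: contiguous
    }
}
```

Every substring is also a subsequence, but not the other way around, which is why the two problems need different recurrences.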

2. The longest common substring

This is in fact a multi-stage decision problem, which can be solved by dynamic programming. We use a two-dimensional matrix to record intermediate results: a cell is 1 if the character of its row equals the character of its column, and 0 otherwise. How is this matrix built? Take "bab" and "caba" as an example (of course we can see at a glance that the longest common substring is "ba" or "ab"):

       b  a  b
    c  0  0  0
    a  0  1  0
    b  1  0  1
    a  0  1  0

The longest common substring corresponds to the longest diagonal run of 1s in the matrix.

However, searching a two-dimensional matrix for the longest diagonal run of 1s is itself cumbersome and time-consuming, so we do the following instead: whenever a cell would be filled with 1, set it to the value of its upper-left neighbor plus 1 (a cell in the first row or first column is simply set to 1).

       b  a  b
    c  0  0  0
    a  0  1  0
    b  1  0  2
    a  0  2  0

The largest element in this matrix is the length of the longest common substring.
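The table-filling rule described above can be sketched in Java as follows. This is only an illustration under assumptions: the class CommonSubstringTable and its methods are hypothetical names, and the sketch returns just the first longest substring it encounters rather than all of them.

```java
final class CommonSubstringTable {

    // table[r][c] is the length of the common run that ends at
    // rowText[r] and colText[c]; 0 means the two characters differ.
    static int[][] buildTable(String colText, String rowText) {
        int[][] table = new int[rowText.length()][colText.length()];
        for (int r = 0; r < rowText.length(); r++) {
            for (int c = 0; c < colText.length(); c++) {
                if (rowText.charAt(r) == colText.charAt(c)) {
                    // Upper-left neighbor plus 1; the border has no neighbor.
                    table[r][c] = (r == 0 || c == 0) ? 1 : table[r - 1][c - 1] + 1;
                }
            }
        }
        return table;
    }

    // The largest cell gives the length; its column gives the end position in colText.
    static String longestCommonSubstring(String colText, String rowText) {
        int[][] table = buildTable(colText, rowText);
        int best = 0;
        int endCol = -1;
        for (int r = 0; r < table.length; r++) {
            for (int c = 0; c < table[r].length; c++) {
                if (table[r][c] > best) {
                    best = table[r][c];
                    endCol = c;
                }
            }
        }
        return best == 0 ? "" : colText.substring(endCol - best + 1, endCol + 1);
    }

    public static void main(String[] args) {
        // Columns are "bab" and rows are "caba", as in the matrix above.
        System.out.println(longestCommonSubstring("bab", "caba"));
    }
}
```

Both time and memory are proportional to the product of the two string lengths in this form.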

While this matrix is being built, a row is no longer needed once the row after it has been computed, so the program can replace the matrix with a one-dimensional array.
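The one-dimensional idea on its own, stripped of any bookkeeping, might look like the sketch below. The class RollingRowSubstring and its method are hypothetical names, and only the length is computed. The key assumption is that each row is processed from right to left, so that the slot to the left still holds the previous row's value when it is read.

```java
final class RollingRowSubstring {

    static int longestCommonSubstringLength(String a, String b) {
        // run[j]: length of the common run ending at a[j] and the current character of b.
        int[] run = new int[a.length()];
        int best = 0;
        for (int i = 0; i < b.length(); i++) {
            // Right to left: run[j - 1] has not been overwritten yet,
            // so it still belongs to the previous row (the upper-left cell).
            for (int j = a.length() - 1; j >= 0; j--) {
                if (b.charAt(i) == a.charAt(j)) {
                    run[j] = (j == 0) ? 1 : run[j - 1] + 1;
                } else {
                    run[j] = 0;
                }
                best = Math.max(best, run[j]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestCommonSubstringLength("bab", "caba"));
    }
}
```

Memory drops from the product of the two lengths to the length of one string, while the running time is unchanged.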

2.1 The code is as follows:

public class LCString2 {

    public static void getLCString(char[] str1, char[] str2) {
        int i, j;
        int len1 = str1.length;
        int len2 = str2.length;
        int maxLen = len1 > len2 ? len1 : len2;
        int[] max = new int[maxLen];      // lengths of the longest common substrings found so far
        int[] maxIndex = new int[maxLen]; // end position in str1 of each of those substrings
        int[] c = new int[maxLen];        // number of equal characters along the current diagonal

        for (i = 0; i < len2; i++) {
            // Scan right to left so that c[j - 1] still holds the previous row.
            for (j = len1 - 1; j >= 0; j--) {
                if (str2[i] == str1[j]) {
                    if ((i == 0) || (j == 0))
                        c[j] = 1;
                    else
                        c[j] = c[j - 1] + 1;
                } else {
                    c[j] = 0;
                }

                if (c[j] > max[0]) {
                    // Strictly longer: for now this is the only longest one,
                    // so clear every entry after the first.
                    max[0] = c[j];   // maximum diagonal value, used later as the substring length
                    maxIndex[0] = j; // position of that maximum
                    for (int k = 1; k < maxLen; k++) {
                        max[k] = 0;
                        maxIndex[k] = 0;
                    }
                } else if (c[j] == max[0]) {
                    // Several substrings share the same maximum length.
                    for (int k = 1; k < maxLen; k++) {
                        if (max[k] == 0) {
                            max[k] = c[j];
                            maxIndex[k] = j;
                            break; // one entry appended, so leave the loop
                        }
                    }
                }
            }
        }

        for (j = 0; j < maxLen; j++) {
            if (max[j] > 0) {
                System.out.println("Common substring " + (j + 1) + ":");
                for (i = maxIndex[j] - max[j] + 1; i <= maxIndex[j]; i++)
                    System.out.print(str1[i]);
                System.out.println(" ");
            }
        }
    }

    public static void main(String[] args) {
        String str1 = "123456abcd567";
        String str2 = "234dddabc45678";
        // String str1 = "aab12345678cde";
        // String str2 = "ab1234yb1234567";
        getLCString(str1.toCharArray(), str2.toCharArray());
    }
}

  

Ref

A Java algorithm for the longest common substring that allows for several longest common substrings of the same length

http://blog.csdn.net/rabbitbug/article/details/1740557

Maximum subsequence, longest increasing subsequence, longest common substring, longest common subsequence, string edit distance

http://www.cnblogs.com/zhangchaoyang/articles/2012070.html

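2.2 In fact, this is also easy to write in awk (the two strings go on separate input lines, and the empty FS relies on an awk such as gawk that then splits each line into single characters):

echo "123456abcd567
234dddabc45678" | awk -v FS="" 'NR==1{str=$0} NR==2{N=NF; for(n=0;n++<N;){s=""; for(t=n;t<=N;t++){s=s""$t; if(index(str,s)){a[n]=t-n; b[n]=s; if(m<=a[n]) m=a[n]} else{t=N}}}} END{for(n=0;n++<N;) if(a[n]==m) print b[n]}'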

Ref: http://bbs.chinaunix.net/thread-4055834-2-1.html

3. The longest common subsequence

import java.util.Random;

public class LCS {

    public static void main(String[] args) {
        // Randomly generated strings:
        // String x = getRandomStrings(substringLength1);
        // String y = getRandomStrings(substringLength2);
        String x = "a1b2c3";
        String y = "1a1wbz2c123a1b2c123";

        // String lengths (set them yourself when using random strings)
        int substringLength1 = x.length();
        int substringLength2 = y.length();

        // opt[i][j] records the LCS length of the suffixes x[i..] and y[j..]
        int[][] opt = new int[substringLength1 + 1][substringLength2 + 1];

        // Solve all subproblems by dynamic programming, from back to front.
        // Going from front to back works as well.
        for (int i = substringLength1 - 1; i >= 0; i--) {
            for (int j = substringLength2 - 1; j >= 0; j--) {
                if (x.charAt(i) == y.charAt(j))
                    opt[i][j] = opt[i + 1][j + 1] + 1; // state transition equation
                else
                    opt[i][j] = Math.max(opt[i + 1][j], opt[i][j + 1]); // state transition equation
            }
        }

        System.out.println("substring1:" + x);
        System.out.println("substring2:" + y);
        System.out.print("LCS:");

        // Walk the table from the top-left corner and print each matched character.
        int i = 0, j = 0;
        while (i < substringLength1 && j < substringLength2) {
            if (x.charAt(i) == y.charAt(j)) {
                System.out.print(x.charAt(i));
                i++;
                j++;
            } else if (opt[i + 1][j] >= opt[i][j + 1])
                i++;
            else
                j++;
        }
    }

    // Returns a random lowercase string of the given length.
    public static String getRandomStrings(int length) {
        StringBuffer buffer = new StringBuffer("abcdefghijklmnopqrstuvwxyz");
        StringBuffer sb = new StringBuffer();
        Random r = new Random();
        int range = buffer.length();
        for (int i = 0; i < length; i++) {
            sb.append(buffer.charAt(r.nextInt(range)));
        }
        return sb.toString();
    }
}
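The comment in the code above notes that the table can also be filled from front to back. A minimal sketch of that variant follows, assuming a hypothetical class LcsForward whose table entry len[i][j] holds the LCS length of the first i characters of x and the first j characters of y. The subsequence is then recovered by walking back from the bottom-right corner, so it is collected in reverse.

```java
final class LcsForward {

    static String lcs(String x, String y) {
        int m = x.length();
        int n = y.length();
        // Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
        int[][] len = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (x.charAt(i - 1) == y.charAt(j - 1)) {
                    len[i][j] = len[i - 1][j - 1] + 1;
                } else {
                    len[i][j] = Math.max(len[i - 1][j], len[i][j - 1]);
                }
            }
        }

        // Walk back from the bottom-right corner, collecting matched characters.
        StringBuilder reversed = new StringBuilder();
        int i = m;
        int j = n;
        while (i > 0 && j > 0) {
            if (x.charAt(i - 1) == y.charAt(j - 1)) {
                reversed.append(x.charAt(i - 1));
                i--;
                j--;
            } else if (len[i - 1][j] >= len[i][j - 1]) {
                i--;
            } else {
                j--;
            }
        }
        return reversed.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(lcs("a1b2c3", "1a1wbz2c123a1b2c123"));
    }
}
```

Returning the string instead of printing it keeps the sketch easy to reuse. When several subsequences share the maximum length, which one is returned depends on the tie-breaking rule in the walk back.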

  

Ref

The longest common subsequence and longest common substring problems for strings

http://gongqi.iteye.com/blog/1517447

Dynamic programming algorithm for solving the longest common subsequence LCS problem

http://blog.csdn.net/v_JULY_v/article/details/6110269
