Java implementation of a string matching problem: finding the longest common substrings of two strings

Source: Internet
Author: User

When reprinting, please cite the source: http://blog.csdn.net/xiaojimanman/article/details/38924981

Recently a project at work required text comparison. After studying the problem for a while, I summarized what I learned in this post: finding the longest common substring of two strings.

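Algorithm idea: compute the common substrings of the two strings from a match matrix, as illustrated in the figure. The detailed steps are as follows:

[Figure: A shows the two strings laid out along the rows and columns of a two-dimensional array; B shows the same array filled with 0/1 match values.]

Input string s1: achmacmh; input string s2: macham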

1) As shown in A, split the strings s1 and s2 into characters and lay them out along the two dimensions of a two-dimensional array: s1 indexes the rows and s2 indexes the columns;

2) Fill in the array as shown in B. For example, the value in the first row and first column records whether the first character of s1 equals the first character of s2: it is 1 if they are equal and 0 otherwise. Filling every cell this way produces the array shown in B;

3) Find the common substrings along each diagonal of the array. Moving along a diagonal means stepping to the lower-right neighbor, that is, the element after a[i][j] is a[i+1][j+1]; a common substring is a run of consecutive 1s along a diagonal;

4) Sort all the common substrings by length and return the longest one.
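
Steps 1) to 3) can be illustrated with a small sketch. It is not the author's code: the class name MatchMatrixDemo and the helpers buildMatrix and diagonalRuns are hypothetical names chosen for illustration, and the inputs are assumed to be the lowercase strings achmacmh and macham.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative assumptions.
class MatchMatrixDemo {

    // Steps 1-2: cell [i][j] is true when s1.charAt(i) equals s2.charAt(j).
    static boolean[][] buildMatrix(String s1, String s2) {
        boolean[][] m = new boolean[s1.length()][s2.length()];
        for (int i = 0; i < s1.length(); i++) {
            for (int j = 0; j < s2.length(); j++) {
                m[i][j] = s1.charAt(i) == s2.charAt(j);
            }
        }
        return m;
    }

    // Step 3: walk one diagonal starting at (i, j) and collect every run of
    // true cells as the substring of s2 that it corresponds to.
    static List<String> diagonalRuns(boolean[][] m, String s2, int i, int j) {
        List<String> runs = new ArrayList<>();
        int start = j; // column where the current run begins
        for (; i < m.length && j < s2.length(); i++, j++) {
            if (!m[i][j]) {
                if (j > start) {
                    runs.add(s2.substring(start, j));
                }
                start = j + 1;
            }
        }
        if (j > start) { // a run that reaches the edge of the matrix
            runs.add(s2.substring(start, j));
        }
        return runs;
    }

    public static void main(String[] args) {
        String s1 = "achmacmh";
        String s2 = "macham";
        boolean[][] m = buildMatrix(s1, s2);
        // Print the 0/1 matrix, one row per character of s1.
        for (boolean[] row : m) {
            StringBuilder line = new StringBuilder();
            for (boolean cell : row) {
                line.append(cell ? '1' : '0').append(' ');
            }
            System.out.println(line.toString().trim());
        }
        System.out.println(diagonalRuns(m, s2, 3, 0)); // [mac]
        System.out.println(diagonalRuns(m, s2, 0, 1)); // [ach]
    }
}
```

The two diagonals printed at the end are the ones that contain the longest runs for this input; step 4) then amounts to comparing run lengths across all diagonals.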


The detailed implementation is as follows:

package cn.lulei.compare;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class StringCompare {
    private int a;
    private int b;

    public String getMaxLengthCommonString(String s1, String s2) {
        if (s1 == null || s2 == null) {
            return null;
        }
        a = s1.length(); // length of s1: number of rows
        b = s2.length(); // length of s2: number of columns
        if (a == 0 || b == 0) {
            return "";
        }
        // Build the match matrix
        boolean[][] array = new boolean[a][b];
        for (int i = 0; i < a; i++) {
            char c1 = s1.charAt(i);
            for (int j = 0; j < b; j++) {
                char c2 = s2.charAt(j);
                if (c1 == c2) {
                    array[i][j] = true;
                } else {
                    array[i][j] = false;
                }
            }
        }
        // Collect all common substrings; each one is stored as its start
        // position in the second string plus its length
        List<ChildString> childStrings = new ArrayList<ChildString>();
        for (int i = 0; i < a; i++) {
            getMaxSort(i, 0, array, childStrings);
        }
        for (int i = 1; i < b; i++) {
            getMaxSort(0, i, array, childStrings);
        }
        // Sort
        sort(childStrings);
        // Nothing recorded, or only zero-length runs: no common substring
        if (childStrings.size() < 1 || childStrings.get(0).maxLength < 1) {
            return "";
        }
        // Return the longest common substring(s), one per line
        int max = childStrings.get(0).maxLength;
        StringBuffer sb = new StringBuffer();
        for (ChildString s : childStrings) {
            if (max != s.maxLength) {
                break;
            }
            sb.append(s2.substring(s.maxStart, s.maxStart + s.maxLength));
            sb.append("\n");
        }
        return sb.toString();
    }

    // Sort in descending order of length
    private void sort(List<ChildString> list) {
        Collections.sort(list, new Comparator<ChildString>() {
            public int compare(ChildString o1, ChildString o2) {
                return o2.maxLength - o1.maxLength;
            }
        });
    }

    // Collect the common substrings on one diagonal
    private void getMaxSort(int i, int j, boolean[][] array, List<ChildString> sortBean) {
        int length = 0;
        int start = j;
        for (; i < a && j < b; i++, j++) {
            if (array[i][j]) {
                length++;
            } else {
                sortBean.add(new ChildString(length, start));
                length = 0;
                start = j + 1;
            }
            if (i == a - 1 || j == b - 1) {
                sortBean.add(new ChildString(length, start));
            }
        }
    }

    // Common substring class
    class ChildString {
        int maxLength;
        int maxStart;

        ChildString(int maxLength, int maxStart) {
            this.maxLength = maxLength;
            this.maxStart = maxStart;
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        System.out.println(new StringCompare().getMaxLengthCommonString("achmacmh", "macham"));
    }
}

The program's output when run: [output screenshot not preserved]


For comparing more than two files, I believe the same algorithmic idea can be applied (I have not implemented it yet); I will write about it in a future blog post.


The implementation above stores every common substring in a list and then sorts the list to find the longest ones. If only the longest substring is needed, this is wasteful. The version below therefore changes the list so that it holds only the longest substrings found so far. The detailed implementation is as follows:

/**
 * @Description: string comparison
 */
package com.lulei.test;

import java.util.ArrayList;
import java.util.List;

public class StringCompare {
    private int a;
    private int b;
    private int maxLength = -1;

    public String getMaxLengthCommonString(String s1, String s2) {
        if (s1 == null || s2 == null) {
            return null;
        }
        a = s1.length(); // length of s1: number of rows
        b = s2.length(); // length of s2: number of columns
        if (a == 0 || b == 0) {
            return "";
        }
        maxLength = -1; // reset so the same instance can be reused across calls
        // Build the match matrix
        boolean[][] array = new boolean[a][b];
        for (int i = 0; i < a; i++) {
            char c1 = s1.charAt(i);
            for (int j = 0; j < b; j++) {
                char c2 = s2.charAt(j);
                if (c1 == c2) {
                    array[i][j] = true;
                } else {
                    array[i][j] = false;
                }
            }
        }
        // Collect the longest common substrings; each one is stored as its
        // start position in the second string plus its length
        List<ChildString> childStrings = new ArrayList<ChildString>();
        for (int i = 0; i < a; i++) {
            getMaxSort(i, 0, array, childStrings);
        }
        for (int i = 1; i < b; i++) {
            getMaxSort(0, i, array, childStrings);
        }
        // Only zero-length runs were found: no common substring
        if (maxLength < 1) {
            return "";
        }
        StringBuffer sb = new StringBuffer();
        for (ChildString s : childStrings) {
            sb.append(s2.substring(s.maxStart, s.maxStart + s.maxLength));
            sb.append("\n");
        }
        return sb.toString();
    }

    // Collect the common substrings on one diagonal
    private void getMaxSort(int i, int j, boolean[][] array, List<ChildString> sortBean) {
        int length = 0;
        int start = j;
        for (; i < a && j < b; i++, j++) {
            if (array[i][j]) {
                length++;
            } else {
                // Adding directly would save every substring; the checks
                // below keep only the current longest substrings
                // sortBean.add(new ChildString(length, start));
                if (length == maxLength) {
                    sortBean.add(new ChildString(length, start));
                } else if (length > maxLength) {
                    sortBean.clear();
                    maxLength = length;
                    sortBean.add(new ChildString(length, start));
                }
                length = 0;
                start = j + 1;
            }
            if (i == a - 1 || j == b - 1) {
                // Adding directly would save every substring; the checks
                // below keep only the current longest substrings
                // sortBean.add(new ChildString(length, start));
                if (length == maxLength) {
                    sortBean.add(new ChildString(length, start));
                } else if (length > maxLength) {
                    sortBean.clear();
                    maxLength = length;
                    sortBean.add(new ChildString(length, start));
                }
            }
        }
    }

    // Common substring class
    class ChildString {
        int maxLength;
        int maxStart;

        ChildString(int maxLength, int maxStart) {
            this.maxLength = maxLength;
            this.maxStart = maxStart;
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        System.out.println(new StringCompare().getMaxLengthCommonString("abcdef", "defabc"));
    }
}
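
One possible further refinement, sketched here as an assumption rather than as part of the original post: each matrix cell is just the comparison s1.charAt(i) == s2.charAt(j), so the diagonals can be scanned without allocating the a × b boolean array at all. The class name DiagonalScanSketch and the method longestCommonSubstrings are hypothetical. Unlike the code above, this sketch returns a list instead of a newline-joined string, and it returns an empty list when nothing matches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the original author's class.
class DiagonalScanSketch {

    static List<String> longestCommonSubstrings(String s1, String s2) {
        List<String> best = new ArrayList<>();
        if (s1 == null || s2 == null) {
            return best;
        }
        int a = s1.length();
        int b = s2.length();
        int maxLength = 0;
        // Diagonal d = j - i runs from -(a - 1) to b - 1; each value of d
        // is one diagonal of the (never materialized) match matrix.
        for (int d = -(a - 1); d < b; d++) {
            int i = Math.max(0, -d);
            int j = Math.max(0, d);
            int length = 0; // length of the run of matches ending at (i, j)
            for (; i < a && j < b; i++, j++) {
                if (s1.charAt(i) == s2.charAt(j)) {
                    length++;
                    if (length > maxLength) {
                        // A strictly longer run replaces everything kept so far.
                        maxLength = length;
                        best.clear();
                    }
                    if (length == maxLength) {
                        best.add(s2.substring(j + 1 - length, j + 1));
                    }
                } else {
                    length = 0;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestCommonSubstrings("abcdef", "defabc")); // [def, abc]
    }
}
```

The running time is still proportional to a × b, since every pair of positions is compared once; only the extra memory for the matrix is saved. As with the code above, the same substring can appear more than once in the result if it occurs on several diagonals.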

