Java implementation of a string matching problem: finding the longest common substring

Source: Internet
Author: User

When reprinting, please credit the source: http://blog.csdn.net/xiaojimanman/article/details/38924981

A recent project at work required text comparison. After studying the topic for a while, I summarized what I learned in this post: how to find the longest common substring of two strings.

Algorithm idea: compute the common substrings of two strings from a comparison matrix. The steps below refer to panels A and B of the figure in the original post (not preserved here), using the following input:

Input string s1: achmacmh; input string s2: macham

1) As in panel A, split s1 and s2 into individual characters and lay them out along the two axes of a two-dimensional array (s1 along the rows, s2 along the columns);

2) Fill in the array as in panel B: each cell records whether the corresponding characters of s1 and s2 are equal. For example, the cell in the first row and first column records whether the first character of s1 equals the first character of s2: 1 if they are equal, otherwise 0;

3) Find the common substrings along each diagonal of the array. A diagonal runs toward the lower right, i.e. the element after a[i][j] is a[i+1][j+1], and a common substring is a run of consecutive 1s on a diagonal;

4) Sort all common substrings by length and return the longest one(s).
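Since the figure is not available, steps 1) to 3) can be made concrete with a small sketch. It only builds the 0/1 matrix and measures the longest run of 1s on one diagonal; the class and method names (MatchMatrixSketch, buildMatrix, render, longestRunOnDiagonal) are made up for illustration, and the lowercase inputs are an assumption about the original example.

```java
public class MatchMatrixSketch {

    /** Steps 1-2: rows follow s1, columns follow s2; a cell is 1 when the characters match. */
    static int[][] buildMatrix(String s1, String s2) {
        int[][] m = new int[s1.length()][s2.length()];
        for (int i = 0; i < s1.length(); i++) {
            for (int j = 0; j < s2.length(); j++) {
                m[i][j] = (s1.charAt(i) == s2.charAt(j)) ? 1 : 0;
            }
        }
        return m;
    }

    /** Renders the matrix as text, one row per line, for inspection. */
    static String render(int[][] m) {
        StringBuilder sb = new StringBuilder();
        for (int[] row : m) {
            for (int cell : row) {
                sb.append(cell).append(' ');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Step 3: length of the longest run of 1s on the diagonal that starts at (i, j). */
    static int longestRunOnDiagonal(int[][] m, int i, int j) {
        int best = 0;
        int run = 0;
        for (; i < m.length && j < m[0].length; i++, j++) {
            run = (m[i][j] == 1) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] m = buildMatrix("achmacmh", "macham");
        System.out.print(render(m));
        // The diagonal starting at row 3, column 0 covers "mac" in both strings.
        System.out.println(longestRunOnDiagonal(m, 3, 0)); // prints 3
    }
}
```

Step 4) then amounts to collecting such runs from every diagonal (each one starts in the first row or the first column) and keeping the longest.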


The specific implementation code is as follows:

    package cn.lulei.compare;

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    public class StringCompare {
        private int a; // length of s1 (number of rows)
        private int b; // length of s2 (number of columns)

        public String getMaxLengthCommonString(String s1, String s2) {
            if (s1 == null || s2 == null) {
                return null;
            }
            a = s1.length(); // s1 length is the row count
            b = s2.length(); // s2 length is the column count
            if (a == 0 || b == 0) {
                return "";
            }
            // Build the matching matrix
            boolean[][] array = new boolean[a][b];
            for (int i = 0; i < a; i++) {
                char c1 = s1.charAt(i);
                for (int j = 0; j < b; j++) {
                    char c2 = s2.charAt(j);
                    array[i][j] = (c1 == c2);
                }
            }
            // Collect all common substrings, stored as start position and
            // length relative to the second string
            List<ChildString> childStrings = new ArrayList<ChildString>();
            // Diagonals that start in the first column
            for (int i = 0; i < a; i++) {
                getMaxSort(i, 0, array, childStrings);
            }
            // Diagonals that start in the first row (column 0 is already covered)
            for (int i = 1; i < b; i++) {
                getMaxSort(0, i, array, childStrings);
            }
            // Sort by length, longest first
            sort(childStrings);
            if (childStrings.size() < 1) {
                return "";
            }
            // Return every common substring of maximum length, one per line
            int max = childStrings.get(0).maxLength;
            if (max == 0) {
                // No characters in common: avoid emitting empty lines
                return "";
            }
            StringBuilder sb = new StringBuilder();
            for (ChildString s : childStrings) {
                if (max != s.maxLength) {
                    break;
                }
                sb.append(s2.substring(s.maxStart, s.maxStart + s.maxLength));
                sb.append("\n");
            }
            return sb.toString();
        }

        // Sort in descending order of length
        private void sort(List<ChildString> list) {
            Collections.sort(list, new Comparator<ChildString>() {
                public int compare(ChildString o1, ChildString o2) {
                    return o2.maxLength - o1.maxLength;
                }
            });
        }

        // Collect the common substrings on the diagonal starting at (i, j)
        private void getMaxSort(int i, int j, boolean[][] array, List<ChildString> sortBean) {
            int length = 0;
            int start = j;
            for (; i < a && j < b; i++, j++) {
                if (array[i][j]) {
                    length++;
                } else {
                    // The current run ended: record it and start a new one
                    sortBean.add(new ChildString(length, start));
                    length = 0;
                    start = j + 1;
                }
                // Last cell of the diagonal: record the run still in progress
                if (i == a - 1 || j == b - 1) {
                    sortBean.add(new ChildString(length, start));
                }
            }
        }

        // A common substring: its length and its start position in s2
        class ChildString {
            int maxLength;
            int maxStart;

            ChildString(int maxLength, int maxStart) {
                this.maxLength = maxLength;
                this.maxStart = maxStart;
            }
        }

        /**
         * @param args
         */
        public static void main(String[] args) {
            System.out.println(new StringCompare().getMaxLengthCommonString("achmacmh", "macham"));
        }
    }

The final output of the program was shown as a screenshot in the original post, which is not preserved here.

For comparing more than two files, I think the same algorithmic idea can be applied (I have not implemented this yet); I will write about it in a future post.
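For larger inputs such as whole files, storing the full boolean matrix may become costly. One common alternative, which is not the method used in this post, is the standard dynamic-programming formulation: keep only the run lengths of the previous row, so memory grows with the length of s2 rather than with the product of both lengths. The sketch below is a minimal version of that idea; the class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class LongestCommonSubstringDp {

    /** Returns all distinct longest common substrings of s1 and s2 (empty set if none). */
    static Set<String> longestCommonSubstrings(String s1, String s2) {
        Set<String> result = new LinkedHashSet<>();
        if (s1 == null || s2 == null || s1.isEmpty() || s2.isEmpty()) {
            return result;
        }
        // prev[j] = length of the common run ending at s1[i-2] and s2[j-1]
        int[] prev = new int[s2.length() + 1];
        int best = 0;
        for (int i = 1; i <= s1.length(); i++) {
            int[] cur = new int[s2.length() + 1];
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    // Extend the run coming from the upper-left neighbour (same diagonal)
                    cur[j] = prev[j - 1] + 1;
                    if (cur[j] > best) {
                        best = cur[j];
                        result.clear(); // shorter candidates are no longer maximal
                    }
                    if (cur[j] == best) {
                        result.add(s2.substring(j - best, j));
                    }
                }
            }
            prev = cur;
        }
        return result;
    }

    public static void main(String[] args) {
        // For the example strings this set contains "ach" and "mac"
        System.out.println(longestCommonSubstrings("achmacmh", "macham"));
    }
}
```

Using a set also removes duplicates, which the list-based version above does not do when the same substring appears on several diagonals.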
