A daily walkthrough of classic algorithm questions: question five, string similarity

Source: Internet
Author: User


Let's look at a variation on the longest common subsequence problem: measuring string similarity through edit distance. As I said before, this is a very practical algorithm, useful in areas such as DNA comparison and web page clustering.

One: Concept

Given two strings A and B, we can turn A into B (or B into A) through basic single-character operations: insertion, deletion, and replacement. The smallest number of steps needed for this transformation is called the "edit distance".

For example, take the following pair of strings: after a series of operations we arrive at an edit distance of 3. Can you see how?

Two: Analysis

This may seem complicated and hard to grasp, so let's split the big problem, "string" vs "string", into "character" vs "string", and then break that down further into "character" vs "character".

<1> "character" vs "character"

This is the simplest case. For example, the edit distance between "A" and "B" is obviously 1.

<2> "character" vs "string"

The edit distance from "A" to "AB" is 1, and the edit distance between "A" and "ABA" is 2.

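The "character" vs "string" case has a simple closed form, which the following minimal sketch spells out. It is only an illustration and not part of the article's own code: the class name CharVsString and the method Distance are made up here, and it assumes the allowed operations are insertion, deletion, and replacement.

```csharp
using System;

public static class CharVsString
{
    // Edit distance between a single character c and a string s.
    // If s contains c, keep that one matching character and insert
    // the other s.Length - 1 characters. If it does not, replace c
    // with one character of s and insert the rest: s.Length steps.
    // For an empty s the only option is to delete c, which costs 1.
    public static int Distance(char c, string s)
    {
        if (s.Length == 0)
        {
            return 1;
        }
        return s.IndexOf(c) >= 0 ? s.Length - 1 : s.Length;
    }
}
```

Calling CharVsString.Distance('A', "AB") and CharVsString.Distance('A', "ABA") reproduces the two values quoted above, 1 and 2.
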
<3> "string" vs "string"

The edit distance between "ABA" and "BBA" is 1. Looking closely, we can conclude that the distance for "ABA" is the minimum taken from the set of edit distances between its 2^3 subsequences and the string "BBA". This means repeated computation appears: to find the edit distance between the subsequence "AB" and "BBA", I choose a minimum among the edit distances between the subsequence "A" and "BBA" and between "B" and "BBA", yet I already computed sequence "A" and sequence "B" earlier. This repetition is a bit like "Fibonacci", and it satisfies exactly the two requirements of "dynamic programming", optimal substructure and overlapping subproblems, so we decide to solve the problem with dynamic programming.
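To make the overlapping subproblems visible, here is a rough sketch that runs the same prefix recursion twice: once without caching and once with a cache. The class OverlapDemo, its counters, and its method names are hypothetical names introduced for this illustration; the article itself goes straight to the table-based solution.

```csharp
using System;

public static class OverlapDemo
{
    // How many times the uncached recursion was entered.
    public static int NaiveCalls;

    // How many distinct (i, j) subproblems the cached version solved.
    public static int MemoSolved;

    // Uncached recursion on the prefixes a[0..i) and b[0..j).
    static int Naive(string a, string b, int i, int j)
    {
        NaiveCalls++;
        if (i == 0) return j; // only insertions remain
        if (j == 0) return i; // only deletions remain
        if (a[i - 1] == b[j - 1]) return Naive(a, b, i - 1, j - 1);

        int replace = Naive(a, b, i - 1, j - 1);
        int delete = Naive(a, b, i - 1, j);
        int insert = Naive(a, b, i, j - 1);
        return Math.Min(replace, Math.Min(delete, insert)) + 1;
    }

    // Same recursion, but each (i, j) result is stored after it is first computed.
    static int Memo(string a, string b, int i, int j, int?[,] cache)
    {
        if (cache[i, j].HasValue) return cache[i, j].Value;
        MemoSolved++;

        int result;
        if (i == 0) result = j;
        else if (j == 0) result = i;
        else if (a[i - 1] == b[j - 1]) result = Memo(a, b, i - 1, j - 1, cache);
        else
        {
            int replace = Memo(a, b, i - 1, j - 1, cache);
            int delete = Memo(a, b, i - 1, j, cache);
            int insert = Memo(a, b, i, j - 1, cache);
            result = Math.Min(replace, Math.Min(delete, insert)) + 1;
        }
        cache[i, j] = result;
        return result;
    }

    // Resets the counter and runs the uncached recursion on the full strings.
    public static int NaiveDistance(string a, string b)
    {
        NaiveCalls = 0;
        return Naive(a, b, a.Length, b.Length);
    }

    // Resets the counter and runs the cached recursion on the full strings.
    public static int MemoDistance(string a, string b)
    {
        MemoSolved = 0;
        return Memo(a, b, a.Length, b.Length, new int?[a.Length + 1, b.Length + 1]);
    }
}
```

After calling both NaiveDistance and MemoDistance on the same pair of strings, comparing NaiveCalls with MemoSolved shows how much work the uncached version repeats: the cached version never solves more than (a.Length + 1) * (b.Length + 1) subproblems.
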

Three: Formula

As with the longest common subsequence, we use a two-dimensional array to hold the minimum edit distance for the current positions of the strings X and Y.

Given two sequences X={x1,x2,x3,...,xi}, Y={y1,y2,y3,...,yj},

define c[i,j]: the minimum LD (Levenshtein distance) between the first i characters of X and the first j characters of Y.

①: when xi = yj, then c[i,j]=c[i-1,j-1];

②: when xi != yj, then c[i,j]=min{c[i-1,j-1],c[i-1,j],c[i,j-1]}+1;

The boundary values are c[i,0]=i and c[0,j]=j.

In the end, c[i,j] holds the minimum LD.
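Written out in full, the recurrence can be stated as below. This is a restatement rather than a new result: the two boundary rows and the +1 in the unequal case are taken from what the code in section Four does. The table that follows applies it by hand to "ABA" (rows) and "BBA" (columns), the pair from the analysis section; the symbol ε for the empty prefix is notation introduced here.

```latex
\[
c[i,j] =
\begin{cases}
i & \text{if } j = 0 \\
j & \text{if } i = 0 \\
c[i-1,j-1] & \text{if } x_i = y_j \\
\min\{\, c[i-1,j-1],\ c[i-1,j],\ c[i,j-1] \,\} + 1 & \text{if } x_i \neq y_j
\end{cases}
\]

\[
\begin{array}{c|cccc}
  & \varepsilon & B & B & A \\ \hline
\varepsilon & 0 & 1 & 2 & 3 \\
A & 1 & 1 & 2 & 2 \\
B & 2 & 1 & 1 & 2 \\
A & 3 & 2 & 2 & 1
\end{array}
\]
```

The bottom-right entry is 1, which agrees with the edit distance between "ABA" and "BBA" stated earlier.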

Four: Code

using System;

namespace ConsoleApplication2
{
    public class Program
    {
        static int[,] martix;

        static string str1 = string.Empty;

        static string str2 = string.Empty;

        static void Main(string[] args)
        {
            while (true)
            {
                str1 = Console.ReadLine();

                str2 = Console.ReadLine();

                // Console.ReadLine returns null at end of input; stop instead of failing
                if (str1 == null || str2 == null) break;

                martix = new int[str1.Length + 1, str2.Length + 1];

                Console.WriteLine("The edit distance between the strings {0} and {1} is: {2}\n", str1, str2, LD());
            }
        }

        /// <summary>
        /// Calculates the edit distance between str1 and str2
        /// </summary>
        /// <returns></returns>
        public static int LD()
        {
            // Initialize the boundary values (so the main loops can skip the boundary cases)
            for (int i = 0; i <= str1.Length; i++)
            {
                martix[i, 0] = i;
            }

            for (int j = 0; j <= str2.Length; j++)
            {
                martix[0, j] = j;
            }

            // The X-coordinate of the matrix
            for (int i = 1; i <= str1.Length; i++)
            {
                // The Y-coordinate of the matrix
                for (int j = 1; j <= str2.Length; j++)
                {
                    // Equal characters: no extra step needed
                    if (str1[i - 1] == str2[j - 1])
                    {
                        martix[i, j] = martix[i - 1, j - 1];
                    }
                    else
                    {
                        // Take the minimum of the "upper-left", "top", and "left" cells
                        var temp1 = Math.Min(martix[i - 1, j], martix[i, j - 1]);

                        // Get the minimum value
                        var min = Math.Min(temp1, martix[i - 1, j - 1]);

                        martix[i, j] = min + 1;
                    }
                }
            }

            // Return the edit distance of the two strings
            return martix[str1.Length, str2.Length];
        }
    }
}
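Since each row of the table depends only on the row above it, the same recurrence can be computed with two rolling rows instead of the full matrix. The sketch below shows one possible way to do that as a function that takes its inputs as parameters. The class EditDistance and both method names are hypothetical, and the Similarity formula (one minus the distance divided by the longer length) is a common convention assumed here; the article does not define a similarity score.

```csharp
using System;

public static class EditDistance
{
    // Edit distance computed with two rolling rows instead of the full matrix.
    public static int Distance(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];

        // Row 0: turning the empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.Length; j++)
        {
            prev[j] = j;
        }

        for (int i = 1; i <= a.Length; i++)
        {
            // Column 0: turning a[0..i) into the empty string takes i deletions.
            curr[0] = i;

            for (int j = 1; j <= b.Length; j++)
            {
                if (a[i - 1] == b[j - 1])
                {
                    curr[j] = prev[j - 1];
                }
                else
                {
                    // Upper-left (replace), top (delete), left (insert), plus one step.
                    curr[j] = Math.Min(prev[j - 1], Math.Min(prev[j], curr[j - 1])) + 1;
                }
            }

            // The row just filled becomes the previous row for the next pass.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }

        return prev[b.Length];
    }

    // Assumed convention: 1.0 means identical, 0.0 means nothing in common.
    public static double Similarity(string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0)
        {
            return 1.0;
        }
        return 1.0 - (double)Distance(a, b) / longest;
    }
}
```

For "ABA" and "BBA", EditDistance.Distance returns 1, matching the value from the analysis section, so under the assumed convention the similarity is 1 - 1/3.
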
