The minimum edit distance problem in dynamic programming

Source: Internet
Author: User

First, a light digression. This morning a classmate handed me a set of exam questions to work through: his girlfriend's computer-based programming test. The first question asked for the closest pair of points in a point set. No data size was given, so a straightforward brute-force solution should be enough (there are marks for each step: 5 points for reading the input correctly, 5 points for this, 5 points for that), or one could use the divide-and-conquer algorithm (see The Beauty of Programming). The second question was string replacement. The third asked for the area of a quadrilateral, which forced me to realize that Heron's formula cannot be used here, because the quadrilateral may be concave; a vector formula is needed. The questions are this simple, and this is BUPT. Can questions like these really distinguish ability? Perhaps graduate students really are not at a high level and only know how to study for exams.

Yesterday morning we went into the city, and my wife bought me some grass seeds. I soaked them today, and I hope they sprout quickly.

At dinner last night, my undergraduate classmates were surprised by my weight. We found a clinic and went straight in to weigh me, and it really was 70 kg. As an undergraduate I never reached 60 kg.

I. Source of the problem

While reading a book on cloud computing, I came across the problem of deduplicating microblog (Weibo) posts. Because the data volume is large, the posts need to be hashed and then processed with MapReduce. So how should they be hashed so that similar posts land in the same bucket as often as possible? Which hash should be used? The author proposes a locality-sensitive hashing algorithm, whose distance function can be the edit distance. Edit distance is also called Levenshtein distance, or LD. As I wrote this, it suddenly occurred to me: isn't "ED" edit distance? I had always assumed it meant Euclidean distance. Haha, as the saying goes, "Great writing is nature's own work; a skilled hand merely chances upon it."

I looked it up. Edit distance is widely used in NLP (natural language processing).

II. Problem analysis

The following material comes from Peking University, which really is a cradle of talent. I do not remember how I found it; I recall that it was a PDF link back then, and it still is. I have found that their material is more detailed, and the likely pitfalls are all flagged, both the ones a reader would think of and the ones a reader would not. This is both enlightening and cautionary. Why cautionary? Because when you suddenly discover your own misunderstanding, you stop being so complacent. The difference between Peking University and our school is like the difference between a high school teacher and a college teacher. This is only my humble opinion.

1. Introduction

Source: She's a star with the theatre company.

Machine translation: She is a star with the theatre company.

Reference translation: She is the star of the troupe.
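The sentences above read like a source sentence followed by two renderings of it. A minimal sketch of one way edit distance could compare two such renderings, assuming whitespace-separated words as the unit of editing and a cost of 1 for each insertion, deletion, and substitution. Both assumptions are mine and are not stated in the post, and the names `WordEditDistance`, `words`, and `distance` are hypothetical:

```java
final class WordEditDistance {
    // Split a sentence into whitespace-separated words; an empty sentence has no words.
    static String[] words(String sentence) {
        String trimmed = sentence.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Word-level edit distance with unit cost for insert, delete, and substitute.
    static int distance(String hypothesis, String reference) {
        String[] h = words(hypothesis);
        String[] r = words(reference);
        int[][] d = new int[h.length + 1][r.length + 1];
        for (int i = 0; i <= h.length; i++) d[i][0] = i; // delete i words
        for (int j = 0; j <= r.length; j++) d[0][j] = j; // insert j words
        for (int i = 1; i <= h.length; i++) {
            for (int j = 1; j <= r.length; j++) {
                int cost = h[i - 1].equals(r[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,             // match or substitute
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));   // delete, insert
            }
        }
        return d[h.length][r.length];
    }

    public static void main(String[] args) {
        System.out.println(distance(
                "She is a star with the theatre company",
                "She is the star of the troupe")); // prints 4
    }
}
```

Under these assumptions the two sentences are four word edits apart: "a" becomes "the", "with" becomes "of", "theatre" becomes "troupe", and "company" is deleted.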

2. Algorithm analysis

After reading the material, I remembered that with Dijkstra's algorithm the result is derived from back to front, while the program runs from front to back. I did not understand that at the time, but now it seems inevitable. Since D[i][j] = min{D[i-1][j-1] ...} and the result is D[m][n], how could D[m][n] be computed without first computing the entries before it? The program works by iteration, front to back, whereas reasoning from back to front resembles recursion, haha.
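The two directions described above can be sketched side by side. This is a minimal illustration, not the method from the referenced material: it assumes a cost of 1 for each insertion, deletion, and substitution, and the names `EditDistanceDirections`, `topDown`, and `bottomUp` are hypothetical. `topDown` starts at D[m][n] and recurses toward smaller prefixes, caching each entry; `bottomUp` fills the table from D[0][0] forward.

```java
import java.util.Arrays;

final class EditDistanceDirections {
    // Back to front: ask for D[i][j] and recurse on the entries it depends on.
    // memo[i][j] == -1 means "not computed yet".
    static int topDown(String a, String b, int i, int j, int[][] memo) {
        if (i == 0) return j; // insert the first j characters of b
        if (j == 0) return i; // delete the first i characters of a
        if (memo[i][j] >= 0) return memo[i][j];
        int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
        int best = Math.min(
                topDown(a, b, i - 1, j - 1, memo) + cost,     // match or substitute
                Math.min(topDown(a, b, i - 1, j, memo) + 1,   // delete
                        topDown(a, b, i, j - 1, memo) + 1));  // insert
        memo[i][j] = best;
        return best;
    }

    static int topDown(String a, String b) {
        int[][] memo = new int[a.length() + 1][b.length() + 1];
        for (int[] row : memo) Arrays.fill(row, -1);
        return topDown(a, b, a.length(), b.length(), memo);
    }

    // Front to back: every entry is filled after the three entries it depends on.
    static int bottomUp(String a, String b) {
        int m = a.length(), n = b.length();
        int[][] d = new int[m + 1][n + 1];
        for (int i = 0; i <= m; i++) d[i][0] = i;
        for (int j = 0; j <= n; j++) d[0][j] = j;
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
            }
        }
        return d[m][n];
    }

    public static void main(String[] args) {
        System.out.println(topDown("kitten", "sitting"));  // prints 3
        System.out.println(bottomUp("kitten", "sitting")); // prints 3
    }
}
```

Both methods evaluate the same recurrence and return the same value; they differ only in the order in which the table entries are visited.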

Reference: http://ccl.pku.edu.cn/doubtfire/Course/Computational%20Linguistics/contents/Minimum%20Edit%20Distance.pdf

III. Implementation of the algorithm

1. Java version

The code below is a direct translation of the algorithm above. Alternatively, writing a min(a, b, c) function gives the same implementation more compactly.

public class MinimumEditDistance {
    public static int minEditDistance(String dest, String src) {
        // f[i][j] is the edit distance between the first i characters
        // of dest and the first j characters of src.
        int[][] f = new int[dest.length() + 1][src.length() + 1];
        f[0][0] = 0;
        for (int i = 1; i < dest.length() + 1; i++) {
            f[i][0] = i;
        }
        for (int i = 1; i < src.length() + 1; i++) {
            f[0][i] = i;
        }
        for (int i = 1; i < dest.length() + 1; i++) {
            for (int j = 1; j < src.length() + 1; j++) {
                // Cost of the substitution: 0 if the characters match, otherwise 1.
                int cost = 0;
                if (dest.charAt(i - 1) != src.charAt(j - 1)) {
                    cost = 1;
                }
                // Cheaper of the two neighbouring cells, plus one edit.
                int minCost;
                if (f[i - 1][j] < f[i][j - 1]) {
                    minCost = f[i - 1][j] + 1;
                } else {
                    minCost = f[i][j - 1] + 1;
                }
                // Compare against substituting (or matching) the last characters.
                if (minCost > f[i - 1][j - 1] + cost) {
                    minCost = f[i - 1][j - 1] + cost;
                }
                f[i][j] = minCost;
            }
        }
        return f[dest.length()][src.length()];
    }

    public static void main(String[] args) {
        System.out.println(minEditDistance("Kindle", "Ainelw"));
    }
}

2. C/C++ version

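#include <cstdio>
#include <cstring>

char s1[1000], s2[1000];

// Smallest of three integers.
int min(int a, int b, int c) {
    int t = a < b ? a : b;
    return t < c ? t : c;
}

void editDistance(int len1, int len2) {
    // d[i][j] is the edit distance between the first i characters of s1
    // and the first j characters of s2.
    int** d = new int*[len1 + 1];
    for (int k = 0; k <= len1; k++)
        d[k] = new int[len2 + 1];
    int i, j;
    for (i = 0; i <= len1; i++)
        d[i][0] = i;
    for (j = 0; j <= len2; j++)
        d[0][j] = j;
    for (i = 1; i <= len1; i++)
        for (j = 1; j <= len2; j++) {
            // The strings are 0-indexed, so cell (i, j) compares s1[i-1] with s2[j-1].
            int cost = s1[i - 1] == s2[j - 1] ? 0 : 1;
            int deletion = d[i - 1][j] + 1;
            int insertion = d[i][j - 1] + 1;
            int substitution = d[i - 1][j - 1] + cost;
            d[i][j] = min(deletion, insertion, substitution);
        }
    printf("%d\n", d[len1][len2]);
    // Free every row, then the array of row pointers.
    for (int k = 0; k <= len1; k++)
        delete[] d[k];
    delete[] d;
}

int main() {
    // Read pairs of words until input ends; the width limit keeps each
    // word inside the 1000-byte buffers.
    while (scanf("%999s%999s", s1, s2) == 2)
        editDistance((int)strlen(s1), (int)strlen(s2));
    return 0;
}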
