Minimum Edit Distance Algorithm

Source: Internet
Author: User

When a customer searches for an item that does not exist, an e-commerce site typically returns the most similar item and asks "Are you looking for XXX?". This relies on the minimum edit distance algorithm, which can be used to find, among a large number of existing strings, the one closest to the input string. The measure of closeness it uses is called the edit distance.
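As a rough illustration of the "are you looking for XXX?" lookup, the sketch below scans a list of known strings and keeps the one with the smallest edit distance to the query. The names edit_distance, closest_match and catalogue are hypothetical, the distance function is the tabular version described later in this article, and a real search service would likely use something faster than a linear scan.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Edit distance between a and b (insert, delete, replace; each costs 1).
int edit_distance(const std::string& a, const std::string& b) {
    const std::size_t m = a.size(), n = b.size();
    // d[i][j] = distance between the first i chars of a and the first j of b.
    std::vector<std::vector<int>> d(m + 1, std::vector<int>(n + 1, 0));
    for (std::size_t i = 0; i <= m; ++i) d[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= n; ++j) d[0][j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            d[i][j] = std::min({d[i - 1][j] + 1,          // delete
                                d[i][j - 1] + 1,          // insert
                                d[i - 1][j - 1] + cost}); // replace or keep
        }
    }
    return d[m][n];
}

// Hypothetical helper: returns the catalogue entry closest to `query`,
// or an empty string if the catalogue is empty. Ties keep the earliest entry.
std::string closest_match(const std::string& query,
                          const std::vector<std::string>& catalogue) {
    std::string best;
    int best_distance = -1;  // -1 means "nothing seen yet"
    for (const std::string& item : catalogue) {
        const int dist = edit_distance(query, item);
        if (best_distance < 0 || dist < best_distance) {
            best_distance = dist;
            best = item;
        }
    }
    return best;
}
```

For example, with a catalogue of "ipad", "iphone" and "headphones", the misspelled query "iphnoe" is closest to "iphone".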

The algorithm is based on dynamic programming. The idea is described below:

Description:

Let A and B be two strings. Convert string A into string B using the minimum number of character operations. The character operations allowed are:

(1) Delete a character;
(2) Insert a character;
(3) Change one character to another.
The minimum number of character operations needed to convert string A into string B is called the edit distance from A to B and is written D(A, B). Design an efficient algorithm to calculate the edit distance between any two strings A and B.
Requirements:
Input: the first line is string A, and the second line is string B.
Output: the edit distance D(A, B) between string A and string B.
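To make the three operations concrete, here is a small sketch that applies them by hand with std::string. The function names are made up for this illustration. Turning "kitten" into "sitting" takes two replacements and one insertion, which matches the commonly quoted edit distance of 3 for that pair; "flaw" to "lawn" takes one deletion and one insertion.

```cpp
#include <string>

// "kitten" -> "sitting" using two replacements and one insertion.
std::string kitten_to_sitting() {
    std::string s = "kitten";
    s[0] = 's';        // replace 'k' with 's': "sitten"
    s[4] = 'i';        // replace 'e' with 'i': "sittin"
    s.push_back('g');  // insert 'g' at the end: "sitting"
    return s;
}

// "flaw" -> "lawn" using one deletion and one insertion.
std::string flaw_to_lawn() {
    std::string s = "flaw";
    s.erase(0, 1);     // delete the leading 'f': "law"
    s.push_back('n');  // insert 'n' at the end: "lawn"
    return s;
}
```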

 

Ideas:

Create a two-dimensional array d[i][j] that records the edit distance between the first i characters of A (a1...ai) and the first j characters of B (b1...bj). For each cell, consider the cost of a delete, an insert, and a replace operation on one of the strings, and keep the minimum of the three.

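Specific algorithm:

First, initialize the first row and the first column: d[0][j] = j and d[i][0] = i, because converting to or from an empty prefix takes j insertions or i deletions. Then calculate every other value d[i][j] as follows:
d[i][j] = min(d[i-1][j] + 1, d[i][j-1] + 1, d[i-1][j-1] + (A[i] == B[j] ? 0 : 1));
Here A[i] and B[j] are the i-th and j-th characters counting from 1. With 0-based strings, as in the functions later in this article, they are x[i-1] and y[j-1].
The value in the last row and last column is the minimum edit distance.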
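A small worked table may help when checking the recurrence by hand. The sketch below fills the whole table for two short strings and returns it; build_table and the Table alias are names chosen for this example only. For A = "sea" and B = "eat" the bottom-right cell is 2 (delete 's', insert 't').

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Table = std::vector<std::vector<int>>;

// Fills the (|a|+1) x (|b|+1) table; row 0 and column 0 are the base cases.
Table build_table(const std::string& a, const std::string& b) {
    const std::size_t m = a.size(), n = b.size();
    Table d(m + 1, std::vector<int>(n + 1, 0));
    for (std::size_t j = 0; j <= n; ++j) d[0][j] = static_cast<int>(j);  // j insertions
    for (std::size_t i = 0; i <= m; ++i) d[i][0] = static_cast<int>(i);  // i deletions
    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            d[i][j] = std::min({d[i - 1][j] + 1,
                                d[i][j - 1] + 1,
                                d[i - 1][j - 1] + cost});
        }
    }
    return d;
}

// Table for a = "sea" (rows) and b = "eat" (columns):
//
//         ""  e  a  t
//    ""    0  1  2  3
//    s     1  1  2  3
//    e     2  1  2  3
//    a     3  2  1  2
```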


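The most basic version is a plain recursive algorithm:

#include <string>
using std::string;

int count = 0;  // number of calls made (declaration assumed; not shown in the original)

// Edit distance between the first i characters of x and the first j of y.
int med(const string &x, int i, const string &y, int j)
{
    count++;
    if(i == 0)
        return j;   // empty prefix of x: j insertions
    if(j == 0)
        return i;   // empty prefix of y: i deletions
    if(x[i-1] == y[j-1])
        return med(x, i-1, y, j-1);
    else{
        int a = med(x, i, y, j-1);      // insert
        int b = med(x, i-1, y, j);      // delete
        int c = med(x, i-1, y, j-1);    // replace
        int temp;
        return (((temp = (a < b ? a : b)) < c ? temp : c) + 1);
    }
}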


This is the recursive algorithm with memoization (the backup table):

const int MAXN = 1000;            // assumed upper bound on string length (not given in the original)
int backup[MAXN + 1][MAXN + 1];   // memo table; fill with -1 before the first call

int med1(const string &x, int i, const string &y, int j)
{
    if(backup[i][j] != -1)
        return backup[i][j];      // already computed
    count++;
    if(i == 0){
        backup[i][j] = j;
        return j;
    }
    if(j == 0){
        backup[i][j] = i;
        return i;
    }
    if(x[i-1] == y[j-1]){
        // store the result under [i][j], the subproblem being solved
        backup[i][j] = med1(x, i-1, y, j-1);
        return backup[i][j];
    }
    else{
        int a = med1(x, i, y, j-1) + 1;
        int b = med1(x, i-1, y, j) + 1;
        int c = med1(x, i-1, y, j-1) + 1;
        int temp;
        backup[i][j] = (temp = a < b ? a : b) < c ? temp : c;
        return backup[i][j];      // not temp, which only holds min(a, b)
    }
}
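To see what the memo table buys, the following sketch counts how often each version actually does work. It is a self-contained variant written for this comparison, not the article's exact code: the counters are passed by reference instead of using a global, the memo table is a std::vector, and the names naive, memoized, CallCounts and compare are hypothetical. As in the article, the memoized counter is only incremented when a cell has not been computed yet, so it can never exceed (m+1) x (n+1).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Plain recursion; `calls` counts how many times the function is entered.
int naive(const std::string& x, int i, const std::string& y, int j, long& calls) {
    ++calls;
    if (i == 0) return j;
    if (j == 0) return i;
    if (x[i - 1] == y[j - 1]) return naive(x, i - 1, y, j - 1, calls);
    return 1 + std::min({naive(x, i, y, j - 1, calls),
                         naive(x, i - 1, y, j, calls),
                         naive(x, i - 1, y, j - 1, calls)});
}

// Same recursion with a memo table; -1 marks "not computed yet".
// `calls` counts only the entries that were not answered from the table.
int memoized(const std::string& x, int i, const std::string& y, int j,
             std::vector<std::vector<int>>& memo, long& calls) {
    if (memo[i][j] != -1) return memo[i][j];
    ++calls;
    int result;
    if (i == 0) {
        result = j;
    } else if (j == 0) {
        result = i;
    } else if (x[i - 1] == y[j - 1]) {
        result = memoized(x, i - 1, y, j - 1, memo, calls);
    } else {
        result = 1 + std::min({memoized(x, i, y, j - 1, memo, calls),
                               memoized(x, i - 1, y, j, memo, calls),
                               memoized(x, i - 1, y, j - 1, memo, calls)});
    }
    memo[i][j] = result;
    return result;
}

struct CallCounts {
    int naive_distance;
    long naive_calls;
    int memo_distance;
    long memo_calls;
};

// Runs both versions on the same pair of strings and reports the counters.
CallCounts compare(const std::string& x, const std::string& y) {
    const int m = static_cast<int>(x.size());
    const int n = static_cast<int>(y.size());
    CallCounts r{0, 0, 0, 0};
    r.naive_distance = naive(x, m, y, n, r.naive_calls);
    std::vector<std::vector<int>> memo(m + 1, std::vector<int>(n + 1, -1));
    r.memo_distance = memoized(x, m, y, n, memo, r.memo_calls);
    return r;
}
```

Two strings with no characters in common are the worst case for the plain recursion, because every call branches three ways; the memoized version still fills each table cell at most once.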


This is the non-recursive algorithm:

// Fills the global table backup bottom-up; m and n are the lengths of x and y.
// No -1 initialisation is needed here because every cell is overwritten.
int med2(const string &x, int m, const string &y, int n)
{
    for(int i = 0; i <= n; ++i)
        backup[0][i] = i;           // first row: i insertions
    for(int i = 0; i <= m; ++i)
        backup[i][0] = i;           // first column: i deletions
    for(int i = 1; i <= m; ++i){
        for(int j = 1; j <= n; ++j){
            if(x[i-1] == y[j-1])
                backup[i][j] = backup[i-1][j-1];
            else{
                int a = backup[i-1][j];     // delete
                int b = backup[i][j-1];     // insert
                int c = backup[i-1][j-1];   // replace
                int temp;
                backup[i][j] = ((temp = a < b ? a : b) < c ? temp : c) + 1;
            }
        }
    }
    return backup[m][n];
}
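Because each row of the table depends only on the row above it, the table can be reduced to two rows when only the final distance is needed. The sketch below is one possible variant of the non-recursive algorithm along those lines, not the article's code; the function name edit_distance_two_rows is made up, and it uses local vectors instead of the global table.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Edit distance using two rows of length |y|+1 instead of the full table.
int edit_distance_two_rows(const std::string& x, const std::string& y) {
    const std::size_t m = x.size(), n = y.size();
    std::vector<int> prev(n + 1), curr(n + 1);
    // Row 0: turning an empty prefix of x into the first j chars of y.
    for (std::size_t j = 0; j <= n; ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= m; ++i) {
        curr[0] = static_cast<int>(i);  // column 0: i deletions
        for (std::size_t j = 1; j <= n; ++j) {
            if (x[i - 1] == y[j - 1]) {
                curr[j] = prev[j - 1];
            } else {
                curr[j] = 1 + std::min({prev[j],       // delete
                                        curr[j - 1],   // insert
                                        prev[j - 1]}); // replace
            }
        }
        std::swap(prev, curr);  // the row just filled becomes the "row above"
    }
    return prev[n];  // after the final swap, prev holds the last row
}
```

The trade-off is that the intermediate cells are discarded, so this form cannot be used to reconstruct which operations were applied.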

