Minimum Edit Distance Algorithm
Generally, when a customer searches for an item that does not exist, an e-commerce site returns the most similar item and asks "Are you looking for XXX?". This relies on an algorithm that finds, among a large number of existing strings, the one closest to the input string. The measure of closeness it uses is called the minimum edit distance (also known as the shortest edit distance).
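To make the "are you looking for XXX?" idea concrete, here is a minimal sketch that scans a list of known item names and returns the one with the smallest edit distance to the query. The names editDistance and closestMatch are illustrative and do not come from the original text, and the distance function is a compact two-row variant of the table method explained in the sections that follow. A linear scan like this is only an assumption about how the lookup could work; it says nothing about how a real site implements it.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Edit distance between a and b (illustrative helper).
// Only two rows of the table are kept: prev is row i-1, cur is row i.
int editDistance(const std::string& a, const std::string& b) {
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    std::vector<int> prev(n + 1), cur(n + 1);
    for (int j = 0; j <= n; ++j) prev[j] = j;  // empty prefix of a: j insertions
    for (int i = 1; i <= m; ++i) {
        cur[0] = i;                            // empty prefix of b: i deletions
        for (int j = 1; j <= n; ++j) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost});
        }
        std::swap(prev, cur);                  // the finished row becomes prev
    }
    return prev[n];
}

// Returns the catalog entry closest to query, or an empty string if the
// catalog is empty. On ties the earliest entry wins.
std::string closestMatch(const std::string& query,
                         const std::vector<std::string>& catalog) {
    std::string best;
    int bestDistance = -1;                     // -1 means "no candidate yet"
    for (const std::string& item : catalog) {
        const int d = editDistance(query, item);
        if (bestDistance < 0 || d < bestDistance) {
            bestDistance = d;
            best = item;
        }
    }
    return best;
}
```

With a catalog of "iphone", "ipad" and "macbook", the misspelled query "iphnoe" is matched to "iphone", because two replacements turn one into the other while the other entries need more operations.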
The algorithm is based on dynamic programming. The idea is described below:
Description:
Let A and B be two strings. Convert string A into string B using the minimum number of character operations. The character operations allowed are:
(1) Delete a character;
(2) Insert a character;
(3) Change one character to another.
The minimum number of character operations needed to convert string A into string B is called the edit distance from A to B and is written d(A, B). Design an efficient algorithm to calculate the edit distance between any two strings A and B.
Requirements:
Input: the first line is string A, and the second line is string B.
Output: the edit distance d(A, B) between string A and string B.
Idea:
Use a two-dimensional array d[i][j] to record the edit distance between the first i characters of A and the first j characters of B. For each cell, consider the cost of the delete, insert, and replace operations on one of the strings, and keep the minimum cost.
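Specific algorithm:
First, initialize the first row and the first column (d[0][j] = j and d[i][0] = i), and then calculate each value d[i][j] as follows: d[i][j] = min(d[i-1][j] + 1, d[i][j-1] + 1, d[i-1][j-1] + (s1[i] == s2[j] ? 0 : 1)). Here s1[i] and s2[j] denote the i-th and j-th characters counting from 1; with zero-based strings, as in the code below, they are s1[i-1] and s2[j-1].
The value in the last row and last column is the minimum edit distance.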
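The table-filling step can be sketched as follows. This is an illustration under stated assumptions, not the author's code: buildTable is a hypothetical helper name, the table is a std::vector rather than a global array, and the strings are indexed from zero.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Builds the full (m+1) x (n+1) table. d[i][j] is the edit distance between
// the first i characters of s1 and the first j characters of s2.
std::vector<std::vector<int>> buildTable(const std::string& s1,
                                         const std::string& s2) {
    const int m = static_cast<int>(s1.size());
    const int n = static_cast<int>(s2.size());
    std::vector<std::vector<int>> d(m + 1, std::vector<int>(n + 1, 0));
    for (int j = 0; j <= n; ++j) d[0][j] = j;  // first row: j insertions
    for (int i = 0; i <= m; ++i) d[i][0] = i;  // first column: i deletions
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            const int cost = (s1[i - 1] == s2[j - 1]) ? 0 : 1;
            d[i][j] = std::min({d[i - 1][j] + 1,           // delete from s1
                                d[i][j - 1] + 1,           // insert into s1
                                d[i - 1][j - 1] + cost});  // keep or replace
        }
    }
    return d;
}
```

For s1 = "ab" and s2 = "acb" the rows of the table are 0 1 2 3, then 1 0 1 2, then 2 1 1 1. The bottom-right value is 1, which corresponds to inserting the single character c.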
The most basic version is a plain recursive algorithm:
// count is a global call counter declared elsewhere in the program.
int med(const string &x, int i, const string &y, int j)
{
    count++;
    if (i == 0)                 // x is used up: insert the remaining j characters
        return j;
    if (j == 0)                 // y is used up: delete the remaining i characters
        return i;
    if (x[i-1] == y[j-1])       // last characters match: no extra cost
        return med(x, i-1, y, j-1);
    else {
        int a = med(x, i, y, j-1);      // insert
        int b = med(x, i-1, y, j);      // delete
        int c = med(x, i-1, y, j-1);    // replace
        int temp;
        return (((temp = (a < b ? a : b)) < c ? temp : c) + 1);
    }
}
This is the recursive algorithm with memoization (a backup table that stores results already computed):
// backup is a global table declared elsewhere; every entry must be set to -1
// before the first call.
int med1(const string &x, int i, const string &y, int j)
{
    if (backup[i][j] != -1)     // already computed
        return backup[i][j];
    count++;
    if (i == 0) {
        backup[i][j] = j;
        return j;
    }
    if (j == 0) {
        backup[i][j] = i;
        return i;
    }
    if (x[i-1] == y[j-1]) {
        // Store the result under (i, j) so that this call is memoized as well.
        backup[i][j] = med1(x, i-1, y, j-1);
        return backup[i][j];
    }
    else {
        int a = med1(x, i, y, j-1) + 1;
        int b = med1(x, i-1, y, j) + 1;
        int c = med1(x, i-1, y, j-1) + 1;
        int temp;
        backup[i][j] = (temp = a < b ? a : b) < c ? temp : c;
        return backup[i][j];    // minimum of all three, not just min(a, b)
    }
}
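To see what the backup table buys, the sketch below counts how often each recursive variant actually does work. It is a self-contained illustration, not the author's test program: naiveCalls, memoCalls, naive and memoized are hypothetical names standing in for the global count and the two functions above, and the memo table is passed as a std::vector instead of being a global.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical counters, one per variant.
long naiveCalls = 0;
long memoCalls = 0;

// Plain recursion: the same (i, j) pair may be recomputed many times.
int naive(const std::string& x, int i, const std::string& y, int j) {
    ++naiveCalls;
    if (i == 0) return j;
    if (j == 0) return i;
    if (x[i - 1] == y[j - 1]) return naive(x, i - 1, y, j - 1);
    return std::min({naive(x, i, y, j - 1),
                     naive(x, i - 1, y, j),
                     naive(x, i - 1, y, j - 1)}) + 1;
}

// Memoized recursion: memo[i][j] == -1 means "not computed yet".
// memoCalls counts only calls that miss the table.
int memoized(const std::string& x, int i, const std::string& y, int j,
             std::vector<std::vector<int>>& memo) {
    if (memo[i][j] != -1) return memo[i][j];
    ++memoCalls;
    int result;
    if (i == 0) {
        result = j;
    } else if (j == 0) {
        result = i;
    } else if (x[i - 1] == y[j - 1]) {
        result = memoized(x, i - 1, y, j - 1, memo);
    } else {
        result = std::min({memoized(x, i, y, j - 1, memo),
                           memoized(x, i - 1, y, j, memo),
                           memoized(x, i - 1, y, j - 1, memo)}) + 1;
    }
    memo[i][j] = result;  // every computed state is stored exactly once
    return result;
}
```

The memo table must have (m+1) rows and (n+1) columns, all initialized to -1. Since each state is stored once, memoCalls can never exceed (m+1)(n+1), whereas naiveCalls grows much faster when the two strings share few characters.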
This is the non-recursive (iterative) algorithm:
// m and n are the lengths of x and y; backup is the same global table.
int med2(const string &x, int m, const string &y, int n)
{
    for (int i = 0; i <= n; ++i)    // first row: empty prefix of x
        backup[0][i] = i;
    for (int i = 0; i <= m; ++i)    // first column: empty prefix of y
        backup[i][0] = i;
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            if (x[i-1] == y[j-1])
                backup[i][j] = backup[i-1][j-1];
            else {
                int a = backup[i-1][j];      // delete
                int b = backup[i][j-1];      // insert
                int c = backup[i-1][j-1];    // replace
                int temp;
                backup[i][j] = ((temp = a < b ? a : b) < c ? temp : c) + 1;
            }
        }
    }
    return backup[m][n];
}