Edit Distance
Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (Each operation counts as 1 step.)
You have the following 3 operations permitted on a word:
A) Insert a character
B) Delete a character
C) Replace a character
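Suppose we want to change str1 into str2.
sstr1(i) is the prefix of str1 covering the index range [0, i), and sstr1(0) is the empty string.
sstr2(j) is the prefix of str2, defined in the same way.
d(i, j) denotes the edit distance from sstr1(i) to sstr2(j).
First, the base cases are obvious: d(0, t) = t for 0 <= t <= str2.size(), because t characters must be inserted, and d(k, 0) = k for 0 <= k <= str1.size(), because k characters must be deleted.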
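The base cases can be sketched as a small table initializer. This is only an illustration: the helper name initTable and the choice of a vector of vectors are assumptions, not part of the original analysis.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: allocates the (m+1) x (n+1) table d and fills only
// the base cases d(k, 0) = k and d(0, t) = t. All other cells stay 0.
std::vector<std::vector<int>> initTable(const std::string& str1,
                                        const std::string& str2) {
    const std::size_t m = str1.size();
    const std::size_t n = str2.size();
    std::vector<std::vector<int>> d(m + 1, std::vector<int>(n + 1, 0));
    for (std::size_t k = 0; k <= m; ++k)
        d[k][0] = static_cast<int>(k);  // delete k characters to reach the empty string
    for (std::size_t t = 0; t <= n; ++t)
        d[0][t] = static_cast<int>(t);  // insert t characters into the empty string
    return d;
}
```

The first column counts deletions and the first row counts insertions; every other cell is left for the recurrence to fill.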
When we want to calculate d(i, j), that is, the edit distance between sstr1(i) and sstr2(j), with i >= 1 and j >= 1,
sstr1(i) has the form somestr1c and sstr2(j) has the form somestr2d, where c and d are the last characters.
The edit distance from somestr1 to somestr2 is already known: d(i-1, j-1)
The edit distance from somestr1c to somestr2 is already known: d(i, j-1)
The edit distance from somestr1 to somestr2d is already known: d(i-1, j)
Then we can use these three values to derive d(i, j):
If c == d, the edit distance is simply d(i-1, j-1).
If c != d, the situation is a little more complicated:
- If c is replaced with d, the edit distance is the distance from somestr1 to somestr2 plus 1, that is, d(i-1, j-1) + 1
- If the character d is appended after c, the edit distance is the distance from somestr1c to somestr2 plus 1, that is, d(i, j-1) + 1
- If c is deleted, what remains is to edit somestr1 into somestr2d, so the distance is d(i-1, j) + 1
In the end, you only need to take the minimum of the three and adopt the corresponding edit operation.
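The recurrence above can be written directly as a full two-dimensional table before any space optimization. This is a minimal sketch for illustration; the function name editDistanceTable is an assumption and does not appear in the original analysis.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the full-table recurrence (hypothetical function name).
// d[i][j] is the edit distance from the first i characters of str1
// to the first j characters of str2.
int editDistanceTable(const std::string& str1, const std::string& str2) {
    const std::size_t m = str1.size();
    const std::size_t n = str2.size();
    std::vector<std::vector<int>> d(m + 1, std::vector<int>(n + 1, 0));
    for (std::size_t i = 0; i <= m; ++i) d[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= n; ++j) d[0][j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            if (str1[i - 1] == str2[j - 1]) {
                // Last characters match (c == d): no extra step needed.
                d[i][j] = d[i - 1][j - 1];
            } else {
                d[i][j] = 1 + std::min({d[i - 1][j - 1],   // replace c with d
                                        d[i][j - 1],       // append d after c
                                        d[i - 1][j]});     // delete c
            }
        }
    }
    return d[m][n];
}
```

For example, editDistanceTable("horse", "ros") returns 3. This version uses O(m * n) memory, which makes it easy to check against the recurrence cell by cell.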
The algorithm analysis above comes from the blog uniEagle.
In the implementation below, I use a rolling array instead of a two-dimensional array, so only one array is allocated.
The variable dist_i1_j1 represents d(i-1, j-1)
dist_i_j represents d(i, j)
dist_i1_j represents d(i-1, j)
dist_i_j1 represents d(i, j-1)
In addition, the variables dist_i1_j and dist_i_j in the code below could both be omitted.
However, keeping them makes the code more intuitive.
The actual execution time on LeetCode is 28 ms.
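#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int minDistance(string word1, string word2) {
        // Keep the shorter word as word2 so the rolling array is as small as possible.
        if (word1.size() < word2.size())
            word1.swap(word2);
        if (!word2.size())
            return word1.size();
        // dist[j] stores column j+1 of the most recently completed row of d.
        // It starts as row 0, i.e. d(0, j+1) = j+1.
        vector<int> dist(word2.size());
        for (int j = 0; j < dist.size(); j++)
            dist[j] = j + 1;
        for (int i = 0; i < word1.size(); i++) {
            int dist_i1_j1 = i;              // column 0 of the previous row
            int dist_i_j1 = dist_i1_j1 + 1;  // column 0 of the current row
            for (int j = 0; j < word2.size(); j++) {
                const int dist_i1_j = dist[j];  // not yet overwritten, so still the previous row
                const int dist_i_j = word1[i] == word2[j]
                    ? dist_i1_j1
                    : min(min(dist_i1_j1, dist_i1_j), dist_i_j1) + 1;
                dist_i_j1 = dist_i_j;   // becomes d(i, j-1) for the next column
                dist_i1_j1 = dist[j];   // becomes d(i-1, j-1) for the next column
                dist[j] = dist_i_j;     // overwrite with the current row's value
            }
        }
        return dist.back();
    }
};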