Reprinted from: http://www.cnblogs.com/biyeymyhjob/archive/2012/09/28/2707343.html
Edit distance: concept description
Edit distance, also known as Levenshtein distance, is the minimum number of edit operations required to transform one string into another. The permitted edit operations are replacing one character with another, inserting a character, and deleting a character.
For example, the word kitten can be turned into sitting in three edits:
- sitten (k→s)
- sittin (e→i)
- sitting (→g)
Russian scientist Vladimir Levenshtein introduced the concept in 1965.
Problem: find the edit distance between two strings, that is, the minimum number of steps needed to turn a string s1 into a string s2. Three operations are allowed: insert a character, delete a character, and modify a character.
Analysis:
First define a function edit(i, j), which represents the edit distance from the length-i prefix of the first string to the length-j prefix of the second string.
This gives the following dynamic programming recurrence:
- If i == 0 and j == 0, edit(i, j) = 0
- If i == 0 and j > 0, edit(i, j) = j
- If i > 0 and j == 0, edit(i, j) = i
- If i ≥ 1 and j ≥ 1, edit(i, j) = min{edit(i-1, j) + 1, edit(i, j-1) + 1, edit(i-1, j-1) + f(i, j)}, where f(i, j) = 1 when the i-th character of the first string differs from the j-th character of the second string, and f(i, j) = 0 otherwise.
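The recurrence above can also be written almost literally as a recursive function with memoization. The following is only a sketch under some assumptions: the names `editMemo` and `editDistanceMemo` are hypothetical and not from the original article, and a table of -1 values is used to mark cells that have not been computed yet.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Top-down form of the recurrence: edit(i, j) is the distance between the
// first i characters of s1 and the first j characters of s2.
// memo[i][j] == -1 means "not computed yet" (an assumption of this sketch).
int editMemo(const std::string& s1, const std::string& s2,
             std::size_t i, std::size_t j,
             std::vector<std::vector<int>>& memo) {
    if (i == 0) return static_cast<int>(j);  // covers edit(0, 0) = 0 and edit(0, j) = j
    if (j == 0) return static_cast<int>(i);  // edit(i, 0) = i
    int& cell = memo[i][j];
    if (cell != -1) return cell;             // already computed
    int f = (s1[i - 1] == s2[j - 1]) ? 0 : 1;  // f(i, j) from the formula
    cell = std::min({editMemo(s1, s2, i - 1, j, memo) + 1,
                     editMemo(s1, s2, i, j - 1, memo) + 1,
                     editMemo(s1, s2, i - 1, j - 1, memo) + f});
    return cell;
}

// Hypothetical convenience wrapper: allocates the memo table and asks for
// edit(|s1|, |s2|).
int editDistanceMemo(const std::string& s1, const std::string& s2) {
    std::vector<std::vector<int>> memo(
        s1.size() + 1, std::vector<int>(s2.size() + 1, -1));
    return editMemo(s1, s2, s1.size(), s2.size(), memo);
}
```

Calling `editDistanceMemo("kitten", "sitting")` should give 3, matching the three edits listed in the kitten example. The recursion depth grows with the string lengths, so for long inputs a table filled row by row is usually the safer choice.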
|   | 0 | F | A | I | L | I | N | G |
|---|---|---|---|---|---|---|---|---|
| 0 |   |   |   |   |   |   |   |   |
| S |   |   |   |   |   |   |   |   |
| A |   |   |   |   |   |   |   |   |
| I |   |   |   |   |   |   |   |   |
| L |   |   |   |   |   |   |   |   |
| N |   |   |   |   |   |   |   |   |
|   | 0 | F | A | I | L | I | N | G |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| S | 1 |   |   |   |   |   |   |   |
| A | 2 |   |   |   |   |   |   |   |
| I | 3 |   |   |   |   |   |   |   |
| L | 4 |   |   |   |   |   |   |   |
| N | 5 |   |   |   |   |   |   |   |
Computing edit(1, 1): edit(0, 1) + 1 == 2, edit(1, 0) + 1 == 2, and edit(0, 0) + f(1, 1) == 0 + 1 == 1, so min(edit(0, 1) + 1, edit(1, 0) + 1, edit(0, 0) + f(1, 1)) == 1 and therefore edit(1, 1) == 1. Continuing in the same way:
|   | 0 | F | A | I | L | I | N | G |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| S | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| A | 2 | 2 |   |   |   |   |   |   |
| I | 3 |   |   |   |   |   |   |   |
| L | 4 |   |   |   |   |   |   |   |
| N | 5 |   |   |   |   |   |   |   |
Computing edit(2, 2): edit(2, 1) + 1 == 3, edit(1, 2) + 1 == 3, and edit(1, 1) + f(2, 2) == 1 + 0 == 1, so edit(2, 2) == 1. Here f(2, 2) == 0 because the second character of both strings is 'A'. The neighbouring pair s1[2] == 'A' and s2[1] == 'F' does not match, and swapping adjacent characters is not one of the three allowed operations, so a swap is not counted when taking the minimum. Continuing in this way, the final matrix is:
|   | 0 | F | A | I | L | I | N | G |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| S | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| A | 2 | 2 | 1 | 2 | 3 | 4 | 5 | 6 |
| I | 3 | 3 | 2 | 1 | 2 | 3 | 4 | 5 |
| L | 4 | 4 | 3 | 2 | 1 | 2 | 3 | 4 |
| N | 5 | 5 | 4 | 3 | 2 | 2 | 2 | 3 |
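To check a hand-filled matrix like the one above, the whole table can be built and inspected cell by cell. This is a minimal sketch, assuming a hypothetical helper `editTable` (not part of the original article) that returns the table as nested `std::vector`s, with rows following the first string and columns following the second.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Builds the full (|s1| + 1) x (|s2| + 1) table.
// t[i][j] is the edit distance between the first i characters of s1
// and the first j characters of s2.
std::vector<std::vector<int>> editTable(const std::string& s1,
                                        const std::string& s2) {
    const std::size_t m = s1.size();
    const std::size_t n = s2.size();
    std::vector<std::vector<int>> t(m + 1, std::vector<int>(n + 1, 0));

    // Base cases: column 0 and row 0.
    for (std::size_t i = 0; i <= m; ++i) t[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= n; ++j) t[0][j] = static_cast<int>(j);

    // Fill the remaining cells row by row using the recurrence.
    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            int f = (s1[i - 1] == s2[j - 1]) ? 0 : 1;
            t[i][j] = std::min({t[i - 1][j] + 1,       // delete
                                t[i][j - 1] + 1,       // insert
                                t[i - 1][j - 1] + f}); // replace or keep
        }
    }
    return t;
}
```

For `editTable("SAILN", "FAILING")`, the last row should be 5 5 4 3 2 2 2 3, and the bottom-right cell, 3, is the edit distance between the two strings.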
Program (C++): pay attention to how the two-dimensional array is dynamically allocated and released!
    #include <iostream>
    #include <string>
    using namespace std;

    int min(int a, int b)
    {
        return a < b ? a : b;
    }

    int edit(string str1, string str2)
    {
        int max1 = str1.size();
        int max2 = str2.size();

        // Allocate a (max1 + 1) x (max2 + 1) table, one row at a time.
        int **ptr = new int*[max1 + 1];
        for (int i = 0; i < max1 + 1; i++)
        {
            ptr[i] = new int[max2 + 1];
        }

        // Base cases: first column and first row.
        for (int i = 0; i < max1 + 1; i++)
        {
            ptr[i][0] = i;
        }
        for (int i = 0; i < max2 + 1; i++)
        {
            ptr[0][i] = i;
        }

        // Fill the remaining cells with the recurrence.
        for (int i = 1; i < max1 + 1; i++)
        {
            for (int j = 1; j < max2 + 1; j++)
            {
                int d;
                int temp = min(ptr[i-1][j] + 1, ptr[i][j-1] + 1);
                if (str1[i-1] == str2[j-1])
                {
                    d = 0;
                }
                else
                {
                    d = 1;
                }
                ptr[i][j] = min(temp, ptr[i-1][j-1] + d);
            }
        }

        // Print the table.
        cout << "**************************" << endl;
        for (int i = 0; i < max1 + 1; i++)
        {
            for (int j = 0; j < max2 + 1; j++)
            {
                cout << ptr[i][j] << " ";
            }
            cout << endl;
        }
        cout << "**************************" << endl;
        int dis = ptr[max1][max2];

        // Release each row first, then the array of row pointers.
        for (int i = 0; i < max1 + 1; i++)
        {
            delete[] ptr[i];
            ptr[i] = nullptr;
        }
        delete[] ptr;
        ptr = nullptr;
        return dis;
    }

    int main(void)
    {
        string str1 = "sailn";
        string str2 = "failing";
        int r = edit(str1, str2);
        cout << "the dis is: " << r << endl;
        return 0;
    }
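If the full table does not need to be printed, the manual allocation and release can be avoided altogether. The sketch below is a variation and not the original author's program: it assumes a hypothetical function `editTwoRows` that keeps only the previous and current rows in `std::vector`s, which works because each cell depends only on the row above and the cell to its left. The vectors free their memory automatically when the function returns.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Same recurrence as the full-table version, but only two rows are stored.
int editTwoRows(const std::string& s1, const std::string& s2) {
    const std::size_t m = s1.size();
    const std::size_t n = s2.size();
    std::vector<int> prev(n + 1), curr(n + 1);

    // Row 0: edit(0, j) = j.
    for (std::size_t j = 0; j <= n; ++j) prev[j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= m; ++i) {
        curr[0] = static_cast<int>(i);  // column 0: edit(i, 0) = i
        for (std::size_t j = 1; j <= n; ++j) {
            int f = (s1[i - 1] == s2[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,       // edit(i-1, j) + 1
                                curr[j - 1] + 1,   // edit(i, j-1) + 1
                                prev[j - 1] + f}); // edit(i-1, j-1) + f(i, j)
        }
        std::swap(prev, curr);  // the row just filled becomes "previous"
    }
    return prev[n];  // after the last swap, prev holds the final row
}
```

For example, `editTwoRows("sailn", "failing")` should return 3, the same value as the bottom-right cell of the matrix. The trade-off is that the intermediate rows are discarded, so the table itself cannot be printed afterwards.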
Execution result: