A problem from The Beauty of Programming
Many programs make heavy use of strings. Given two different strings, we would like a way to measure how similar they are. We define a set of operations that turn two different strings into the same string. The operations are:
1. Modify a character (for example, replace "a" with "b");
2. Add a character (for example, change "abdd" to "aebdd");
3. Delete a character (for example, change "travelling" to "traveling").
For example, for the strings "abcdefg" and "abcdef", we can reach the goal either by adding a "g" to the second string or by deleting the "g" from the first. Either approach takes only one operation. The number of operations required is defined as the distance between the two strings, and the similarity is the reciprocal of "distance + 1". That is, the distance between "abcdefg" and "abcdef" is 1, and the similarity is 1/2 = 0.5. Here, we only need to consider the edit distance between the strings.
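The similarity definition above can be written as a one-line helper. This is a minimal sketch; the name `similarity` is an assumption made for illustration and does not appear in the original text.

```cpp
#include <cstddef>

// Similarity as defined in the text: the reciprocal of (distance + 1).
// `similarity` is a hypothetical helper name, not taken from the original.
double similarity(std::size_t distance) {
    return 1.0 / (static_cast<double>(distance) + 1.0);
}
```

With a distance of 1, as for "abcdefg" and "abcdef", the helper returns 0.5; identical strings (distance 0) give a similarity of 1.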
Analysis and solution from the original text
It is not hard to see that the distance between two strings cannot exceed the sum of their lengths (we can turn both strings into empty strings using only the delete operation). Although this bound does not help us compute the result, it at least tells us that the distance between any two strings is finite.
We still need to work out how to reduce this problem to smaller subproblems of the same kind. Take two strings A = xabcdae and B = xfdfa. Their first characters are the same, so we only need to compute the distance between A[2,...,7] = abcdae and B[2,...,5] = fdfa. If the first characters of the two strings differ, we can perform one of the following operations (lenA and lenB are the lengths of string A and string B respectively):
1. Delete the first character of string A, and then compute the distance between A[2,...,lenA] and B[1,...,lenB].
2. Delete the first character of string B, and then compute the distance between A[1,...,lenA] and B[2,...,lenB].
3. Modify the first character of string A to the first character of string B, and then compute the distance between A[2,...,lenA] and B[2,...,lenB].
4. Modify the first character of string B to the first character of string A, and then compute the distance between A[2,...,lenA] and B[2,...,lenB].
5. Add the first character of string B before the first character of string A, and then compute the distance between A[1,...,lenA] and B[2,...,lenB].
6. Add the first character of string A before the first character of string B, and then compute the distance between A[2,...,lenA] and B[1,...,lenB].
In this problem, we do not care what the two strings look like once they have become equal. Therefore, the six operations above can be merged into three cases:
1. After one operation, turn A[2,...,lenA] and B[1,...,lenB] into equal strings.
2. After one operation, turn A[2,...,lenA] and B[2,...,lenB] into equal strings.
3. After one operation, turn A[1,...,lenA] and B[2,...,lenB] into equal strings.
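The three merged cases translate almost directly into a recursive function. The following is only a sketch of that idea, not the code from the original text: the name `naiveDist` and the use of `std::string` with 0-based indices are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Naive recursion over the suffixes a[i..] and b[j..] (0-based indices).
// `naiveDist` is an illustrative name; the running time is exponential.
std::size_t naiveDist(const std::string& a, std::size_t i,
                      const std::string& b, std::size_t j) {
    if (i == a.size()) return b.size() - j;  // a is used up: add the rest of b
    if (j == b.size()) return a.size() - i;  // b is used up: delete the rest of a
    if (a[i] == b[j]) return naiveDist(a, i + 1, b, j + 1);  // same first character
    std::size_t t1 = naiveDist(a, i + 1, b, j);      // merged case 1
    std::size_t t2 = naiveDist(a, i + 1, b, j + 1);  // merged case 2
    std::size_t t3 = naiveDist(a, i, b, j + 1);      // merged case 3
    return 1 + std::min({t1, t2, t3});               // one operation plus the best subproblem
}
```

Calling `naiveDist("abcdefg", 0, "abcdef", 0)` yields 1, matching the example in the problem statement. The function is only practical for short strings, because the same (i, j) pairs are solved again and again.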
If you are familiar with dynamic programming, it is easy to see that it is the best fit here. With plain recursion, many subproblems are computed repeatedly.
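One way to avoid the repeated work while keeping the recursive shape is to cache the answer for each pair of suffix positions. The sketch below assumes that approach (top-down memoization); the names `memoDist` and `editDistance` are hypothetical and do not come from the original text.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Top-down variant: the same recursion, but each (i, j) pair is solved once.
// `memoDist` and `editDistance` are illustrative names.
std::size_t memoDist(const std::string& a, std::size_t i,
                     const std::string& b, std::size_t j,
                     std::vector<std::vector<long>>& memo) {
    if (i == a.size()) return b.size() - j;
    if (j == b.size()) return a.size() - i;
    if (memo[i][j] >= 0) return static_cast<std::size_t>(memo[i][j]);  // cached
    std::size_t best;
    if (a[i] == b[j]) {
        best = memoDist(a, i + 1, b, j + 1, memo);
    } else {
        best = 1 + std::min({memoDist(a, i + 1, b, j, memo),
                             memoDist(a, i + 1, b, j + 1, memo),
                             memoDist(a, i, b, j + 1, memo)});
    }
    memo[i][j] = static_cast<long>(best);
    return best;
}

std::size_t editDistance(const std::string& a, const std::string& b) {
    // -1 marks a subproblem that has not been solved yet.
    std::vector<std::vector<long>> memo(a.size(), std::vector<long>(b.size(), -1));
    return memoDist(a, 0, b, 0, memo);
}
```

Each of the lenA × lenB subproblems is now computed at most once, so the running time is proportional to the product of the two lengths rather than exponential.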
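#include <iostream>
using namespace std;

const int maxSize = 50;
// dp[i][j]: distance between the first i characters of s1 and the first j of s2.
unsigned int dp[maxSize][maxSize];

// Both len1 and len2 must be smaller than maxSize.
unsigned int dist(const char* s1, int len1, const char* s2, int len2) {
    if (!len1) { return len2; }
    if (!len2) { return len1; }
    // An empty prefix and a prefix of length k are k operations apart.
    for (int i = 0; s2[i]; ++i) { dp[0][i+1] = i+1; }
    for (int i = 0; s1[i]; ++i) { dp[i+1][0] = i+1; }
    unsigned int t;
    for (int i = 0; s1[i]; ++i) {
        for (int j = 0; s2[j]; ++j) {
            if (s1[i] == s2[j]) {
                t = dp[i][j];      // same character: no extra operation
            } else {
                t = dp[i][j] + 1;  // modify one character
                if (t > dp[i][j+1] + 1) { t = dp[i][j+1] + 1; }  // delete from s1
                if (t > dp[i+1][j] + 1) { t = dp[i+1][j] + 1; }  // delete from s2
            }
            dp[i+1][j+1] = t;
        }
    }
    return dp[len1][len2];
}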
See http://www.cnblogs.com/zhengyuhong/p/3645059.html