[Algorithm] String edit distance

Last Update:2014-07-01 Source: Internet

Author: User


A problem from The Beauty of Programming

Many programs make heavy use of strings. Given two different strings, we would like a way to measure how similar they are. We define a set of operations that turn two different strings into the same string. The operations are:

1. Modify a character (for example, replace "a" with "b");

2. Add a character (for example, change "abdd" to "aebdd");

3. Delete a character (for example, change "travelling" to "traveling").

For example, for the strings "abcdefg" and "abcdef", we can reach the goal either by adding a "g" to the second string or by removing the "g" from the first. Both approaches need only one operation. The minimum number of operations required is defined as the distance between the two strings, and the similarity is the reciprocal of "distance + 1". That is, the distance between "abcdefg" and "abcdef" is 1, and the similarity is 1/2 = 0.5. Here we only consider the edit distance itself.
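The similarity definition above can be written as a one-line helper. This is only a small sketch: the function name `similarity` is an assumption made for illustration, and it expects a distance that has already been computed.

```cpp
// Similarity as defined in the text: the reciprocal of (distance + 1).
// The name `similarity` is illustrative and does not come from the original article.
double similarity(unsigned int distance) {
    return 1.0 / (static_cast<double>(distance) + 1.0);
}
```

With a distance of 1, as in the "abcdefg" and "abcdef" example, the helper gives 0.5; identical strings (distance 0) give 1.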

Analysis and solution from the original text

It is not hard to see that the distance between two strings never exceeds the sum of their lengths (we can turn both strings into empty strings using delete operations alone). Although this bound does not help compute the result, it at least tells us that the distance between any two strings is finite.

We still need to work out how to reduce this problem to smaller subproblems of the same kind. Take two strings A = xabcdae and B = xfdfa. Their first characters are the same, so we only need to compute the distance between A[2,...,7] = abcdae and B[2,...,5] = fdfa. If the first characters of the two strings differ, however, we can perform one of the following operations (lenA and lenB are the lengths of strings A and B respectively).

1. Delete the first character of string A, and then calculate the distance between A[2,...,lenA] and B[1,...,lenB].

2. Delete the first character of string B, and then calculate the distance between A[1,...,lenA] and B[2,...,lenB].

3. Modify the first character of string A to the first character of string B, and then calculate the distance between A[2,...,lenA] and B[2,...,lenB].

4. Modify the first character of string B to the first character of string A, and then calculate the distance between A[2,...,lenA] and B[2,...,lenB].

5. Add the first character of string B before the first character of string A, and then calculate the distance between A[1,...,lenA] and B[2,...,lenB].

6. Add the first character of string A before the first character of string B, and then calculate the distance between A[2,...,lenA] and B[1,...,lenB].

In this problem, we do not care what the two strings look like once they have become equal. The six operations above can therefore be merged into three cases:

1. After one step, turn A[2,...,lenA] and B[1,...,lenB] into equal strings.

2. After one step, turn A[2,...,lenA] and B[2,...,lenB] into equal strings.

3. After one step, turn A[1,...,lenA] and B[2,...,lenB] into equal strings.
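The three merged cases translate directly into a recursive function. The following is a minimal sketch, not the book's listing: the name `calcDistance` and the use of `std::string` with 0-based start indices are assumptions made for illustration (the text counts characters from 1).

```cpp
#include <algorithm>
#include <string>

// Naive recursion over the three merged cases.
// aBegin and bBegin are 0-based start positions of the remaining suffixes.
// `calcDistance` is a hypothetical name used only for this sketch.
int calcDistance(const std::string& a, std::size_t aBegin,
                 const std::string& b, std::size_t bBegin) {
    // One string is exhausted: every remaining character of the other
    // string costs one add or delete operation.
    if (aBegin >= a.size()) return static_cast<int>(b.size() - bBegin);
    if (bBegin >= b.size()) return static_cast<int>(a.size() - aBegin);

    // Equal first characters cost nothing; move on in both strings.
    if (a[aBegin] == b[bBegin]) {
        return calcDistance(a, aBegin + 1, b, bBegin + 1);
    }

    // Different first characters: spend one step, then try the three cases.
    int case1 = calcDistance(a, aBegin + 1, b, bBegin);      // A[2..], B[1..]
    int case2 = calcDistance(a, aBegin + 1, b, bBegin + 1);  // A[2..], B[2..]
    int case3 = calcDistance(a, aBegin, b, bBegin + 1);      // A[1..], B[2..]
    return std::min({case1, case2, case3}) + 1;
}
```

Calling `calcDistance(a, 0, b, 0)` gives the distance between the whole strings. The running time grows exponentially with the string lengths, because the same pair of suffixes is reached along many different paths.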

If you are familiar with dynamic programming, it is easy to see that dynamic programming is the best fit here. With plain recursion, many subproblems are computed repeatedly.
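One way to avoid the repeated work is to cache each subproblem the first time it is solved. The sketch below is a top-down (memoized) variant under stated assumptions: the names `memoDistance` and `editDistanceMemo`, and the choice of `-1` as the "not computed yet" marker, are illustrative and not taken from the original article.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// memo[i][j] caches the distance between the suffixes a[i..] and b[j..]
// (0-based); -1 means "not computed yet".
// Both function names are hypothetical, chosen for this sketch.
int memoDistance(const std::string& a, std::size_t i,
                 const std::string& b, std::size_t j,
                 std::vector<std::vector<int>>& memo) {
    if (i == a.size()) return static_cast<int>(b.size() - j);
    if (j == b.size()) return static_cast<int>(a.size() - i);

    int& cached = memo[i][j];
    if (cached >= 0) return cached;  // subproblem already solved

    if (a[i] == b[j]) {
        cached = memoDistance(a, i + 1, b, j + 1, memo);
    } else {
        cached = 1 + std::min({memoDistance(a, i + 1, b, j, memo),
                               memoDistance(a, i + 1, b, j + 1, memo),
                               memoDistance(a, i, b, j + 1, memo)});
    }
    return cached;
}

int editDistanceMemo(const std::string& a, const std::string& b) {
    std::vector<std::vector<int>> memo(a.size(),
                                       std::vector<int>(b.size(), -1));
    return memoDistance(a, 0, b, 0, memo);
}
```

Each pair (i, j) is now solved at most once, so the work is proportional to the product of the two string lengths.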

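#include <iostream>
using namespace std;

// Both strings must be shorter than maxSize characters.
const int maxSize = 50;

// dp[i][j] is the edit distance between the first i characters of s1
// and the first j characters of s2.
unsigned int dp[maxSize][maxSize];

unsigned int dist(const char* s1, int len1, const char* s2, int len2) {
    // If one string is empty, the distance is the length of the other.
    if (!len1) {
        return len2;
    }
    if (!len2) {
        return len1;
    }
    // Base cases: turning an empty prefix into a prefix of length k takes k steps.
    dp[0][0] = 0;
    for (int j = 0; j < len2; ++j) {
        dp[0][j + 1] = j + 1;
    }
    for (int i = 0; i < len1; ++i) {
        dp[i + 1][0] = i + 1;
    }
    for (int i = 0; i < len1; ++i) {
        for (int j = 0; j < len2; ++j) {
            unsigned int t;
            if (s1[i] == s2[j]) {
                // Equal characters cost nothing.
                t = dp[i][j];
            } else {
                // Modify s1[i] into s2[j].
                t = dp[i][j] + 1;
                // Delete s1[i].
                if (t > dp[i][j + 1] + 1) {
                    t = dp[i][j + 1] + 1;
                }
                // Add s2[j].
                if (t > dp[i + 1][j] + 1) {
                    t = dp[i + 1][j] + 1;
                }
            }
            dp[i + 1][j + 1] = t;
        }
    }
    return dp[len1][len2];
}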
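The listing above stores the whole table in a fixed 50 x 50 array, which limits the input length. As a hedged alternative, the sketch below sizes its storage from the inputs and keeps only two rows of the table, since each row depends only on the one before it. The name `editDistance` and the use of `std::string` and `std::vector` are assumptions made for illustration, not part of the original code.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Bottom-up edit distance with two rolling rows.
// prev[j] holds the distance between the first i characters of s1 and the
// first j characters of s2; curr is the row for i + 1.
// `editDistance` is a hypothetical name chosen for this sketch.
unsigned int editDistance(const std::string& s1, const std::string& s2) {
    std::vector<unsigned int> prev(s2.size() + 1);
    std::vector<unsigned int> curr(s2.size() + 1);

    // Row 0: an empty prefix of s1 needs j additions to match j characters of s2.
    for (std::size_t j = 0; j <= s2.size(); ++j) {
        prev[j] = static_cast<unsigned int>(j);
    }

    for (std::size_t i = 0; i < s1.size(); ++i) {
        // Column 0: i + 1 deletions turn the prefix of s1 into an empty string.
        curr[0] = static_cast<unsigned int>(i + 1);
        for (std::size_t j = 0; j < s2.size(); ++j) {
            if (s1[i] == s2[j]) {
                curr[j + 1] = prev[j];
            } else {
                // Modify, delete, or add: take the cheapest and pay one step.
                curr[j + 1] = 1u + std::min({prev[j], prev[j + 1], curr[j]});
            }
        }
        std::swap(prev, curr);
    }
    return prev[s2.size()];
}
```

The memory use drops from the full table to two rows of length `s2.size() + 1`, while the number of steps stays proportional to the product of the two lengths.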

See http://www.cnblogs.com/zhengyuhong/p/3645059.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
