C#: comparing the similarity of two strings (reposted)

Last Update:2018-03-21 Source: Internet

Author: User


Original address: http://www.2cto.com/kf/201202/121170.html

Data systems often need fuzzy search, but the fuzzy search provided by the database cannot sort results by relevance.

This article presents a way to measure the similarity of two strings.
Once the similarity of two strings can be computed, the data can be sorted and filtered in memory with LINQ to select the results most similar to the target string.

The similarity formula used here is:

similarity = Kq*q / (Kq*q + Kr*r + Ks*s)    (Kq > 0, Kr >= 0, Ks >= 0)

Here q is the number of characters present in both string 1 and string 2, s is the number of characters present in string 1 but not in string 2, and r is the number of characters present in string 2 but not in string 1. Kq, Kr and Ks are the weights of q, r and s respectively; for the calculations here we set Kq = 2 and Kr = Ks = 1.
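As a worked example of the formula, take the two illustrative strings "abcd" and "abef" (chosen here only for demonstration) and count individual characters, as the code in this article does. The shared characters are a and b, so q = 2; c and d occur only in string 1, so s = 2; e and f occur only in string 2, so r = 2. With Kq = 2 and Kr = Ks = 1:

```latex
\[
\text{similarity}
  = \frac{K_q\,q}{K_q\,q + K_r\,r + K_s\,s}
  = \frac{2 \cdot 2}{2 \cdot 2 + 1 \cdot 2 + 1 \cdot 2}
  = \frac{4}{8}
  = 0.5
\]
```

Because Kq is twice the other weights, a shared character counts for as much as two unmatched ones, and the score reaches 1 only when r = s = 0.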
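Based on this formula, we get the following code:

    using System.Linq;

    public static class StringSimilarityExtensions
    {
        /// <summary>
        /// Gets the similarity of two strings.
        /// </summary>
        /// <param name="sourceString">First string</param>
        /// <param name="str">Second string</param>
        /// <returns>A value from 0 (no characters in common) to 1 (same set of characters)</returns>
        public static decimal GetSimilarityWith(this string sourceString, string str)
        {
            decimal Kq = 2;
            decimal Kr = 1;
            decimal Ks = 1;

            char[] ss = sourceString.ToCharArray();
            char[] st = str.ToCharArray();

            // Get the size of the intersection. Intersect returns distinct
            // characters, so s and r are also counted over distinct characters;
            // otherwise identical strings with repeated characters score below 1.
            int q = ss.Intersect(st).Count();
            int s = ss.Distinct().Count() - q;
            int r = st.Distinct().Count() - q;

            decimal denominator = Kq * q + Kr * r + Ks * s;

            // Both strings empty: nothing to compare, and avoid dividing by zero.
            if (denominator == 0)
            {
                return 0;
            }

            return Kq * q / denominator;
        }
    }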
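The in-memory sorting and filtering with LINQ mentioned at the start could look roughly like the sketch below. It is only an illustration: the class `SimilarityRanking`, the `Rank` helper and the `minScore` threshold are hypothetical names that do not come from the original article, and the block carries its own compact copy of the formula (Kq = 2, Kr = Ks = 1, counted over distinct characters) so that it compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SimilarityRanking
{
    // Compact restatement of the formula with Kq = 2, Kr = Ks = 1,
    // counted over distinct characters (an assumption of this sketch).
    public static decimal Similarity(string a, string b)
    {
        var setA = new HashSet<char>(a);
        var setB = new HashSet<char>(b);
        int q = setA.Intersect(setB).Count();
        int s = setA.Count - q;
        int r = setB.Count - q;
        decimal denominator = 2m * q + r + s;
        return denominator == 0m ? 0m : 2m * q / denominator;
    }

    // Keeps the candidates that score at least minScore against the target,
    // ordered from most similar to least similar.
    public static List<string> Rank(IEnumerable<string> candidates, string target, decimal minScore)
    {
        return candidates
            .Select(c => new { Text = c, Score = Similarity(c, target) })
            .Where(x => x.Score >= minScore)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Text)
            .ToList();
    }
}
```

For instance, `SimilarityRanking.Rank(new[] { "apple pie", "apple", "grape", "zzz" }, "apple", 0.5m)` keeps "apple", "apple pie" and "grape" in that order and drops "zzz", which shares no characters with the target. In a real system the candidates would typically be the rows already returned by the database's fuzzy query.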

This is the basic method for calculating the similarity of strings. In practice, synonyms and near-synonyms also need to be taken into account, for example "the fastest-changing reading of love-making novels" and "the fastest-updated reading of love-making people". In a sense the two strings mean the same thing, but the method above would score them inaccurately. In practical applications we therefore need to replace synonyms and near-synonyms first, and calculate the similarity after the substitution.
For near-synonyms, the results from before and after the substitution need to be taken together to obtain the actual similarity between the two strings.
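One possible way to apply the substitution step is sketched below. Everything in it is an assumption made for illustration: the class `SynonymSimilarity`, the tiny synonym table and the plain substring replacement are not from the original article, and a real system would need a proper synonym dictionary and word segmentation. The similarity function is the same formula (Kq = 2, Kr = Ks = 1) restated so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SynonymSimilarity
{
    // Hypothetical synonym table: each variant maps to one canonical form.
    private static readonly Dictionary<string, string> Canonical =
        new Dictionary<string, string>
        {
            { "quick", "fast" },
            { "rapid", "fast" },
        };

    // Replaces every known variant with its canonical form.
    // Plain substring replacement: case-sensitive and unaware of word boundaries.
    public static string Normalize(string text)
    {
        foreach (var pair in Canonical)
        {
            text = text.Replace(pair.Key, pair.Value);
        }
        return text;
    }

    // The article's formula with Kq = 2, Kr = Ks = 1, over distinct characters.
    public static decimal Similarity(string a, string b)
    {
        var setA = new HashSet<char>(a);
        var setB = new HashSet<char>(b);
        int q = setA.Intersect(setB).Count();
        int s = setA.Count - q;
        int r = setB.Count - q;
        decimal denominator = 2m * q + r + s;
        return denominator == 0m ? 0m : 2m * q / denominator;
    }

    // Similarity computed after both strings have been normalized.
    public static decimal SimilarityWithSynonyms(string a, string b)
    {
        return Similarity(Normalize(a), Normalize(b));
    }
}
```

With this table, "quick car" and "fast car" normalize to the same string, so `SimilarityWithSynonyms` returns 1, while the plain `Similarity` of the two raw strings is lower.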


