A PHP tutorial on functions for calculating string similarity

Source: Internet
Author: User
PHP provides two functions for calculating string similarity: similar_text() and levenshtein(). Both are introduced in detail below.

similar_text: calculating the similarity of two strings
int similar_text(string $first, string $second [, float &$percent])
$first required. The first string to compare.
$second required. The second string to compare.
$percent optional. A variable, passed by reference, in which the similarity is stored as a percentage.

The similarity of the two strings is calculated as described in Oliver [1993]. Note that this implementation does not use a stack as in Oliver's pseudocode, but makes recursive calls instead, which may or may not speed up the whole process. Note also that the complexity of the algorithm is O(N**3), where N is the length of the longest string.
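The recursive idea can be sketched in plain PHP: find the first longest common substring, count its length, then repeat on the pieces to its left and to its right. This is only an illustration, assuming PHP 7.1 or later; the names longest_common_substring and similar_text_sketch are hypothetical and this is not PHP's own C implementation.

```php
<?php
// Hypothetical helper: locate the first longest common substring of $a and $b.
// Returns [start in $a, start in $b, length].
function longest_common_substring(string $a, string $b): array
{
    $best = [0, 0, 0];
    $lenA = strlen($a);
    $lenB = strlen($b);
    for ($i = 0; $i < $lenA; $i++) {
        for ($j = 0; $j < $lenB; $j++) {
            // Extend the match starting at ($i, $j) as far as it goes.
            $k = 0;
            while ($i + $k < $lenA && $j + $k < $lenB && $a[$i + $k] === $b[$j + $k]) {
                $k++;
            }
            if ($k > $best[2]) {
                $best = [$i, $j, $k];
            }
        }
    }
    return $best;
}

// Hypothetical sketch of the recursive counting step.
function similar_text_sketch(string $a, string $b): int
{
    if ($a === '' || $b === '') {
        return 0;
    }
    [$posA, $posB, $len] = longest_common_substring($a, $b);
    if ($len === 0) {
        return 0;
    }
    // Count the common part, then recurse on what lies left and right of it.
    return $len
        + similar_text_sketch(substr($a, 0, $posA), substr($b, 0, $posB))
        + similar_text_sketch(substr($a, $posA + $len), substr($b, $posB + $len));
}

echo similar_text_sketch("ABCDEFG", "AEG"), "\n"; // 3
```

Because the split depends on which common substring is found first, swapping the two arguments can change the result: with this sketch, "test" against "wert" gives 1, while "wert" against "test" gives 2.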

For example, we want to find the similarity of string ABCDEFG and the string AEG:

$first = "ABCDEFG";
$second = "AEG";

echo similar_text($first, $second);

This outputs 3. To get the similarity as a percentage, pass a variable as the third parameter:

$first = "ABCDEFG";
$second = "AEG";

similar_text($first, $second, $percent);
echo $percent; // 60
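The percentage can be checked by hand. The sketch below assumes the percentage is the number of matching characters, counted once for each string, divided by the combined length of both strings; the variable $manual is introduced here only for illustration.

```php
<?php
$first  = "ABCDEFG";
$second = "AEG";

// similar_text() returns the match count and fills $percent by reference.
$matches = similar_text($first, $second, $percent);

// Assumed formula: 2 * matches / (length of first + length of second) * 100.
$manual = $matches * 2 / (strlen($first) + strlen($second)) * 100;

echo $matches, "\n";           // 3
echo round($percent, 2), "\n"; // 60
echo round($manual, 2), "\n";  // 60
```

Here 3 matching characters out of 7 + 3 = 10 total characters gives 2 * 3 / 10 = 60 percent.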

That covers the use and implementation of similar_text(). The function is mainly used to count the matching characters of two strings, or to calculate their similarity as a percentage. The levenshtein() function introduced next is faster than similar_text(). However, similar_text() can give more accurate results with fewer modifications needed. Consider levenshtein() when speed matters more than accuracy and the strings are of limited length.

Instructions for use
First, look at the description of levenshtein() in the manual:

The levenshtein() function returns the Levenshtein distance between two strings.

The Levenshtein distance, also known as the edit distance, is the minimum number of edit operations required to convert one string into the other. The permitted edit operations are replacing one character with another, inserting a character, and deleting a character.

For example, convert kitten to sitting:

sitten (k→s)
sittin (e→i)
sitting (→g)

Three operations are needed, so the distance is 3. By default, levenshtein() gives the same weight to each operation (replace, insert, and delete). However, you can define the cost of each operation by setting the optional insert, replace, and delete parameters.
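The distance and the per-operation costs can be illustrated with a small dynamic-programming sketch. The function name levenshtein_sketch and its parameter names are hypothetical; this is a textbook row-by-row formulation that compares bytes, not PHP's internal implementation.

```php
<?php
// Hypothetical helper: weighted edit distance from $from to $to.
function levenshtein_sketch(string $from, string $to, int $ins = 1, int $rep = 1, int $del = 1): int
{
    $m = strlen($from);
    $n = strlen($to);

    // Row 0: building the first $j bytes of $to from an empty string takes $j insertions.
    $prev = [];
    for ($j = 0; $j <= $n; $j++) {
        $prev[$j] = $j * $ins;
    }

    for ($i = 1; $i <= $m; $i++) {
        // Column 0: reducing the first $i bytes of $from to nothing takes $i deletions.
        $curr = [$i * $del];
        for ($j = 1; $j <= $n; $j++) {
            $same = $from[$i - 1] === $to[$j - 1];
            $curr[$j] = min(
                $prev[$j] + $del,                  // delete a byte of $from
                $curr[$j - 1] + $ins,              // insert a byte of $to
                $prev[$j - 1] + ($same ? 0 : $rep) // keep the byte or replace it
            );
        }
        $prev = $curr;
    }
    return $prev[$n];
}

echo levenshtein_sketch("kitten", "sitting"), "\n"; // 3
```

With the default costs, the result for kitten and sitting is 3, matching the three steps listed above.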

Syntax:

levenshtein(string1, string2, insert, replace, delete)

Parameter description

string1 required. The first string to compare.
string2 required. The second string to compare.
insert optional. The cost of inserting a character. The default is 1.
replace optional. The cost of replacing a character. The default is 1.
delete optional. The cost of deleting a character. The default is 1.

Hints and Notes

• If one of the strings exceeds 255 characters, levenshtein() returns -1. This limit applies to PHP versions before 8.0; it was removed in PHP 8.0.0.
• The levenshtein() function is case-sensitive: it compares the strings byte by byte, so "a" and "A" count as different characters.
• The levenshtein() function is faster than similar_text(). However, similar_text() gives more precise results with fewer modifications needed.
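A quick check of how levenshtein() treats letter case, as a minimal sketch. The helper name levenshtein_ci is hypothetical and not part of PHP; it assumes single-byte (ASCII) text, since strtolower() and levenshtein() both work on bytes.

```php
<?php
// Hypothetical helper: compare two strings while ignoring ASCII letter case.
function levenshtein_ci(string $a, string $b): int
{
    return levenshtein(strtolower($a), strtolower($b));
}

echo levenshtein("Hello", "hello"), "\n";    // 1
echo levenshtein_ci("Hello", "hello"), "\n"; // 0
```

The plain call reports one replacement (H to h), while lowercasing both strings first removes the difference.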

Example

<?php
echo levenshtein("Hello World", "ello World");
echo "\n";
echo levenshtein("Hello World", "ello World", 10, 20, 30);
?>

Output:

1
30
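A common use of levenshtein() is suggesting the closest known word for a misspelled input. The sketch below is one way to do it; the function name closest_word and the word list are made up for illustration.

```php
<?php
// Hypothetical helper: return the candidate with the smallest edit distance,
// or null when the candidate list is empty.
function closest_word(string $input, array $candidates): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;
    foreach ($candidates as $candidate) {
        $distance = levenshtein($input, $candidate);
        if ($distance < $bestDistance) {
            $best = $candidate;
            $bestDistance = $distance;
        }
    }
    return $best;
}

$words = ["apple", "pineapple", "banana", "orange", "radish", "carrot", "pea", "bean", "potato"];

echo closest_word("carrrot", $words), "\n"; // carrot
```

On ties, this sketch keeps the first candidate it encountered; a real application might also reject suggestions whose distance is too large.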

