Favorite Algorithms - Levenshtein Distance

Source: Internet
Author: User

String Matching: Levenshtein Distance

    • Purpose: to find the smallest number of single-character edits needed to convert one string into the other
    • Intuition behind the method: each edit is the replacement, addition or deletion of one character in a string
    • Steps

Step  Description

1     Set n to be the length of s.
      Set m to be the length of t.
      If n = 0, return m and exit.
      If m = 0, return n and exit.
      Construct a matrix containing 0..m rows and 0..n columns.

2     Initialize the first row to 0..n.
      Initialize the first column to 0..m.

3     Examine each character of s (i from 1 to n).

4     Examine each character of t (j from 1 to m).

5     If s[i] equals t[j], the cost is 0.
      If s[i] doesn't equal t[j], the cost is 1.

6     Set cell d[i,j] of the matrix equal to the minimum of:
      a. The cell immediately above plus 1: d[i-1,j] + 1.
      b. The cell immediately to the left plus 1: d[i,j-1] + 1.
      c. The cell diagonally above and to the left plus the cost: d[i-1,j-1] + cost.

7     After the iteration steps (3, 4, 5, 6) are complete, the distance is found in cell d[n,m].
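The steps above can be sketched in Python as follows. This is a minimal sketch rather than a reference implementation: the function name `levenshtein` is an assumption made for illustration, and the sketch indexes the matrix as d[i][j] with i running over s and j over t, so it allocates n + 1 rows and m + 1 columns. Python strings are 0-based, so s[i] in the steps becomes s[i - 1] in the code.

```python
def levenshtein(s: str, t: str) -> int:
    """Return the Levenshtein distance between s and t (illustrative sketch)."""
    n, m = len(s), len(t)

    # Step 1: if either string is empty, the distance is the other's length.
    if n == 0:
        return m
    if m == 0:
        return n

    # Steps 1-2: build the matrix and fill the first row and first column.
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j

    # Steps 3-4: examine every character of s against every character of t.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Step 5: cost is 0 when the characters match, otherwise 1.
            cost = 0 if s[i - 1] == t[j - 1] else 1
            # Step 6: take the cheapest of the three neighbouring cells.
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a character of s
                d[i][j - 1] + 1,         # insert a character of t
                d[i - 1][j - 1] + cost,  # substitute, or keep a match
            )

    # Step 7: the distance is in the last cell.
    return d[n][m]


print(levenshtein("GUMBO", "GAMBOL"))  # prints 2
```

The full matrix costs (n + 1) * (m + 1) cells of memory; that is fine for short strings like the ones in this article.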

    • Example

This section shows how the Levenshtein distance is computed when the source string is "GUMBO" and the target string is "GAMBOL".

Steps 1 and 2
    G U M B O
  0 1 2 3 4 5
G 1
A 2
M 3
B 4
O 5
L 6
Steps 3 to 6 when i = 1
    G U M B O
  0 1 2 3 4 5
G 1 0
A 2 1
M 3 2
B 4 3
O 5 4
L 6 5
Steps 3 to 6 when i = 2
    G U M B O
  0 1 2 3 4 5
G 1 0 1
A 2 1 1
M 3 2 2
B 4 3 3
O 5 4 4
L 6 5 5
Steps 3 to 6 when i = 3
    G U M B O
  0 1 2 3 4 5
G 1 0 1 2
A 2 1 1 2
M 3 2 2 1
B 4 3 3 2
O 5 4 4 3
L 6 5 5 4
Steps 3 to 6 when i = 4
    G U M B O
  0 1 2 3 4 5
G 1 0 1 2 3
A 2 1 1 2 3
M 3 2 2 1 2
B 4 3 3 2 1
O 5 4 4 3 2
L 6 5 5 4 3
Steps 3 to 6 when i = 5
    G U M B O
  0 1 2 3 4 5
G 1 0 1 2 3 4
A 2 1 1 2 3 4
M 3 2 2 1 2 3
B 4 3 3 2 1 2
O 5 4 4 3 2 1
L 6 5 5 4 3 2
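The completed matrix above can be reproduced with a short Python sketch. The names `levenshtein_matrix` and `format_matrix` are hypothetical helpers introduced here for illustration. To match the tables, the sketch stores one row per character of the target string and one column per character of the source string, and fills the matrix column by column (i = 1, 2, ...) as the walk-through does. The formatter assumes every value is a single digit, which holds for this example.

```python
def levenshtein_matrix(s: str, t: str) -> list:
    """Return the full matrix: rows follow t, columns follow s (illustrative sketch)."""
    n, m = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    d[0] = list(range(n + 1))      # first row: 0..n
    for j in range(m + 1):
        d[j][0] = j                # first column: 0..m

    for i in range(1, n + 1):      # one column per character of s
        for j in range(1, m + 1):  # one row per character of t
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[j][i] = min(
                d[j - 1][i] + 1,         # cell immediately above
                d[j][i - 1] + 1,         # cell immediately to the left
                d[j - 1][i - 1] + cost,  # cell diagonally above and to the left
            )
    return d


def format_matrix(s: str, t: str, d: list) -> str:
    """Lay the matrix out like the tables in this example (single-digit values)."""
    lines = ["    " + " ".join(s)]
    for label, row in zip(" " + t, d):
        lines.append(label + " " + " ".join(str(v) for v in row))
    return "\n".join(lines)


matrix = levenshtein_matrix("GUMBO", "GAMBOL")
print(format_matrix("GUMBO", "GAMBOL", matrix))
```

Stopping the outer loop early (for example after i = 3) leaves the remaining columns at their initial zeros, so the intermediate tables show only the columns filled so far.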
Step 7

The distance is in the lower right hand corner of the matrix, i.e. 2. This corresponds to our intuitive realization that "GUMBO" can be transformed into "GAMBOL" by substituting "A" for "U" and adding "L" (one substitution and one insertion = 2 changes).
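The two changes named above can also be read back out of the matrix by walking from the last cell to the first and noting which neighbour produced each value. The sketch below is one possible way to do that, not part of the original method description: the function name `edit_operations` and the wording of the returned strings are assumptions, and when several minimal edit sequences exist it returns just one of them.

```python
def edit_operations(s: str, t: str) -> list:
    """Return one minimal list of edits that turns s into t (illustrative sketch)."""
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)

    # Walk back from d[n][m] to d[0][0], preferring the diagonal move.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if s[i - 1] == t[j - 1] else 1
            if d[i][j] == d[i - 1][j - 1] + cost:
                if cost:
                    ops.append(f"substitute {t[j - 1]!r} for {s[i - 1]!r}")
                i, j = i - 1, j - 1
                continue
        if j > 0 and d[i][j] == d[i][j - 1] + 1:
            ops.append(f"insert {t[j - 1]!r}")
            j -= 1
        else:
            ops.append(f"delete {s[i - 1]!r}")
            i -= 1

    ops.reverse()  # edits were collected from the end of the strings backwards
    return ops


print(edit_operations("GUMBO", "GAMBOL"))
# prints ["substitute 'A' for 'U'", "insert 'L'"]
```

The number of operations returned always equals the value in the last cell, since matching characters add nothing to the list.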
