When we use a search box such as Google's, candidate suggestions appear as we type. In JavaScript, string searching is usually done with indexOf or search, but both are rigid: they only find text that matches exactly (search also accepts a regular expression). For example, the keyword 今天是星期几 ("What day of the week is it today?") cannot retrieve the content 今天是星期五 ("Today is Friday"), even though the two differ by only one character.
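To make the limitation concrete, here is a small sketch using the two strings from the example above. The variable name content is only for illustration.

```javascript
var content = '今天是星期五';

// An exact-match search fails when even a single character differs.
console.log(content.indexOf('今天是星期几')); // -1
console.log(content.search('今天是星期几'));  // -1

// It only succeeds when the keyword appears verbatim in the content.
console.log(content.indexOf('星期五'));       // 3
```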
This article introduces an interesting algorithm, edit distance (Levenshtein distance), and then uses it to implement a simple candidate recommendation (fuzzy search) feature.
Edit distance (Levenshtein Distance)
Put simply, edit distance is the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into another. The fewer edits required, the more similar we can consider the two strings. For example, compare 今天是星期几 with 今天是星期五 and with 明天是星期五 ("Tomorrow is Friday"): it is closer to 今天是星期五, because the edit distance to the former is 1 and to the latter is 2.
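As a rough illustration of the definition only, the sketch below computes the distance by plain recursion. The name naiveEditDistance is made up for this example, and the approach takes exponential time, so it is only suitable for short strings; it is not the table-based method this article builds.

```javascript
// Illustration only: direct recursion on the definition of edit distance.
function naiveEditDistance(a, b) {
  if (a.length === 0) return b.length; // insert everything left in b
  if (b.length === 0) return a.length; // delete everything left in a
  var restA = a.slice(1);
  var restB = b.slice(1);
  // Equal first characters cost nothing; move on to the rest.
  if (a[0] === b[0]) return naiveEditDistance(restA, restB);
  return 1 + Math.min(
    naiveEditDistance(restA, b),     // delete the first character of a
    naiveEditDistance(a, restB),     // insert the first character of b
    naiveEditDistance(restA, restB)  // substitute one character for the other
  );
}

console.log(naiveEditDistance('今天是星期几', '今天是星期五')); // 1
console.log(naiveEditDistance('今天是星期几', '明天是星期五')); // 2
```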
Baidu Encyclopedia already explains the algorithm in detail, so it is not repeated here; this article mainly gives a JavaScript implementation.
Following the natural-language description of the algorithm, we first create a two-dimensional table sized by the lengths of the two strings:
function Levenshtein(a, b) {
  var al = a.length + 1;
  var bl = b.length + 1;
  var result = [];
  var temp = 0;
  // Create the two-dimensional array:
  // the first column holds 0..a.length, the first row holds 0..b.length
  for (var i = 0; i < al; i++) { result[i] = [i]; }
  for (var i = 0; i < bl; i++) { result[0][i] = i; }
}
Next, we traverse the two-dimensional array and set each cell to the minimum of the following three values:
- If the character at the top of the column equals the character at the left of the row, the upper-left number; otherwise, the upper-left number + 1.
- The left number + 1
- The top number + 1
Whether the upper-left number gets + 1 depends on whether the two characters are equal, so we introduce the temp variable. The traversal code can be written as follows:
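To see the three rules at work, the following sketch fills the whole table and returns it for inspection. The name buildTable and the sample words are assumptions for illustration; rows follow the characters of the first string and columns follow the second.

```javascript
// Hypothetical helper: fill the table using the three rules and return it.
function buildTable(a, b) {
  var table = [];
  for (var i = 0; i <= a.length; i++) { table[i] = [i]; }
  for (var j = 0; j <= b.length; j++) { table[0][j] = j; }
  for (i = 1; i <= a.length; i++) {
    for (j = 1; j <= b.length; j++) {
      // Rule 1: upper-left number, + 1 only if the characters differ.
      var upperLeft = table[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      // Rule 2: left number + 1.
      var left = table[i][j - 1] + 1;
      // Rule 3: top number + 1.
      var above = table[i - 1][j] + 1;
      table[i][j] = Math.min(upperLeft, left, above);
    }
  }
  return table;
}

buildTable('cat', 'cut').forEach(function (row) {
  console.log(row.join(' '));
});
// 0 1 2 3
// 1 0 1 2
// 2 1 1 2
// 3 2 2 1
```

The bottom-right cell, 1, is the edit distance between the two sample words.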
for (i = 1; i < al; i++) {
  for (var j = 1; j < bl; j++) {
    // 0 if the two characters are equal, otherwise 1
    temp = a[i - 1] === b[j - 1] ? 0 : 1;
    // result[i - 1][j] + 1         top number
    // result[i][j - 1] + 1         left number
    // result[i - 1][j - 1] + temp  upper-left number
    result[i][j] = Math.min(result[i - 1][j] + 1, result[i][j - 1] + 1, result[i - 1][j - 1] + temp);
  }
}
Finally, we return the last value of the two-dimensional array, which is the edit distance:
// Bottom-right cell; indexing by al and bl also works when a is empty
return result[al - 1][bl - 1];
The complete function:
function Levenshtein(a, b) {
  var al = a.length + 1;
  var bl = b.length + 1;
  var result = [];
  var temp = 0;
  // Create the two-dimensional array
  for (var i = 0; i < al; i++) { result[i] = [i]; }
  for (var i = 0; i < bl; i++) { result[0][i] = i; }
  for (i = 1; i < al; i++) {
    for (var j = 1; j < bl; j++) {
      // 0 if the two characters are equal, otherwise 1
      temp = a[i - 1] === b[j - 1] ? 0 : 1;
      // result[i - 1][j] + 1         top number
      // result[i][j - 1] + 1         left number
      // result[i - 1][j - 1] + temp  upper-left number
      result[i][j] = Math.min(result[i - 1][j] + 1, result[i][j - 1] + 1, result[i - 1][j - 1] + temp);
    }
  }
  // Bottom-right cell of the table is the edit distance
  return result[al - 1][bl - 1];
}
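Each cell only depends on the row above it and the cell to its left, so the full table is not strictly needed. The sketch below is a variant, not the function from this article, that keeps just two rows; the name levenshteinTwoRows is made up for illustration.

```javascript
// Variant sketch: same rules, but only the previous and current rows are stored.
function levenshteinTwoRows(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) { prev[j] = j; }
  for (var i = 1; i <= a.length; i++) {
    var curr = [i]; // first column of the current row
    for (j = 1; j <= b.length; j++) {
      var cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshteinTwoRows('今天是星期几', '今天是星期五')); // 1
console.log(levenshteinTwoRows('今天是星期几', '明天是星期五')); // 2
```

For short keywords the saving is negligible, but it may matter when comparing long strings.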
Practical application
So let's implement a simple search function now.
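One possible shape for such a feature, as a minimal sketch rather than a finished implementation: compute the edit distance from the keyword to every candidate, sort by distance, and keep the closest few. The names editDistance and suggest, the limit parameter, and the sample contents are all assumptions made for this example.

```javascript
// Hypothetical helper: edit distance using the table-filling rules described above.
function editDistance(a, b) {
  var rows = a.length + 1;
  var cols = b.length + 1;
  var table = [];
  for (var i = 0; i < rows; i++) { table[i] = [i]; }
  for (var j = 0; j < cols; j++) { table[0][j] = j; }
  for (i = 1; i < rows; i++) {
    for (j = 1; j < cols; j++) {
      var cost = a[i - 1] === b[j - 1] ? 0 : 1;
      table[i][j] = Math.min(table[i - 1][j] + 1, table[i][j - 1] + 1, table[i - 1][j - 1] + cost);
    }
  }
  return table[rows - 1][cols - 1];
}

// Hypothetical function: return up to `limit` candidates closest to the keyword.
function suggest(keyword, candidates, limit) {
  return candidates
    .map(function (text) {
      return { text: text, distance: editDistance(keyword, text) };
    })
    .sort(function (x, y) { return x.distance - y.distance; })
    .slice(0, limit)
    .map(function (item) { return item.text; });
}

var contents = ['明天是星期五', '天气预报', '今天是星期五'];
var closest = suggest('今天是星期几', contents, 2);
console.log(closest); // [ '今天是星期五', '明天是星期五' ]
```

Computing the distance against every candidate is fine for a small list; a large data set would likely need some pre-filtering first.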
Original address: http://www.yujiangshui.com/javascript-levenshtein-distance/