Algorithm design background:
Recently, the resource import function of a knowledge management system was designed to be as componentized as possible, easy to extend, and convenient for other modules to use. The interface exposed by the component was simplified, and an import framework based on a mapping mechanism was designed and implemented. One of its functions needs an algorithm that computes the similarity of two strings. A simple design follows:
Design idea:
The basic operations that turn two strings into the same string are defined as follows:
1. Modify one character (e.g. change A to B)
2. Add one character (e.g. Abed becomes Abedd)
3. Delete one character (e.g. Jackbllog becomes Jackblog)
To turn Jackbllog into Jackblog, it is enough to delete one l (or, going the other way, to add one l to Jackblog), and the two strings become the same. The minimum number of such operations is defined as the distance L between the two strings, and the similarity is defined as 1/(L+1), i.e. the reciprocal of the distance plus one. So the similarity between Jackbllog and Jackblog is 1/(1+1) = 1/2 = 0.5. A similarity of 0.5 means the two strings are already very close.
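As a small illustration of the definition above, the following sketch applies one delete operation and then maps a distance to a similarity. The class name SimilarityFormulaDemo and the helper similarityFromDistance are hypothetical names introduced only for this example; they are not part of the import framework.

```java
public class SimilarityFormulaDemo {

    // Hypothetical helper: maps a distance L to the similarity 1 / (L + 1).
    static float similarityFromDistance(int distance) {
        return 1f / (distance + 1);
    }

    // Hypothetical helper: the "delete one character" operation.
    static String deleteAt(String s, int index) {
        return new StringBuilder(s).deleteCharAt(index).toString();
    }

    public static void main(String[] args) {
        // Deleting the character at index 5 (the first of the two l's)
        // turns "Jackbllog" into "Jackblog", so the distance is 1.
        System.out.println(deleteAt("Jackbllog", 5));
        // Distance 1 gives 1 / (1 + 1).
        System.out.println(similarityFromDistance(1));
        // Identical strings have distance 0, which gives the maximum similarity.
        System.out.println(similarityFromDistance(0));
    }
}
```

With this definition the similarity always lies in (0, 1], and it drops quickly as the distance grows: a distance of 3 already gives 0.25.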
The distance between any two strings is finite and never exceeds the sum of their lengths, since in the worst case every character of one string is deleted and every character of the other is added. We also do not care what the two identical strings look like after the series of modifications. So it is enough to take one step at a time and compute the rest recursively. The Java implementation is as follows:
package org.blogjava.arithmetic;

import java.util.HashMap;
import java.util.Map;

/**
 * @author Jack.wang
 *
 */
public class StringDistance {

    // Memo of sub-problem results, keyed by the two start offsets.
    // similarity() clears it before each new pair of strings.
    public static final Map<String, String> DISTANCE_CACHE = new HashMap<String, String>();

    // Distance between firstStr[firstBegin..firstEnd] and
    // secondStr[secondBegin..secondEnd]; both end indexes are inclusive.
    private static int caculateStringDistance(byte[] firstStr, int firstBegin,
            int firstEnd, byte[] secondStr, int secondBegin, int secondEnd) {
        String key = makeKey(firstBegin, secondBegin);
        if (DISTANCE_CACHE.get(key) != null) {
            return Integer.parseInt(DISTANCE_CACHE.get(key));
        } else {
            // First string used up: add whatever is left of the second.
            if (firstBegin > firstEnd) {
                if (secondBegin > secondEnd) {
                    return 0;
                } else {
                    return secondEnd - secondBegin + 1;
                }
            }
            // Second string used up: delete whatever is left of the first.
            if (secondBegin > secondEnd) {
                return firstEnd - firstBegin + 1;
            }
            if (firstStr[firstBegin] == secondStr[secondBegin]) {
                // Same character: no operation needed, move on in both strings.
                return caculateStringDistance(firstStr, firstBegin + 1,
                        firstEnd, secondStr, secondBegin + 1, secondEnd);
            } else {
                // Add a character to the first string.
                int oneValue = caculateStringDistance(firstStr, firstBegin,
                        firstEnd, secondStr, secondBegin + 1, secondEnd);
                // Delete a character from the first string.
                int twoValue = caculateStringDistance(firstStr, firstBegin + 1,
                        firstEnd, secondStr, secondBegin, secondEnd);
                // Modify a character.
                int threeValue = caculateStringDistance(firstStr,
                        firstBegin + 1, firstEnd, secondStr, secondBegin + 1,
                        secondEnd);
                int distance = min(oneValue, twoValue, threeValue) + 1;
                DISTANCE_CACHE.put(key, String.valueOf(distance));
                return distance;
            }
        }
    }

    public static float similarity(String stringOne, String stringTwo) {
        // Cached offsets only make sense for one pair of strings.
        DISTANCE_CACHE.clear();
        // getBytes() uses the platform default charset, so the comparison
        // is byte by byte.
        byte[] first = stringOne.getBytes();
        byte[] second = stringTwo.getBytes();
        return 1f / (caculateStringDistance(first, 0, first.length - 1,
                second, 0, second.length - 1) + 1);
    }

    private static int min(int oneValue, int twoValue, int threeValue) {
        return Math.min(oneValue, Math.min(twoValue, threeValue));
    }

    // The separator keeps offsets such as (1, 23) and (12, 3) apart.
    private static String makeKey(int firstBegin, int secondBegin) {
        StringBuffer sb = new StringBuffer();
        return sb.append(firstBegin).append(',').append(secondBegin).toString();
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        float i = StringDistance.similarity("Jacklovvedyou", "jacklodveyou");
        System.out.println(i);
    }
}
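The recursive version above can also be written without recursion or a cache by filling a table from the bottom up, a technique often called the Wagner-Fischer algorithm. The sketch below is an alternative under stated assumptions, not the implementation used in the import framework: the class name IterativeStringDistance is hypothetical, and it compares char values instead of bytes.

```java
public class IterativeStringDistance {

    // dist[i][j] holds the distance between the first i characters of a
    // and the first j characters of b.
    static int distance(String a, String b) {
        int[][] dist = new int[a.length() + 1][b.length() + 1];
        // Turning a prefix into the empty string takes one delete per character.
        for (int i = 0; i <= a.length(); i++) {
            dist[i][0] = i;
        }
        // Turning the empty string into a prefix takes one add per character.
        for (int j = 0; j <= b.length(); j++) {
            dist[0][j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // Same character: no operation needed.
                    dist[i][j] = dist[i - 1][j - 1];
                } else {
                    // One operation plus the cheapest of modify, delete, add.
                    dist[i][j] = 1 + Math.min(dist[i - 1][j - 1],
                            Math.min(dist[i - 1][j], dist[i][j - 1]));
                }
            }
        }
        return dist[a.length()][b.length()];
    }

    // Same similarity definition as in the text: 1 / (distance + 1).
    static float similarity(String a, String b) {
        return 1f / (distance(a, b) + 1);
    }

    public static void main(String[] args) {
        System.out.println(distance("Jackbllog", "Jackblog"));
        System.out.println(similarity("Jackbllog", "Jackblog"));
    }
}
```

The table needs time and memory proportional to the product of the two lengths, and it avoids both the deep recursion and the shared static cache of the recursive version.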