The language package of Apache Commons Codec is a powerful one, used mainly for phonetic processing of text in various languages. Its support for Chinese characters, however, is very poor. There is very little material about this package online, so for now I can only write some fairly superficial code; if I get the chance to work with it more, I will improve on this.
Let's start with some examples of Soundex codes.
Examples of Soundex Coding |
Name | Letters Coded | Code |
Allricht | l, r, c | A-462 |
Eberhard | b, r, r | E-166 |
Engebrethson | n, g, b | E-521 |
Heimbach | m, b, c | H-512 |
Hanselmann | n, s, l | H-524 |
Henzelmann | n, z, l | H-524 |
Hildebrand | l, d, b | H-431 |
Kavanagh | v, n, g | K-152 |
Lind, Van | n, d | L-530 |
Lukaschowsky | k, s, s | L-222 |
McDonnell | c, d, n | M-235 |
McGee | c | M-200 |
O'Brien | b, r, n | O-165 |
Opnian | p, n, n | O-155 |
Oppenheimer | p, n, m | O-155 |
Swhgler | s, l, r | S-460 |
Riedemanas | d, m, n | R-355 |
Zita | t | Z-300 |
Zitzmeinn | t, z, m | Z-325 |
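The codes in the table can be reproduced with a few lines of plain Java. The sketch below is a minimal, standard-library-only take on the classic American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent letters that share a digit, treat H and W as transparent, and pad with zeros to four characters. It is not the Commons Codec implementation; the class name `SoundexSketch` and its `encode` method are hypothetical names chosen for illustration.

```java
// A minimal sketch of classic Soundex; names here are illustrative, not a library API.
class SoundexSketch {
    // Code for each letter A..Z: digits for consonants, '0' for vowels and Y,
    // '-' for H and W, which are skipped without breaking a run of equal digits.
    private static final String CODES = "0123012-02245501262301-202";

    static String encode(String name) {
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (char raw : name.toCharArray()) {
            char c = Character.toUpperCase(raw);
            if (c < 'A' || c > 'Z') {
                continue; // ignore apostrophes, spaces and anything outside A-Z
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);   // the first letter is kept as-is
                last = code;     // but its digit still suppresses a following duplicate
                continue;
            }
            if (code == '-') {
                continue;        // H and W are transparent
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code;         // a vowel resets the run, so a repeated digit is coded again
            if (out.length() == 4) {
                break;
            }
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0');     // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Allricht"));     // A462
        System.out.println(encode("Hanselmann"));   // H524
        System.out.println(encode("Henzelmann"));   // H524
        System.out.println(encode("Lukaschowsky")); // L222
        System.out.println(encode("O'Brien"));      // O165
    }
}
```

The sketch prints the code without the hyphen that the table uses (A462 rather than A-462). It also encodes the whole input, so a compound entry such as "Lind, Van" would need to be reduced to "Lind" first to match the table.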
In English-language applications, especially those involving speech, many words sound alike. The classes in this package can be used to compare such words, as follows:
package test.ffm83.commons.codec;

import org.apache.commons.codec.language.RefinedSoundex;
import org.apache.commons.lang.StringUtils;

/**
 * Uses the language package of Apache Commons Codec to measure how similar two strings sound.
 * difference() returns a value between 0 and the length of the shorter encoding:
 * 0 indicates little or no similarity, and 4 or more means the strings may well sound alike.
 * @author Fan Fangming
 */
public class EasyLanguageDiff {
    private RefinedSoundex stringEncoder = this.createStringEncoder();

    public static void main(String[] args) throws Exception {
        EasyLanguageDiff diff = new EasyLanguageDiff();
        diff.getDifference();
        diff.getEncode();
    }

    protected RefinedSoundex createStringEncoder() {
        return new RefinedSoundex();
    }

    public RefinedSoundex getStringEncoder() {
        return this.stringEncoder;
    }

    // Prints how many characters the Refined Soundex encodings of each pair share.
    public void getDifference() throws Exception {
        // The banner width of 50 is an assumption inferred from the printed output.
        System.out.println(StringUtils.center(" String Differences ", 50, "-"));
        System.out.println(this.getStringEncoder().difference(null, null));
        System.out.println(this.getStringEncoder().difference("", ""));
        System.out.println(this.getStringEncoder().difference("", ""));
        System.out.println(this.getStringEncoder().difference("Margaret", "Andrew"));
        System.out.println(this.getStringEncoder().difference("Smith", "Smythe"));
        System.out.println(this.getStringEncoder().difference("Ann", "Andrew"));
        System.out.println(this.getStringEncoder().difference("Green", "Greene"));
        System.out.println(this.getStringEncoder().difference("Smithers", "Smythers"));
        System.out.println();
    }

    // Prints the Refined Soundex encoding of a few sample words.
    public void getEncode() throws Exception {
        System.out.println(StringUtils.center(" Language Encoding ", 50, "-"));
        System.out.println(this.getStringEncoder().encode("testing"));
        System.out.println(this.getStringEncoder().encode("testing"));
        System.out.println(this.getStringEncoder().encode("Dogs"));
        System.out.println(RefinedSoundex.US_ENGLISH.encode("Dogs"));
        // Chinese characters are not supported at the moment and cause an exception,
        // so the call below (originally made with a Chinese string) is commented out.
        // System.out.println(this.getStringEncoder().encode("Fan Fangming"));
        System.out.println();
    }
}
Running the program produces the following output:
--------------- String Differences ---------------
0
0
0
1
6
3
5
8
--------------- Language Encoding ----------------
T6036084
T6036084
D6043
D6043
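To see where these numbers come from, it helps to reimplement the two operations in plain Java. The sketch below is an approximation written against the standard library only, under two assumptions: that Refined Soundex uses the US English digit mapping shown in the code, and that the difference is the count of positions at which the two encodings hold the same character. The class name `RefinedSoundexSketch` and its methods are hypothetical; unlike the library, this sketch simply skips anything outside A-Z.

```java
// A minimal sketch of Refined Soundex encoding and difference; not the library code.
class RefinedSoundexSketch {
    // Refined Soundex digit for each letter A..Z (US English mapping, assumed here).
    private static final String MAPPING = "01360240043788015936020505";

    static String encode(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder();
        char last = '*'; // sentinel that matches no digit
        for (char raw : text.toCharArray()) {
            char c = Character.toUpperCase(raw);
            if (c < 'A' || c > 'Z') {
                continue; // this sketch ignores everything outside A-Z
            }
            if (out.length() == 0) {
                out.append(c); // the first letter is kept, and is then coded as well
            }
            char code = MAPPING.charAt(c - 'A');
            if (code != last) {
                out.append(code); // only consecutive duplicates are dropped; vowels stay as 0
            }
            last = code;
        }
        return out.toString();
    }

    static int difference(String a, String b) {
        String ea = encode(a);
        String eb = encode(b);
        if (ea == null || eb == null) {
            return 0;
        }
        int shortest = Math.min(ea.length(), eb.length());
        int same = 0;
        for (int i = 0; i < shortest; i++) {
            if (ea.charAt(i) == eb.charAt(i)) {
                same++; // count positions where both encodings agree
            }
        }
        return same;
    }

    public static void main(String[] args) {
        System.out.println(encode("testing"));             // T6036084
        System.out.println(encode("Dogs"));                // D6043
        System.out.println(difference("Smith", "Smythe")); // 6
        System.out.println(difference("Ann", "Andrew"));   // 3
    }
}
```

"Smith" and "Smythe" both encode to S38060, so all six positions match. "Ann" encodes to A08 and "Andrew" to A08690, so only the three characters of the shorter encoding can match, which is why the result can never exceed the length of the shorter encoding.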