pinyin4j project website: http://pinyin4j.sourceforge.net/
First download the release, then add both the jar package and the source code to the project.
Next, under the demo package, we write a test class that simply uses pinyin4j to sort the Chinese characters naturally.
Create a new file, ConvertTest.java:
package demo;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import net.sourceforge.pinyin4j.PinyinHelper;

public class ConvertTest {

    public static void main(String[] args) {
        String src = "我们中间出了一个叛徒"; // "There is a traitor among us."
        char[] arr = src.toCharArray();
        System.out.println("Array length is: " + arr.length);
        System.out.print("Original order: ");
        for (char temp : arr) {
            System.out.print(temp + " ");
        }
        System.out.println();
        convertToHanyuPinyin(arr);
    }

    private static List<String> convertToHanyuPinyin(char[] array) {
        HashMap<String, String> map = new HashMap<String, String>();
        for (int i = 0; i < array.length; i++) {
            // get the pinyin initial (first letter of the first reading)
            String value = (PinyinHelper.toHanyuPinyinStringArray(array[i]))[0].substring(0, 1);
            map.put(String.valueOf(array[i]), value);
        }
        System.out.println(map);
        List<String> list = sort(map);
        return list;
    }

    private static List<String> sort(Map<String, String> map) {
        List<Map.Entry<String, String>> infoIds =
                new ArrayList<Map.Entry<String, String>>(map.entrySet());
        // sort the entries of the HashMap by value
        Collections.sort(infoIds, new Comparator<Map.Entry<String, String>>() {
            @Override
            public int compare(Map.Entry<String, String> o1, Map.Entry<String, String> o2) {
                return (o1.getValue()).compareTo(o2.getValue());
            }
        });
        List<String> list = new ArrayList<String>();
        /*****************for test***********************/
        List<String> letterList = new ArrayList<String>();
        /*****************for test***********************/
        // collect the keys (and, for testing, the values) in sorted order
        for (int i = 0; i < infoIds.size(); i++) {
            Map.Entry<String, String> entry = infoIds.get(i);
            list.add(entry.getKey());
            letterList.add(entry.getValue());
        }
        /*****************for test***********************/
        System.out.print("Natural Order: ");
        for (String string : list) {
            System.out.print(string + " ");
        }
        System.out.println();
        System.out.print("Alphabetical order: ");
        for (String string : letterList) {
            System.out.print(string + " ");
        }
        /*****************for test***********************/
        return list;
    }
}
In the output you can see that the final order is sorted naturally, that is, by pinyin initial.
In brief, the steps are:
1. First convert the string into a map whose keys are the individual characters and whose values are their pinyin initials,
such as {个=g, 徒=t, 我=w, 出=c, 叛=p, 了=l, 中=z, 一=y, 间=j, 们=m}.
2. Then sort the map entries by value and return the keys in that sorted order.
(PS: Of course, it is also possible to sort on the keys, but it is better to sort on the values, because here we take only the initial, not the whole pinyin syllable.)
Drawbacks of the code:
1. Only the first pinyin reading of each character is used, but Mandarin has polyphonic characters (one character, several readings).
2. Only the first letter of the pinyin is compared, not the whole syllable. This is not rigorous and suits only scenarios where a rough sort is enough.
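As a rough illustration of how the second drawback could be reduced, the sketch below compares whole syllables instead of initials. It does not call pinyin4j: the class name FullSyllableSort and the three-entry READINGS table are hypothetical stand-ins for the first reading that pinyin4j would return, so that the example runs on its own. Characters are written as Unicode escapes to avoid source-encoding problems. It does nothing about polyphonic characters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FullSyllableSort {

    // Hypothetical stand-in for pinyin4j: the first reading of each character.
    static final Map<Character, String> READINGS = new HashMap<>();
    static {
        READINGS.put('\u4E2D', "zhong1"); // "middle"
        READINGS.put('\u5728', "zai4");   // "at"
        READINGS.put('\u6211', "wo3");    // "I"
    }

    // Sorts the characters of src by their whole pinyin syllable (tone digit included).
    // Characters missing from the table compare as the empty string and come first.
    static List<Character> sortBySyllable(String src) {
        List<Character> chars = new ArrayList<>();
        for (char c : src.toCharArray()) {
            chars.add(c);
        }
        chars.sort(Comparator.comparing((Character c) -> READINGS.getOrDefault(c, "")));
        return chars;
    }

    public static void main(String[] args) {
        // "zhong1" and "zai4" share the initial z, so a first-letter sort cannot
        // order them; the whole-syllable comparison puts "zai4" before "zhong1".
        System.out.println(sortBySyllable("\u4E2D\u6211\u5728"));
    }
}
```

With the real library, the table lookup would be replaced by the first element of PinyinHelper.toHanyuPinyinStringArray, keeping the whole string rather than substring(0, 1).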
The following is a simple analysis of the pinyin4j conversion process.
The core class is PinyinHelper. It can convert to several romanization systems; here we look only at Hanyu Pinyin, since the others work similarly.
Trace into PinyinHelper.toHanyuPinyinStringArray
(hold Ctrl and left-click the method name to jump to its source).
static public String[] toHanyuPinyinStringArray(char ch) {
    return getUnformattedHanyuPinyinStringArray(ch);
}
Continue tracing the code:
private static String[] getUnformattedHanyuPinyinStringArray(char ch) {
    return ChineseToPinyinResource.getInstance().getHanyuPinyinStringArray(ch);
}
This calls the getHanyuPinyinStringArray method of the ChineseToPinyinResource instance:
String[] getHanyuPinyinStringArray(char ch) {
    String pinyinRecord = getHanyuPinyinRecordFromChar(ch);
    if (null != pinyinRecord) {
        // index of the left bracket "("
        int indexOfLeftBracket = pinyinRecord.indexOf(Field.LEFT_BRACKET);
        // index of the right bracket ")"
        int indexOfRightBracket = pinyinRecord.lastIndexOf(Field.RIGHT_BRACKET);
        // the pinyin text between the brackets
        String stripedString = pinyinRecord.substring(
                indexOfLeftBracket + Field.LEFT_BRACKET.length(), indexOfRightBracket);
        // split on commas and return the readings as a String[]
        return stripedString.split(Field.COMMA);
    } else
        return null; // no record found or mal-formatted record
}
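To make the bracket-stripping step concrete, here is a minimal standalone sketch that does not depend on pinyin4j. The class name PinyinRecordParser, the sample record "(zhong1,zhong4)", and the constant values standing in for Field.LEFT_BRACKET, Field.RIGHT_BRACKET and Field.COMMA are assumptions made for illustration. The guard against missing brackets is an addition that the traced method does not have.

```java
import java.util.Arrays;

class PinyinRecordParser {

    // Assumed values of Field.LEFT_BRACKET, Field.RIGHT_BRACKET and Field.COMMA.
    static final String LEFT_BRACKET = "(";
    static final String RIGHT_BRACKET = ")";
    static final String COMMA = ",";

    // Returns the readings listed between the brackets, or null if there are none.
    static String[] parse(String pinyinRecord) {
        if (pinyinRecord == null) {
            return null;
        }
        int left = pinyinRecord.indexOf(LEFT_BRACKET);
        int right = pinyinRecord.lastIndexOf(RIGHT_BRACKET);
        if (left < 0 || right <= left) {
            return null; // extra guard for this sketch: brackets missing or misplaced
        }
        String stripped = pinyinRecord.substring(left + LEFT_BRACKET.length(), right);
        return stripped.split(COMMA);
    }

    public static void main(String[] args) {
        // prints [zhong1, zhong4]
        System.out.println(Arrays.toString(parse("(zhong1,zhong4)")));
    }
}
```

This also shows why the test class takes element [0]: a polyphonic character yields several readings, and only the first one is used.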
The key method is getHanyuPinyinRecordFromChar:
private String getHanyuPinyinRecordFromChar(char ch) {
    int codePointOfChar = ch;
    // convert the Unicode code point to an uppercase hexadecimal string
    String codepointHexStr = Integer.toHexString(codePointOfChar).toUpperCase();
    // look the character up in the table
    // fetch from hashtable
    String foundRecord = getUnicodeToHanyuPinyinTable().getProperty(codepointHexStr);
    // return the record if it is valid, otherwise return null
    return isValidRecord(foundRecord) ? foundRecord : null;
}
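The lookup can be tried in isolation with the sketch below, which uses java.util.Properties because the traced code calls getProperty on the table. Everything else is an assumption for illustration: the class name PinyinTableLookup, the two-line sample table, and the simplified isValidRecord check only mirror the shape of the code above. The real table and validity check live inside pinyin4j.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PinyinTableLookup {

    // Loads a table whose keys are uppercase hex code points and whose
    // values are bracketed pinyin records (sample format assumed).
    static Properties loadTable(String text) {
        Properties table = new Properties();
        try {
            table.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return table;
    }

    // Simplified stand-in for the library's validity check (assumption).
    static boolean isValidRecord(String record) {
        return record != null && record.startsWith("(") && record.endsWith(")");
    }

    // Mirrors the traced method: char -> uppercase hex key -> table lookup.
    static String recordFor(Properties table, char ch) {
        String key = Integer.toHexString(ch).toUpperCase();
        String found = table.getProperty(key);
        return isValidRecord(found) ? found : null;
    }

    public static void main(String[] args) {
        Properties table = loadTable("4E2D (zhong1,zhong4)\n6211 (wo3)\n");
        System.out.println(recordFor(table, '\u4E2D')); // code point 4E2D is in the sample table
        System.out.println(recordFor(table, 'A'));      // key "41" is absent, so null
    }
}
```

A character with no entry, such as a Latin letter, yields null here. That matches the null return seen in the traced code and explains why the test class only works on strings made up of Chinese characters.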
The table being queried is loaded from a resource file bundled with the library.