pinyin4j project website: http://pinyin4j.sourceforge.net/
First download the release and add both the source code and the jar package to the project.
Next, under the demo package, write a test class that uses pinyin4j to sort Chinese characters in natural (pinyin) order.
Create a new file, ConvertTest.java:
package demo;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import net.sourceforge.pinyin4j.PinyinHelper;

public class ConvertTest {

    public static void main(String[] args) {
        // "There's a traitor in our midst."
        String src = "我们中间出了一个叛徒";
        char[] arr = src.toCharArray();
        System.out.println("Array length is: " + arr.length);
        System.out.print("Original order: ");
        for (char temp : arr) {
            System.out.print(temp + " ");
        }
        System.out.println();
        convertToHanyuPinyin(arr);
    }

    private static List<String> convertToHanyuPinyin(char[] array) {
        HashMap<String, String> map = new HashMap<String, String>();
        for (int i = 0; i < array.length; i++) {
            // Get the pinyin initial: first letter of the first reading.
            // toHanyuPinyinStringArray returns null for characters it cannot
            // convert, so this assumes every input character is Chinese.
            String value = PinyinHelper.toHanyuPinyinStringArray(array[i])[0].substring(0, 1);
            map.put(String.valueOf(array[i]), value);
        }
        System.out.println(map);
        List<String> list = sort(map);
        return list;
    }

    private static List<String> sort(Map<String, String> map) {
        List<Map.Entry<String, String>> infoIds =
                new ArrayList<Map.Entry<String, String>>(map.entrySet());
        // Sort the entries of the HashMap by value
        Collections.sort(infoIds, new Comparator<Map.Entry<String, String>>() {
            @Override
            public int compare(Map.Entry<String, String> o1, Map.Entry<String, String> o2) {
                return o1.getValue().compareTo(o2.getValue());
            }
        });
        List<String> list = new ArrayList<String>();
        /** For test */
        List<String> letterList = new ArrayList<String>();
        // Collect the keys (and, for testing, the values) in sorted order
        for (int i = 0; i < infoIds.size(); i++) {
            Map.Entry<String, String> entry = infoIds.get(i);
            list.add(entry.getKey());
            letterList.add(entry.getValue());
        }
        /** For test */
        System.out.print("Natural order: ");
        for (String temp : list) {
            System.out.print(temp + " ");
        }
        System.out.println();
        System.out.print("Alphabetical order: ");
        for (String temp : letterList) {
            System.out.print(temp + " ");
        }
        /** For test */
        return list;
    }
}
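Running it prints the array length, the characters in their original order, the character-to-initial map, and then the characters and their initials after sorting.
You can see that the final output is already sorted in natural order.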
In brief, the steps are:
1. First convert the string into a map in which each key is a single character and each value is the first letter of that character's pinyin,
such as {个=g, 徒=t, 我=w, 出=c, 叛=p, 了=l, 中=z, 一=y, 间=j, 们=m} for the sample sentence 我们中间出了一个叛徒 ("There's a traitor in our midst.").
2. Then sort the map entries by value and return the keys in the sorted order.
(PS: Of course, it is also possible to sort by key, but it is better to sort by value,
because here we only extract the initial letter, not the whole pinyin syllable.)
Disadvantages of this code:
1. It only uses the first pinyin reading of each character, but Mandarin has polyphonic characters (one character with several readings).
2. It only compares the first letter of each reading, not the whole pinyin syllable, so the ordering is not rigorous. It suits scenarios where a rough sort is enough.
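The second disadvantage can be reduced by comparing whole syllables instead of initials. The following is only a minimal sketch under stated assumptions: the class name, the `FIRST_READING` table, and `sortByFullSyllable` are hypothetical, and the hard-coded table stands in for the first element returned by `PinyinHelper.toHanyuPinyinStringArray` so that the example runs without the pinyin4j jar. It still uses only the first reading, so polyphonic characters remain unhandled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FullSyllableSortSketch {

    // Hypothetical stand-in for PinyinHelper.toHanyuPinyinStringArray(ch)[0].
    // All four characters share the initial "z", so a first-letter sort
    // cannot tell them apart.
    private static final Map<Character, String> FIRST_READING = new HashMap<>();
    static {
        FIRST_READING.put('\u5F20', "zhang1"); // U+5F20, zhang
        FIRST_READING.put('\u8D75', "zhao4");  // U+8D75, zhao
        FIRST_READING.put('\u5468', "zhou1");  // U+5468, zhou
        FIRST_READING.put('\u6731', "zhu1");   // U+6731, zhu
    }

    /** Sorts the characters of src by the full first reading, not just its first letter. */
    static List<Character> sortByFullSyllable(String src) {
        // A list (rather than a map keyed by character) keeps repeated characters.
        List<Character> chars = new ArrayList<>();
        for (char c : src.toCharArray()) {
            chars.add(c);
        }
        // Characters missing from the table get an empty key and sort first.
        chars.sort(Comparator.comparing((Character c) -> FIRST_READING.getOrDefault(c, "")));
        return chars;
    }

    public static void main(String[] args) {
        // Input order: zhu, zhou, zhao, zhang
        System.out.println(sortByFullSyllable("\u6731\u5468\u8D75\u5F20"));
    }
}
```

With the input order zhu, zhou, zhao, zhang, the sketch returns the characters in the order zhang, zhao, zhou, zhu, whereas a stable first-letter sort would leave them unchanged.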
The following is a simple analysis of the pinyin4j conversion process.
In the example above, the core class is PinyinHelper. It can convert to several romanization systems; here we only look at Hanyu Pinyin, and the others are similar.
Trace into PinyinHelper.toHanyuPinyinStringArray
(Ctrl + left-click the method name in the IDE).
public static String[] toHanyuPinyinStringArray(char ch) {
    return getUnformattedHanyuPinyinStringArray(ch);
}
Continue tracing the code:
static String[] getUnformattedHanyuPinyinStringArray(char ch) {
    return ChineseToPinyinResource.getInstance().getHanyuPinyinStringArray(ch);
}
This calls the getHanyuPinyinStringArray method of the ChineseToPinyinResource singleton instance:
String[] getHanyuPinyinStringArray(char ch) {
    String pinyinRecord = getHanyuPinyinRecordFromChar(ch);
    if (null != pinyinRecord) {
        // Get the index of the opening parenthesis "("
        int indexOfLeftBracket = pinyinRecord.indexOf(Field.LEFT_BRACKET);
        // Get the index of the closing parenthesis ")"
        int indexOfRightBracket = pinyinRecord.lastIndexOf(Field.RIGHT_BRACKET);
        // Get the pinyin text between the parentheses
        String stripedString = pinyinRecord.substring(
                indexOfLeftBracket + Field.LEFT_BRACKET.length(), indexOfRightBracket);
        // Split on commas and return the readings as a String[] array
        return stripedString.split(Field.COMMA);
    } else
        return null; // No record found or mal-formatted record
}
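The parsing step above can be tried in isolation. This is a minimal sketch, not the library's code: the class and method names are hypothetical, and it assumes that a record looks like `(zhong1,zhong4)`, that is, that `Field.LEFT_BRACKET`, `Field.RIGHT_BRACKET` and `Field.COMMA` are `"("`, `")"` and `","`. The extra bounds check is an addition for safety when the method is called on arbitrary strings.

```java
import java.util.Arrays;

class PinyinRecordParseSketch {

    /** Splits a record such as "(zhong1,zhong4)" into its readings, or returns null. */
    static String[] parseRecord(String record) {
        if (record == null) {
            return null;
        }
        int left = record.indexOf("(");
        int right = record.lastIndexOf(")");
        if (left < 0 || right <= left) {
            return null; // malformed record (check added for this sketch)
        }
        // Keep only the text between the parentheses, then split on commas.
        String stripped = record.substring(left + "(".length(), right);
        return stripped.split(",");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseRecord("(zhong1,zhong4)"))); // [zhong1, zhong4]
        System.out.println(Arrays.toString(parseRecord("zhong1")));          // null
    }
}
```

A polyphonic character therefore yields an array with more than one element, which is why the sorting example takes element `[0]` and ignores the other readings.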
The key method is getHanyuPinyinRecordFromChar:
private String getHanyuPinyinRecordFromChar(char ch) {
    // Convert the character to its Unicode code point
    int codePointOfChar = ch;
    // Format the code point as an uppercase hex string
    String codepointHexStr = Integer.toHexString(codePointOfChar).toUpperCase();
    // Look the character up in the table (fetch from the hashtable)
    String foundRecord = getUnicodeToHanyuPinyinTable().getProperty(codepointHexStr);
    // Return the record if it is valid, otherwise return null
    return isValidRecord(foundRecord) ? foundRecord : null;
}
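The lookup can be sketched with `java.util.Properties`, which is what the `getProperty` call suggests the table is. The class name, `SAMPLE_TABLE`, and `lookupRecord` are hypothetical, and the single entry and its `KEY (readings)` layout are assumptions for illustration, not a copy of the library's resource file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class PinyinTableLookupSketch {

    // Assumed layout: uppercase hex code point, whitespace, readings in parentheses.
    private static final String SAMPLE_TABLE = "4E2D (zhong1,zhong4)\n";

    private static final Properties TABLE = loadTable();

    private static Properties loadTable() {
        Properties table = new Properties();
        try {
            // Properties treats the first whitespace as the key/value separator.
            table.load(new StringReader(SAMPLE_TABLE));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return table;
    }

    /** Returns the raw record for ch, or null if the sample table has no entry. */
    static String lookupRecord(char ch) {
        int codePoint = ch;
        String hexKey = Integer.toHexString(codePoint).toUpperCase();
        return TABLE.getProperty(hexKey);
    }

    public static void main(String[] args) {
        System.out.println(lookupRecord('\u4E2D')); // (zhong1,zhong4)
        System.out.println(lookupRecord('x'));      // null
    }
}
```

A character that is not in the table, such as a Latin letter, produces null here, which matches the null return in the traced method and explains why the sorting example would fail on non-Chinese input.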
The table queried here is loaded from a resource file shipped with pinyin4j.
Reposted from: http://www.cnblogs.com/sphere/p/4738888.html
Original title: Brief analysis of the pinyin4j source code: simple use of pinyin4j to sort Chinese characters naturally