Unicode Encoding Rules
A Unicode escape represents each character as a 4-digit hexadecimal number. The specific rule is: take the high 8 bits and the low 8 bits of a character (char) separately and convert each to a hexadecimal string; if a resulting string is shorter than 2 characters, pad it with a leading 0; then concatenate the high and low hexadecimal strings and prefix the result with "\u".
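The rule above can be sketched for a single character as follows. This is only a minimal illustration; the class name UnicodeRuleSketch and the helper names escape and pad are hypothetical and are not part of the original tool.

```java
// Minimal sketch of the high-byte/low-byte rule for one char.
// All names here are hypothetical, chosen only for illustration.
final class UnicodeRuleSketch {

    /** Formats one UTF-16 code unit as a backslash, 'u' and four hex digits. */
    static String escape(char c) {
        int high = c >>> 8;   // upper 8 bits of the 16-bit char
        int low = c & 0xFF;   // lower 8 bits of the 16-bit char
        return "\\u" + pad(Integer.toHexString(high)) + pad(Integer.toHexString(low));
    }

    /** Pads a one-digit hex string with a leading zero. */
    static String pad(String hex) {
        return hex.length() == 1 ? "0" + hex : hex;
    }

    public static void main(String[] args) {
        System.out.println(escape('A'));
    }
}
```

For the character 'A' (code 0x0041) the high byte is 0x00 and the low byte is 0x41, so the padded halves are "00" and "41" and the result is \u0041.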
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. Original address: https://www.cnblogs.com/poterliu/p/9579918.html Contact: Poterliu
Conversion tool implementation code:
/**
 * Utility class for converting between strings and Unicode escapes.
 * @author Poterliu
 */
public class UnicodeConvertUtil {

    /**
     * Converts a string to Unicode escapes.
     * @param str the string to convert
     * @return the Unicode-escaped string
     */
    public static String convert(String str) {
        str = (str == null ? "" : str);
        String tmp;
        StringBuffer sb = new StringBuffer(1000);
        char c;
        int i, j;
        sb.setLength(0);
        for (i = 0; i < str.length(); i++) {
            c = str.charAt(i);
            sb.append("\\u");
            j = (c >>> 8); // take the high 8 bits
            tmp = Integer.toHexString(j);
            if (tmp.length() == 1)
                sb.append("0");
            sb.append(tmp);
            j = (c & 0xFF); // take the low 8 bits
            tmp = Integer.toHexString(j);
            if (tmp.length() == 1)
                sb.append("0");
            sb.append(tmp);
        }
        return (new String(sb));
    }

    /**
     * Converts Unicode escapes back to a string.
     * @param str the escaped string to convert (lowercase hex digits, as produced by convert)
     * @return the normal string
     */
    public static String revert(String str) {
        str = (str == null ? "" : str);
        if (str.indexOf("\\u") == -1) // return the input as-is if it contains no Unicode escape
            return str;
        StringBuffer sb = new StringBuffer(1000);
        // "<=" so that the final 6-character escape is also processed
        for (int i = 0; i <= str.length() - 6;) {
            String strTemp = str.substring(i, i + 6);
            String value = strTemp.substring(2); // the 4 hex digits after the prefix
            int c = 0;
            for (int j = 0; j < value.length(); j++) {
                char tempChar = value.charAt(j);
                int t = 0;
                switch (tempChar) {
                case 'a': t = 10; break;
                case 'b': t = 11; break;
                case 'c': t = 12; break;
                case 'd': t = 13; break;
                case 'e': t = 14; break;
                case 'f': t = 15; break;
                default: t = tempChar - 48; break; // '0'..'9'
                }
                // weight each digit by its power of 16
                c += t * ((int) Math.pow(16, (value.length() - j - 1)));
            }
            sb.append((char) c);
            i = i + 6;
        }
        return sb.toString();
    }
}
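For comparison, the same round trip can be written more compactly with standard library calls. This is a hedged sketch and not the author's implementation: the class name UnicodeEscapeSketch and the method names toEscapes and fromEscapes are hypothetical, and fromEscapes assumes the input consists only of complete 6-character escapes.

```java
// Compact sketch using String.format and Integer.parseInt.
// Names are hypothetical; this is an alternative, not the original tool.
final class UnicodeEscapeSketch {

    /** Escapes every UTF-16 code unit as a backslash, 'u' and four hex digits. */
    static String toEscapes(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 6);
        for (int i = 0; i < s.length(); i++) {
            // %04x zero-pads the code unit to four lowercase hex digits
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /**
     * Decodes a string made up only of 6-character escapes.
     * Assumption: the input is well formed; the two prefix characters are
     * skipped without validation and trailing partial escapes are ignored.
     */
    static String fromEscapes(String s) {
        StringBuilder sb = new StringBuilder(s.length() / 6);
        for (int i = 0; i + 6 <= s.length(); i += 6) {
            // parse the four hex digits that follow the two-character prefix
            sb.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
        }
        return sb.toString();
    }
}
```

With this sketch, toEscapes("Hi") yields \u0048\u0069, and passing that result to fromEscapes returns "Hi" again. Integer.parseInt with radix 16 accepts both uppercase and lowercase hex digits, which the switch-based parsing above does not.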
The conversion tool class between strings and Unicode in Java