Java character transcoding: UTF-8 to GBK/GB2312

Source: Internet
Author: User

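Java is similar to Python here: inside the JVM, a String is always held in Java's own Unicode representation (UTF-16 code units), whatever encoding the original bytes used. So whenever you see a String in Java, remind yourself that it is Unicode text. Names such as UTF-8 and GBK describe byte arrays, not Strings.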
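To make the distinction between a String and its encoded bytes concrete, here is a minimal sketch. The class and method names are hypothetical, and it assumes the runtime ships the GBK charset, which standard JDK builds normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StringVsBytes {
    // Number of bytes the text occupies once it is encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // The two characters U+4E2D U+56FD, written as escapes so the
        // source file's own encoding does not matter.
        String text = "\u4e2d\u56fd";
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available

        System.out.println(text.length());                               // 2 chars
        System.out.println(encodedLength(text, StandardCharsets.UTF_8)); // 6 bytes
        System.out.println(encodedLength(text, gbk));                    // 4 bytes
    }
}
```

The String itself is the same in every call; only the byte arrays produced from it differ.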

 Packagestring;ImportJava.nio.charset.Charset; Public classUTF82GBK { Public Static voidMain (string[] args)throwsException {//the default encoding for the system is GBKSystem.out.println ("Default charset=" +Charset.defaultcharset ()); String THFJKDS China China China China China China China China China China China China China hfsdkj fjldsajflkdsjaflkdsjalf sfdsfadas '; //idea: First to Unicode, then to GBKString UTF8 =NewString (T.getbytes ("UTF-8")); //equivalent to://String UTF8 = new String (T.getbytes ("UTF-8"), Charset.defaultcharset ());System.out.println (UTF8); String Unicode=NewString (Utf8.getbytes (), "UTF-8"); //equivalent to://String unicode = new String (Utf8.getbytes (Charset.defaultcharset ()), "UTF-8"); System.out.println (Unicode); String GBK=NewString (Unicode.getbytes ("GBK")); //equivalent to://string GBK = new String (Unicode.getbytes ("GBK"), Charset.defaultcharset ()); System.out.println (GBK); }}
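The example above passes through the platform default charset, so its output depends on the machine it runs on. As a hedged alternative, the sketch below names both charsets explicitly when converting UTF-8 bytes to GBK bytes. The class and method names are hypothetical, not part of the original code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8ToGbkBytes {
    // Assumption: the runtime provides the GBK charset.
    private static final Charset GBK = Charset.forName("GBK");

    // Decode the UTF-8 bytes into a String, then encode that String as GBK.
    static byte[] utf8ToGbk(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        return text.getBytes(GBK);
    }

    public static void main(String[] args) {
        byte[] utf8 = "abc\u4e2d\u56fd".getBytes(StandardCharsets.UTF_8);
        byte[] gbk = utf8ToGbk(utf8);
        System.out.println(utf8.length + " UTF-8 bytes -> " + gbk.length + " GBK bytes");
    }
}
```

Because no call relies on Charset.defaultCharset(), the result should be the same on a GBK system and on a UTF-8 system.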

    package com.mkyong;

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStreamReader;

    public class UTF8TOGBK {
        public static void main(String[] args) throws Exception {
            File fileDir = new File("/home/user/desktop/unsaved Document 1");
            // Decode the file's bytes as UTF-8 while reading.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(new FileInputStream(fileDir), "UTF-8"));
            String str;
            while ((str = in.readLine()) != null) {
                // Java only uses Unicode internally, so str is Unicode text.
                System.out.println(str);
                // str.getBytes("GBK") is GBK encoded, but str2 is Unicode text again.
                String str2 = new String(str.getBytes("GBK"), "GBK");
                System.out.println(str2);
            }
            in.close();
        }
    }
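The reader above only decodes the file; nothing GBK encoded leaves the program. If the goal is a GBK-encoded copy of a UTF-8 file, one possible sketch is to pair a UTF-8 reader with a GBK writer. The class name, method name, and temporary files here are hypothetical, and it again assumes GBK is available in the runtime.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileUtf8ToGbk {
    // Read source as UTF-8 and write the same text to target as GBK.
    static void transcode(Path source, Path target) throws IOException {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, gbk)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);   // the writer encodes the Unicode text as GBK
                out.newLine();     // platform line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for real input and output paths.
        Path source = Files.createTempFile("utf8-", ".txt");
        Path target = Files.createTempFile("gbk-", ".txt");
        Files.write(source, "\u4e2d\u56fd\n".getBytes(StandardCharsets.UTF_8));

        transcode(source, target);
        System.out.println(Files.size(source) + " bytes in, " + Files.size(target) + " bytes out");
    }
}
```

The conversion happens at the stream boundaries: bytes become Unicode text on the way in and GBK bytes on the way out, with an ordinary String in between.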

The key point is what new String(xxx.getBytes("GBK"), "GBK") means. xxx.getBytes("GBK") returns a byte array encoded as GBK, so you must tell Java that the array you are passing is GBK encoded and has to be decoded accordingly when it is converted back to the internal representation. That is the job of the second "GBK": it names the encoding of the bytes being passed in, not the encoding of the resulting String.
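A small sketch of this round trip, with hypothetical class and method names: encoding to GBK and decoding with the same charset gives back an equal String, as long as every character exists in GBK.

```java
import java.nio.charset.Charset;

class GbkRoundTrip {
    // Assumption: the runtime provides the GBK charset.
    private static final Charset GBK = Charset.forName("GBK");

    // Encode to GBK bytes, decode those bytes as GBK, and compare with the input.
    static boolean survivesGbkRoundTrip(String text) {
        byte[] gbkBytes = text.getBytes(GBK);
        String decoded = new String(gbkBytes, GBK);
        return decoded.equals(text);
    }

    public static void main(String[] args) {
        System.out.println(survivesGbkRoundTrip("\u4e2d\u56fd")); // Chinese text covered by GBK
        System.out.println(survivesGbkRoundTrip("\u0e01"));       // a Thai letter, outside GBK
    }
}
```

The first call should report true and the second false, because getBytes substitutes a replacement byte for characters the target charset cannot represent.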

    new String(str.getBytes("UTF-8"), "UTF-8"); // correct: encode and decode use the same charset
    new String(str.getBytes("UTF-8"), "GBK");   // wrong: the internal text is encoded as UTF-8, then those bytes are decoded as if they were GBK
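The difference between the two calls can be checked directly. The following is a minimal sketch with hypothetical names, assuming GBK is available in the runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeMismatch {
    // Encode the text as UTF-8, then decode those bytes with the given charset.
    static String decodeUtf8BytesAs(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u56fd";
        String matched = decodeUtf8BytesAs(original, StandardCharsets.UTF_8);
        String mismatched = decodeUtf8BytesAs(original, Charset.forName("GBK"));

        System.out.println(matched.equals(original));    // same charset both ways
        System.out.println(mismatched.equals(original)); // UTF-8 bytes read as GBK
        System.out.println(mismatched);                  // garbled text
    }
}
```

Only the matched decode returns the original text. The mismatched one yields different characters, because the same bytes mean something else in GBK.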

Take a look at what the JDK documentation says.

    public String(byte[] bytes, Charset charset)

Constructs a new String by decoding the specified array of bytes using the specified charset.
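As an illustration of this constructor, the sketch below decodes a GBK byte array by passing a Charset object instead of a charset name string. The class name, method name, and byte values are chosen for the example and are not from the original code.

```java
import java.nio.charset.Charset;

class CharsetOverload {
    // Decode bytes using a Charset object looked up by name.
    static String decode(byte[] bytes, String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            throw new IllegalArgumentException("Charset not available: " + charsetName);
        }
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // GBK encoding of the characters U+4E2D U+56FD.
        byte[] gbkBytes = {(byte) 0xD6, (byte) 0xD0, (byte) 0xB9, (byte) 0xFA};
        String text = decode(gbkBytes, "GBK");
        System.out.println(text.equals("\u4e2d\u56fd"));
    }
}
```

Unlike the overload that takes a charset name, this constructor does not declare the checked UnsupportedEncodingException, so the availability check can be done once up front.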

So the question now is: how do I carry GBK-encoded bytes inside a String?

    String str3 = new String(str.getBytes("GBK"), "ISO-8859-1"); // each GBK byte becomes exactly one char
    System.out.println(new String(str3.getBytes("ISO-8859-1"), "GBK")); // recover the bytes, then decode them as GBK
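The trick works because ISO-8859-1 maps every byte value 0 to 255 to exactly one char, so no byte is lost on the way in or out. A hedged sketch with hypothetical helper names:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1Carrier {
    // Wrap arbitrary bytes in a String, one char per byte.
    static String wrap(byte[] rawBytes) {
        return new String(rawBytes, StandardCharsets.ISO_8859_1);
    }

    // Recover the exact bytes from a String produced by wrap().
    static byte[] unwrap(String carrier) {
        return carrier.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // GBK encoding of the characters U+4E2D U+56FD.
        byte[] gbkBytes = {(byte) 0xD6, (byte) 0xD0, (byte) 0xB9, (byte) 0xFA};

        String carrier = wrap(gbkBytes);
        System.out.println(carrier.length());                          // one char per byte
        System.out.println(Arrays.equals(unwrap(carrier), gbkBytes));  // bytes are unchanged
    }
}
```

The carrier String is not readable text; it is only a container, and it must be unwrapped with ISO-8859-1 before the bytes are decoded as GBK.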
