import java.io.UnsupportedEncodingException;

/**
 * Convert string encoding
 */
public class ChangeCharset {
    /** 7-bit ASCII, also known as ISO646-US, the Basic Latin block of the Unicode character set */
    public static final String US_ASCII = "US-ASCII";
    /** ISO Latin Alphabet No. 1, also known as ISO-LATIN-1 */
    public static final String ISO_8859_1 = "ISO-8859-1";
    /** 8-bit UCS transformation format */
    public static final String UTF_8 = "UTF-8";
    /** 16-bit UCS transformation format, big-endian byte order (lowest address holds the high byte) */
    public static final String UTF_16BE = "UTF-16BE";
    /** 16-bit UCS transformation format, little-endian byte order (lowest address holds the low byte) */
    public static final String UTF_16LE = "UTF-16LE";
    /** 16-bit UCS transformation format, byte order identified by an optional byte order mark */
    public static final String UTF_16 = "UTF-16";
    /** Chinese extended character set */
    public static final String GBK = "GBK";

    /** Convert character encoding to US-ASCII */
    public String toASCII(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, US_ASCII);
    }

    /** Convert character encoding to ISO-8859-1 */
    public String toISO_8859_1(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, ISO_8859_1);
    }

    /** Convert character encoding to UTF-8 */
    public String toUTF_8(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_8);
    }

    /** Convert character encoding to UTF-16BE */
    public String toUTF_16BE(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_16BE);
    }

    /** Convert character encoding to UTF-16LE */
    public String toUTF_16LE(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_16LE);
    }

    /** Convert character encoding to UTF-16 */
    public String toUTF_16(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_16);
    }

    /** Convert character encoding to GBK */
    public String toGBK(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, GBK);
    }

    /**
     * String encoding conversion.
     * @param str the string whose encoding is to be converted
     * @param newCharset target encoding
     * @return the converted string, or null if str is null
     * @throws UnsupportedEncodingException if newCharset is not supported
     */
    public String changeCharset(String str, String newCharset) throws UnsupportedEncodingException {
        if (str != null) {
            // Encode the string into bytes with the platform default charset.
            byte[] bs = str.getBytes();
            // Decode those bytes as the new charset to build the result.
            return new String(bs, newCharset);
        }
        return null;
    }

    /**
     * String encoding conversion.
     * @param str the string whose encoding is to be converted
     * @param oldCharset original encoding
     * @param newCharset target encoding
     * @return the converted string, or null if str is null
     * @throws UnsupportedEncodingException if either charset is not supported
     */
    public String changeCharset(String str, String oldCharset, String newCharset) throws UnsupportedEncodingException {
        if (str != null) {
            // Encode the string into bytes with the old charset. This step may throw.
            byte[] bs = str.getBytes(oldCharset);
            // Decode those bytes as the new charset to build the result.
            return new String(bs, newCharset);
        }
        return null;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        ChangeCharset test = new ChangeCharset();
        String str = "This is a Chinese string!";
        System.out.println("str: " + str);
        String gbk = test.toGBK(str);
        System.out.println("Converted to GBK: " + gbk);
        System.out.println();
        String ascii = test.toASCII(str);
        System.out.println("Converted to US-ASCII: " + ascii);
        gbk = test.changeCharset(ascii, ChangeCharset.US_ASCII, ChangeCharset.GBK);
        System.out.println("US-ASCII string converted back to GBK: " + gbk);
        System.out.println();
        String iso88591 = test.toISO_8859_1(str);
        System.out.println("Converted to ISO-8859-1: " + iso88591);
        gbk = test.changeCharset(iso88591, ChangeCharset.ISO_8859_1, ChangeCharset.GBK);
        System.out.println("ISO-8859-1 string converted back to GBK: " + gbk);
        System.out.println();
        String utf8 = test.toUTF_8(str);
        System.out.println("Converted to UTF-8: " + utf8);
        gbk = test.changeCharset(utf8, ChangeCharset.UTF_8, ChangeCharset.GBK);
        System.out.println("UTF-8 string converted back to GBK: " + gbk);
        System.out.println();
        String utf16be = test.toUTF_16BE(str);
        System.out.println("Converted to UTF-16BE: " + utf16be);
        gbk = test.changeCharset(utf16be, ChangeCharset.UTF_16BE, ChangeCharset.GBK);
        System.out.println("UTF-16BE string converted back to GBK: " + gbk);
        System.out.println();
        String utf16le = test.toUTF_16LE(str);
        System.out.println("Converted to UTF-16LE: " + utf16le);
        gbk = test.changeCharset(utf16le, ChangeCharset.UTF_16LE, ChangeCharset.GBK);
        System.out.println("UTF-16LE string converted back to GBK: " + gbk);
        System.out.println();
        String utf16 = test.toUTF_16(str);
        System.out.println("Converted to UTF-16: " + utf16);
        // The original listing passed UTF_16LE here; UTF_16 matches the step above.
        gbk = test.changeCharset(utf16, ChangeCharset.UTF_16, ChangeCharset.GBK);
        System.out.println("UTF-16 string converted back to GBK: " + gbk);
        String s = new String("Chinese".getBytes("UTF-8"), "UTF-8");
        System.out.println(s);
    }
}
------------------------------------------------------------------
The String class in Java stores text as Unicode. When a string is constructed with String(byte[] bytes, String encoding), the encoding argument describes how the data in bytes is encoded, not how the resulting string is encoded. In other words, it asks the system to convert the data in bytes from that encoding to Unicode. If no encoding is specified, the JDK uses its default charset, which has traditionally been determined by the operating system (since JDK 18 the default is UTF-8 unless configured otherwise).
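To make this concrete, here is a minimal sketch of the point that the charset argument names the encoding of the input bytes. The class and method names (DecodeDemo, decodeUtf8, decodeAsLatin1) are made up for illustration, and UTF-8 and ISO-8859-1 are used only because every JVM is required to support them.

```java
import java.nio.charset.StandardCharsets;

final class DecodeDemo {
    /** Decodes bytes that really are UTF-8: the charset describes the bytes, not the result. */
    static String decodeUtf8(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    /** Decodes the same bytes as if they were ISO-8859-1: one char per byte. */
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters of the word "Chinese" in Chinese
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8); // 3 bytes per character here

        // Naming the right encoding recovers the original text.
        System.out.println(decodeUtf8(utf8).equals(original)); // true

        // Naming the wrong encoding yields six unrelated Latin-1 characters.
        String wrong = decodeAsLatin1(utf8);
        System.out.println(wrong.length());          // 6
        System.out.println(wrong.equals(original));  // false
    }
}
```

In both calls the resulting String is held as Unicode internally; only the interpretation of the input bytes differs.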
When reading data from a file, it is best to read the raw bytes through an InputStream and then use String(byte[] bytes, String encoding) to state how the file is encoded. Avoid a Reader that is not given an explicit encoding (such as FileReader constructed with only a file name), because it automatically converts the file content to Unicode using the JDK's default encoding.
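The file-reading advice could look roughly like the sketch below. FileDecodeDemo and readText are hypothetical names, the temporary file and the choice of UTF-16BE are assumptions made only so the example is self-contained, and InputStream.readAllBytes requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileDecodeDemo {
    /** Reads the raw bytes through an InputStream, then decodes them with the stated charset. */
    static String readText(Path file, Charset fileCharset) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] bytes = in.readAllBytes(); // Java 9+
            return new String(bytes, fileCharset);
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "\u4e2d\u6587";
        Path tmp = Files.createTempFile("charset-demo", ".txt");
        try {
            // Write the text in an encoding that is unlikely to be the platform default.
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_16BE));
            // Decoding with the same explicit charset recovers the text on any platform.
            System.out.println(readText(tmp, StandardCharsets.UTF_16BE).equals(text)); // true
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A Reader that is given the charset explicitly, for example new InputStreamReader(in, fileCharset), decodes the same way; the problem is only with readers that silently fall back to the default encoding.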
When reading text data from a database, use the ResultSet.getBytes() method to get the byte array, and likewise use the String constructor that takes an encoding.
ResultSet rs;
byte[] bytes = rs.getBytes(1); // column index 1 is only illustrative
String str = new String(bytes, "gb2312");
Do not do it the following way.
ResultSet rs;
String str = rs.getString(1); // column index 1 is only illustrative
str = new String(str.getBytes("iso8859-1"), "gb2312");
This kind of conversion is inefficient. The reason is that when ResultSet executes getString(), it assumes by default that the data in the database is encoded as iso8859-1, so the system converts the data to Unicode according to iso8859-1. str.getBytes("iso8859-1") then restores the original bytes, and new String(bytes, "gb2312") converts the data from gb2312 to Unicode, which adds several extra steps in between.
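The detour described above can be reproduced without a database. The sketch below is only a simulation: the byte array stands in for the raw column data, the ISO-8859-1 decode stands in for what the driver is said to do inside getString(), and the names RoundTripDemo, decodeDirect and decodeViaLatin1 are hypothetical. It assumes the runtime provides the GB2312 charset, which standard JDK builds normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {
    // Assumption: GB2312 is available in this runtime; Charset.forName throws otherwise.
    static final Charset GB2312 = Charset.forName("GB2312");

    /** Direct path: decode the raw bytes once with their real encoding. */
    static String decodeDirect(byte[] raw) {
        return new String(raw, GB2312);
    }

    /** Detour: decode as ISO-8859-1, re-encode to get the bytes back, then decode as GB2312. */
    static String decodeViaLatin1(byte[] raw) {
        String misdecoded = new String(raw, StandardCharsets.ISO_8859_1);   // one char per byte
        byte[] restored = misdecoded.getBytes(StandardCharsets.ISO_8859_1); // original bytes again
        return new String(restored, GB2312);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587";
        byte[] raw = text.getBytes(GB2312); // stands in for the bytes stored in the database

        System.out.println(decodeDirect(raw).equals(text));    // true
        System.out.println(decodeViaLatin1(raw).equals(text)); // true, after two extra conversions
    }
}
```

Both paths give the same text because ISO-8859-1 maps every byte value to exactly one character, so the round trip loses nothing; it only costs the two additional conversions.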
When reading parameters from an HttpRequest, call the request.setCharacterEncoding() method to set the encoding before any parameter is read, and the content read will be correct.
Various encoding conversions for Java strings