[Reprint] Various encoding conversions for Java Strings

Source: Internet
Author: User

From: http://www.blogjava.net/rabbit/archive/2008/03/27/189009.html

import java.io.UnsupportedEncodingException;

/**
 * Converts the encoding of strings
 */
public class ChangeCharset {
    /** 7-bit ASCII, also known as ISO646-US, the Basic Latin block of the Unicode character set */
    public static final String US_ASCII = "US-ASCII";

    /** ISO Latin Alphabet No. 1, also known as ISO-LATIN-1 */
    public static final String ISO_8859_1 = "ISO-8859-1";

    /** 8-bit UCS Transformation Format */
    public static final String UTF_8 = "UTF-8";

    /** 16-bit UCS Transformation Format, big-endian byte order (lowest address holds the high byte) */
    public static final String UTF_16BE = "UTF-16BE";

    /** 16-bit UCS Transformation Format, little-endian byte order (lowest address holds the low byte) */
    public static final String UTF_16LE = "UTF-16LE";

    /** 16-bit UCS Transformation Format, byte order identified by an optional byte order mark */
    public static final String UTF_16 = "UTF-16";

    /** GBK, the extended Chinese character set */
    public static final String GBK = "GBK";

    /**
     * Convert the character encoding to US-ASCII
     */
    public String toASCII(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, US_ASCII);
    }

    /**
     * Convert the character encoding to ISO-8859-1
     */
    public String toISO_8859_1(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, ISO_8859_1);
    }

    /**
     * Convert the character encoding to UTF-8
     */
    public String toUTF_8(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_8);
    }

    /**
     * Convert the character encoding to UTF-16BE
     */
    public String toUTF_16BE(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_16BE);
    }

    /**
     * Convert the character encoding to UTF-16LE
     */
    public String toUTF_16LE(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_16LE);
    }

    /**
     * Convert the character encoding to UTF-16
     */
    public String toUTF_16(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, UTF_16);
    }

    /**
     * Convert the character encoding to GBK
     */
    public String toGBK(String str) throws UnsupportedEncodingException {
        return this.changeCharset(str, GBK);
    }

    /**
     * Implementation of the string encoding conversion
     * @param str the string whose encoding is to be converted
     * @param newCharset the target encoding
     * @return the converted string, or null if str is null
     * @throws UnsupportedEncodingException if newCharset is not supported
     */
    public String changeCharset(String str, String newCharset)
            throws UnsupportedEncodingException {
        if (str != null) {
            // Encode the string into bytes with the platform default charset.
            byte[] bs = str.getBytes();
            // Build a new string by decoding those bytes with the new charset.
            return new String(bs, newCharset);
        }
        return null;
    }

    /**
     * Implementation of the string encoding conversion
     * @param str the string whose encoding is to be converted
     * @param oldCharset the original encoding
     * @param newCharset the target encoding
     * @return the converted string, or null if str is null
     * @throws UnsupportedEncodingException if either charset is not supported
     */
    public String changeCharset(String str, String oldCharset, String newCharset)
            throws UnsupportedEncodingException {
        if (str != null) {
            // Encode the string into bytes with the old charset. This call may throw.
            byte[] bs = str.getBytes(oldCharset);
            // Build a new string by decoding those bytes with the new charset.
            return new String(bs, newCharset);
        }
        return null;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        ChangeCharset test = new ChangeCharset();
        String str = "This is a Chinese string!";
        System.out.println("str: " + str);
        String gbk = test.toGBK(str);
        System.out.println("Converted to GBK: " + gbk);
        System.out.println();
        String ascii = test.toASCII(str);
        System.out.println("Converted to US-ASCII: " + ascii);
        gbk = test.changeCharset(ascii, ChangeCharset.US_ASCII, ChangeCharset.GBK);
        System.out.println("Then converted from US-ASCII to GBK: " + gbk);
        System.out.println();
        String iso88591 = test.toISO_8859_1(str);
        System.out.println("Converted to ISO-8859-1: " + iso88591);
        gbk = test.changeCharset(iso88591, ChangeCharset.ISO_8859_1, ChangeCharset.GBK);
        System.out.println("Then converted from ISO-8859-1 to GBK: " + gbk);
        System.out.println();
        String utf8 = test.toUTF_8(str);
        System.out.println("Converted to UTF-8: " + utf8);
        gbk = test.changeCharset(utf8, ChangeCharset.UTF_8, ChangeCharset.GBK);
        System.out.println("Then converted from UTF-8 to GBK: " + gbk);
        System.out.println();
        String utf16be = test.toUTF_16BE(str);
        System.out.println("Converted to UTF-16BE: " + utf16be);
        gbk = test.changeCharset(utf16be, ChangeCharset.UTF_16BE, ChangeCharset.GBK);
        System.out.println("Then converted from UTF-16BE to GBK: " + gbk);
        System.out.println();
        String utf16le = test.toUTF_16LE(str);
        System.out.println("Converted to UTF-16LE: " + utf16le);
        gbk = test.changeCharset(utf16le, ChangeCharset.UTF_16LE, ChangeCharset.GBK);
        System.out.println("Then converted from UTF-16LE to GBK: " + gbk);
        System.out.println();
        String utf16 = test.toUTF_16(str);
        System.out.println("Converted to UTF-16: " + utf16);
        // The string was produced with UTF-16, so UTF-16 is the matching source charset here.
        gbk = test.changeCharset(utf16, ChangeCharset.UTF_16, ChangeCharset.GBK);
        System.out.println("Then converted from UTF-16 to GBK: " + gbk);
        String s = new String("Chinese".getBytes("UTF-8"), "UTF-8");
        System.out.println(s);
    }
}
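
Whether a string survives a trip through a given charset depends on whether that charset can represent every character in it. The following is a minimal sketch of that point, not part of the original class: the name RoundTripDemo and its roundTrip helper are made up for illustration, and the sample text uses Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {
    /** Encodes text to bytes in the given charset, then decodes those bytes with the same charset. */
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String text = "A\u4e2d\u6587"; // "A" followed by two Chinese characters

        // UTF-8 and UTF-16 can represent every Unicode character, so nothing is lost.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));  // true
        System.out.println(roundTrip(text, StandardCharsets.UTF_16).equals(text)); // true

        // US-ASCII cannot represent the Chinese characters; getBytes substitutes '?' for each.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII));            // A??
    }
}
```

Once a character has been replaced by '?', no later conversion can bring it back, which is why converting through US-ASCII or ISO-8859-1 and then to GBK does not restore Chinese text.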

------------------------------------------------------------------------------------------------------------------


The String class in Java represents text as Unicode. When a string is constructed with String(byte[] bytes, String encoding), the encoding argument describes how the data in bytes is encoded, not how the resulting string is encoded. In other words, it tells the system to convert the data in bytes from that encoding to Unicode. If no encoding is specified, the JDK decodes the bytes with the platform default charset, which older JDKs derive from the operating system settings (since JDK 18 the default is UTF-8).
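
To illustrate that the encoding argument names the encoding of the bytes, here is a small sketch that decodes the same byte array twice, once with the charset it was written in and once with a different one. The class name DecodeDemo and the decode helper are hypothetical, and UTF-8 is used only because every JVM is guaranteed to support it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DecodeDemo {
    /** Decodes raw bytes, treating them as text written in the given charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8); // 3 bytes per character

        // Naming the charset the bytes were written in recovers the text.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8).equals(original)); // true

        // Naming a different charset turns each byte into its own character.
        String garbled = decode(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(original)); // false
        System.out.println(garbled.length());         // 6
    }
}
```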

When reading data from a file, it is best to read the raw bytes through an InputStream and then use String(byte[] bytes, String encoding) to state how the file is encoded. Avoid a Reader created without an explicit encoding (for example FileReader), because it automatically converts the file content to Unicode using the JDK's default encoding, which may not match the file.
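
A minimal sketch of this file-reading approach follows. The class FileDecodeDemo, its readText method, and the temporary file are illustrative assumptions; InputStream.readAllBytes requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileDecodeDemo {
    /** Reads the raw bytes of a file and decodes them with the charset the file was written in. */
    static String readText(Path file, Charset charset) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return new String(in.readAllBytes(), charset); // readAllBytes needs Java 9+
        }
    }

    /** Demo helper: writes text to a temporary file in the given charset, then reads it back. */
    static String writeThenRead(String text, Charset charset) {
        try {
            Path tmp = Files.createTempFile("charset-demo", ".txt");
            try {
                Files.write(tmp, text.getBytes(charset));
                return readText(tmp, charset);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters
        System.out.println(writeThenRead(text, StandardCharsets.UTF_16LE).equals(text)); // true
    }
}
```

The caller must know the file's encoding in advance; the bytes themselves do not reliably say which charset produced them.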

When reading text data from a database, use the ResultSet.getBytes() method to get the byte array, and again use the String constructor that takes an encoding.

ResultSet rs;
byte[] bytes = rs.getBytes(1); // 1 is a placeholder column index
String str = new String(bytes, "gb2312");

Do not use the following approach.

ResultSet rs;
String str = rs.getString(1); // 1 is a placeholder column index
str = new String(str.getBytes("iso8859-1"), "gb2312");

This kind of conversion is inefficient. The reason is that when ResultSet.getString() runs, the data in the database is assumed by default to be encoded as iso8859-1, so the system converts it to Unicode as if it were iso8859-1. str.getBytes("iso8859-1") then restores the original bytes, and new String(bytes, "gb2312") converts them from gb2312 to Unicode, which adds several unnecessary steps in between.
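
The two routes can be compared side by side without a database. In the sketch below, the byte array stands in for the raw column data, and the names Latin1DetourDemo, direct, and detour are invented for illustration. The demo falls back to UTF-8 if the running JVM does not ship the GB2312 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Latin1DetourDemo {
    /** Direct route: decode the raw bytes once, with their real charset. */
    static String direct(byte[] raw, Charset real) {
        return new String(raw, real);
    }

    /** Detour: decode as ISO-8859-1, re-encode to recover the bytes, then decode again. */
    static String detour(byte[] raw, Charset real) {
        String misdecoded = new String(raw, StandardCharsets.ISO_8859_1);
        byte[] recovered = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(recovered, real);
    }

    public static void main(String[] args) {
        // Assumption: use GB2312 when this JVM supports it, otherwise UTF-8.
        Charset real = Charset.isSupported("GB2312")
                ? Charset.forName("GB2312")
                : StandardCharsets.UTF_8;
        String text = "\u4e2d\u6587"; // two Chinese characters
        byte[] raw = text.getBytes(real); // stands in for the bytes stored in the database

        System.out.println(direct(raw, real).equals(text)); // true
        System.out.println(detour(raw, real).equals(text)); // true
    }
}
```

Both routes yield the same string because ISO-8859-1 maps every byte value to exactly one character and back, but the detour performs two extra conversions and allocates an intermediate string and byte array.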

When reading parameters from an HttpServletRequest, call request.setCharacterEncoding() to set the encoding before reading any parameter, so that the content is read correctly.
