Summary of Java encoding (Chinese transcoding)

Source: Internet
Author: User

This article analyzes how Java encodes and decodes text, and gives a brief summary of the problems that arise when transcoding Chinese characters.

Contents

1 Basics of coding

ISO-8859-1 encoding

GB2312

GBK

UTF-8

2 Web System conversion encoding

Principle

Servlet Network transfer encoding

Struts2 Control Code

Spring Control Code

3 String to byte transcoding

4 Byte to string transcoding


1 Basics of Coding

ISO-8859-1 encoding

ISO-8859-1 is a single-byte encoding that is backward compatible with ASCII. Its range is 0x00-0xFF: 0x00-0x7F is identical to ASCII, 0x80-0x9F holds control characters, and 0xA0-0xFF holds text symbols.

Because it is single-byte, each byte corresponds to exactly one character, so it cannot encode Chinese characters.
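The single-byte behaviour described above can be sketched with the JDK alone. The class name `Latin1Demo` and the sample characters are assumptions chosen for illustration; Unicode escapes are used so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

class Latin1Demo {
    // Encodes the text as ISO-8859-1 and returns the raw bytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] ascii = encode("A");       // 0x41, identical to ASCII
        byte[] latin = encode("\u00E9");  // e with acute accent, 0xE9 in the 0xA0-0xFF range
        byte[] han = encode("\u6C49");    // a Chinese character, which has no ISO-8859-1 byte

        // String.getBytes substitutes '?' (0x3F) for characters it cannot map.
        System.out.printf("%02X %02X %02X%n",
                ascii[0] & 0xFF, latin[0] & 0xFF, han[0] & 0xFF); // prints "41 E9 3F"
    }
}
```

The substitution is silent: no exception is thrown, which is why lost characters often go unnoticed until the text is displayed.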

GB2312

Encodes Chinese characters; each Chinese character takes 2 bytes.

GBK

1) Encodes Chinese characters; each Chinese character takes 2 bytes.

2) Encodes more Chinese characters than GB2312.

UTF-8

1) Encodes Chinese characters; a common Chinese character takes 3 bytes.

2) Its range covers Chinese characters, letters, and special symbols. Text that both GBK and UTF-8 can represent can be converted between the two.
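The byte counts and the GBK/UTF-8 conversion described above can be checked with a short sketch. It assumes the JDK in use ships the GB2312 and GBK charsets, as standard JDK distributions do; the class name `ChineseCharsetDemo`, its helper methods, and the sample characters are illustrative assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ChineseCharsetDemo {
    static final Charset GB2312 = Charset.forName("GB2312");
    static final Charset GBK = Charset.forName("GBK");

    // Number of bytes the text occupies in the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Converts GBK bytes to UTF-8 bytes by decoding to a String first.
    static byte[] gbkToUtf8(byte[] gbkBytes) {
        return new String(gbkBytes, GBK).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String han = "\u6C49"; // a simplified Chinese character
        System.out.println(byteLength(han, GB2312));                 // 2
        System.out.println(byteLength(han, GBK));                    // 2
        System.out.println(byteLength(han, StandardCharsets.UTF_8)); // 3

        // A traditional character: covered by GBK but not by GB2312.
        char traditional = '\u6F22';
        System.out.println(GB2312.newEncoder().canEncode(traditional));
        System.out.println(GBK.newEncoder().canEncode(traditional));

        // GBK to UTF-8 conversion always goes through a decoded String.
        byte[] utf8 = gbkToUtf8(han.getBytes(GBK));
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(han));
    }
}
```

Conversion between two encodings is never a direct byte-to-byte mapping: the bytes are decoded into a String with the source charset and then encoded again with the target charset.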


2 Web System conversion encoding

Principle

We analyze the client-server model, in which, for example, the browser is the client and the web server is the server.

This is a process of encoding and decoding. The client encodes the string into bytes using ISO-8859-1, UTF-8, GBK, or another encoding; the default is ISO-8859-1.

No characters may be lost when converting to bytes, and the server must decode with the same encoding the sender used; otherwise the text comes out garbled.

Typically, the server decides which encoding to use and then tells the client how to encode and decode.
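The requirement that both sides use the same encoding can be illustrated without any network code. In this sketch the `send` and `receive` methods are hypothetical stand-ins for the client and the server, and the sample text is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MismatchDemo {
    // Client side: encode the string into bytes with the agreed charset.
    static byte[] send(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Server side: decode the received bytes back into a string.
    static String receive(byte[] wire, Charset charset) {
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String original = "\u4F60\u597D"; // two Chinese characters
        byte[] wire = send(original, StandardCharsets.UTF_8);

        // Same charset on both sides: the text survives.
        String matched = receive(wire, StandardCharsets.UTF_8);
        // Different charset on the receiving side: the text is garbled.
        String mismatched = receive(wire, Charset.forName("GBK"));

        System.out.println(matched.equals(original));    // true
        System.out.println(mismatched.equals(original)); // false
    }
}
```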

Servlet Network transfer encoding

Receiving a browser POST request

In the case of JSP, the server sends the HTML generated by the JSP to the client.

Set the browser's encoding and decoding to UTF-8, for example:

<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" language="java" %>

Server-side decoding, method 1:

String name = new String(request.getParameter("name").getBytes("ISO-8859-1"), "UTF-8");

Server-side decoding, method 2 (call this before reading any parameter):

request.setCharacterEncoding("UTF-8");
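Method 1 works because ISO-8859-1 maps every byte value to exactly one character, so a wrong ISO-8859-1 decode loses nothing and can be undone. The sketch below reproduces that outside a servlet container; `misdecode` is a hypothetical stand-in for what a container defaulting to ISO-8859-1 does, and `repair` mirrors the method 1 expression.

```java
import java.nio.charset.StandardCharsets;

class Latin1RoundTripDemo {
    // Stand-in for a container that decodes the request bytes as ISO-8859-1.
    static String misdecode(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Method 1: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "\u4F60\u597D"; // two Chinese characters, 6 bytes in UTF-8
        String garbled = misdecode(sent.getBytes(StandardCharsets.UTF_8));

        System.out.println(garbled.length());             // 6, one char per byte
        System.out.println(repair(garbled).equals(sent)); // true
    }
}
```

The trick only holds when the wrong decode really was ISO-8859-1. If the parameter was already decoded correctly, applying it a second time corrupts the text.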

Receiving a browser GET request

For example: http://localhost:8888/webtest/EncodeServlet?name=Hello

The browser URL-encodes the URL, using UTF-8.

Server-side decoding:

String name = new String(request.getParameter("name").getBytes("ISO-8859-1"), "UTF-8");

Calling request.setCharacterEncoding("UTF-8") has no effect here, because a GET request appends its parameters to the URL in URL-encoded form, and the web container decodes the URL before calling the servlet. The default decoding in older containers (for example Tomcat before version 8) is ISO-8859-1.
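What the browser and the container do to a GET parameter can be approximated with `java.net.URLEncoder` and `java.net.URLDecoder`. This is a sketch under the assumption that the parameter value is Chinese text; the class `UrlEncodingDemo` and its wrapper methods are hypothetical and exist only to turn the checked exception into an unchecked one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlEncodingDemo {
    // Browser side: percent-encode the value using the named charset.
    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Container side: percent-decode the value using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String value = "\u4F60\u597D"; // two Chinese characters
        String encoded = encode(value, "UTF-8");
        System.out.println(encoded); // %E4%BD%A0%E5%A5%BD

        // Decoding with the charset the browser used restores the value.
        System.out.println(decode(encoded, "UTF-8").equals(value));      // true
        // Decoding as ISO-8859-1 yields six unrelated characters.
        System.out.println(decode(encoded, "ISO-8859-1").equals(value)); // false
    }
}
```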

Responding to a browser

Setting the response encoding:

The response encoding is the charset used to turn characters into bytes when responding to the client; the default is ISO-8859-1.

It can be inspected with:

response.getCharacterEncoding();

To set how the response stream is encoded:

response.setCharacterEncoding("UTF-8");

To set how the browser encodes and decodes:

response.setContentType("text/html;charset=UTF-8");

JSP settings:

<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" language="java" %>

pageEncoding: sets the encoding in which the JSP file is stored.

The charset in contentType: sets the encoding the browser uses for transfer, that is, for decoding when it parses a response and for encoding when it sends a request.

Keep the response stream encoding and the browser decoding the same, and the text will not be garbled.
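The relationship between the response bytes and the charset announced in the Content-Type header can be sketched without the servlet API. The helper `charsetOf` below is hypothetical: it does a simplified parse of the header value (quoted charset values are not handled) and falls back to ISO-8859-1, an assumption that mirrors the servlet default mentioned above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeDemo {
    // Extracts the charset parameter from a Content-Type value,
    // falling back to ISO-8859-1 when none is present.
    static Charset charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String trimmed = part.trim();
            if (trimmed.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return Charset.forName(trimmed.substring("charset=".length()).trim());
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        String body = "\u4F60\u597D"; // two Chinese characters
        // The server encodes the response stream as UTF-8.
        byte[] wire = body.getBytes(StandardCharsets.UTF_8);

        // The header announces UTF-8, so the client decodes correctly.
        String announced = new String(wire, charsetOf("text/html;charset=utf-8"));
        // No charset announced: the fallback does not match the stream.
        String unannounced = new String(wire, charsetOf("text/html"));

        System.out.println(announced.equals(body));   // true
        System.out.println(unannounced.equals(body)); // false
    }
}
```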


HttpClient Setting the Encoding


Struts2 Control Code

The following configuration goes in struts.xml:

<constant name="struts.i18n.encoding" value="UTF-8"></constant>

Spring Control Code

The configuration in web.xml is as follows:

<filter>
    <filter-name>encodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>utf-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

encoding sets the encoding the server uses for encoding and decoding.

forceEncoding indicates whether that encoding is enforced, overriding any encoding already set on the request.


3 String to byte transcoding

String s = "s汉";
byte[] bytes1 = s.getBytes("ISO-8859-1"); // the Chinese character is lost
byte[] bytes2 = s.getBytes("GBK");
byte[] bytes3 = s.getBytes("UTF-8");


4 Byte to string transcoding

String s1 = new String(bytes1, "UTF-8"); // the Chinese character is missing
String s2 = new String(bytes2, "GBK");
String s3 = new String(bytes3, "UTF-8");
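Sections 3 and 4 can be combined into one runnable round trip. In this sketch the sample string (an ASCII letter followed by one Chinese character) is an assumption, and `roundTrip` is a hypothetical helper that encodes with one charset and decodes with another.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TranscodeDemo {
    // Encodes the text with one charset and decodes the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String s = "s\u6C49"; // 's' followed by a Chinese character
        Charset gbk = Charset.forName("GBK");

        // ISO-8859-1 cannot hold the Chinese character, so it becomes '?'.
        String s1 = roundTrip(s, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        // Matching charsets on both sides preserve the text.
        String s2 = roundTrip(s, gbk, gbk);
        String s3 = roundTrip(s, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println(s1);           // s?
        System.out.println(s2.equals(s)); // true
        System.out.println(s3.equals(s)); // true
    }
}
```

Once the character has been replaced during encoding, no later decode can bring it back, which is why the loss in `s1` is permanent.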















