Summary of Chinese transcoding issues

Source: Internet
Author: User

1. Encoding basics

1.1 ISO-8859-1

ISO-8859-1 is a single-byte encoding that is backward compatible with ASCII. Its encoding range is 0x00-0xFF: 0x00-0x7F is identical to ASCII, 0x80-0x9F holds control characters, and 0xA0-0xFF holds letters and symbols.

Single-byte means that one byte corresponds to one character, so it cannot encode Chinese characters.

1.2 GBK

1) It can encode Chinese characters. One Chinese character is encoded in two bytes.

2) It encodes more Chinese characters than GB2312.

1.3 GB2312

It can encode Chinese characters. One Chinese character is encoded in two bytes.

1.4 UTF-8

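It can encode Chinese characters. A Chinese character is usually encoded in three bytes (rarer characters outside the Basic Multilingual Plane take four).

Chinese characters, letters, and special characters can be converted between GBK and UTF-8 in both directions, as long as the character exists in both encodings.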
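
The byte counts listed in section 1 can be checked with a few lines of Java. This is only a minimal sketch: the class name EncodingSizes and the helper byteLength are made up for illustration, and it assumes the JDK in use ships the GBK and GB2312 charsets (standard full JDKs do).

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the original article.
final class EncodingSizes {
    // Returns how many bytes the given text occupies in the named charset.
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String han = "\u6c49"; // one Chinese character (U+6C49)
        String[] charsets = {"ISO-8859-1", "GB2312", "GBK", "UTF-8"};
        for (String cs : charsets) {
            System.out.println(cs + ": " + byteLength(han, cs) + " byte(s)");
        }
        // ISO-8859-1: 1 byte(s)   (the character is replaced by '?')
        // GB2312: 2 byte(s)
        // GBK: 2 byte(s)
        // UTF-8: 3 byte(s)
    }
}
```

The single byte reported for ISO-8859-1 is not an encoding of the character. It is the replacement byte '?' that Java substitutes for characters the charset cannot represent.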

2. Transcoding in web systems

2.1 Principle

Transmission always involves an encoding step and a decoding step.

The sender must encode the string into bytes before it goes over the network.

The encoding can be UTF-8, GBK, and so on, but it must be able to represent every character so that nothing is lost during conversion to bytes.

The receiver must decode with the same encoding that the sender used. Otherwise, garbled characters may occur.

Generally, the server decides on an encoding and then informs the client of it.
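
The principle can be illustrated without any network code. In the sketch below, the class RoundTripDemo and its send and receive methods are hypothetical stand-ins for the two ends of a connection. It assumes the GBK charset is available in the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a sender and a receiver; no real network is involved.
final class RoundTripDemo {
    // Sender side: turn the string into bytes with the agreed charset.
    static byte[] send(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Receiver side: rebuild the string from the received bytes.
    static String receive(byte[] wire, Charset charset) {
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters (U+4E2D U+6587)
        byte[] wire = send(original, StandardCharsets.UTF_8);

        // Same charset on both sides: the text survives.
        System.out.println(original.equals(receive(wire, StandardCharsets.UTF_8))); // true

        // Different charset on the receiving side: the text is garbled.
        System.out.println(original.equals(receive(wire, Charset.forName("GBK")))); // false
    }
}
```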

2.2 Network transmission encoding

2.2.1 Receiving browser POST requests

Set the encoding that the browser uses for encoding and decoding to UTF-8, for example:

<%@ page pageEncoding="utf-8" contentType="text/html; charset=utf-8" language="java"%>

Server decoding method 1:

String name = new String(request.getParameter("name").getBytes("ISO-8859-1"),"UTF-8");

Server decoding method 2 (must be called before any request parameter is read):

request.setCharacterEncoding("UTF-8");
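
Method 1 works because ISO-8859-1 maps every byte value to exactly one character, so re-encoding the wrongly decoded string returns the original bytes unchanged. The sketch below reproduces this without a servlet container. The class Latin1Recovery and the method containerDecode are hypothetical, and the container's behavior is only simulated.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of "server decoding method 1"; no servlet API is used.
final class Latin1Recovery {
    // Simulates a container that decodes the request body as ISO-8859-1.
    static String containerDecode(byte[] requestBytes) {
        return new String(requestBytes, StandardCharsets.ISO_8859_1);
    }

    // Method 1: get the original bytes back, then decode them as UTF-8.
    static String recover(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u4f60\u597d"; // what the user entered (U+4F60 U+597D)
        byte[] sent = typed.getBytes(StandardCharsets.UTF_8); // the browser posts UTF-8 bytes

        String garbled = containerDecode(sent);
        System.out.println(garbled.length());               // 6, one char per byte
        System.out.println(recover(garbled).equals(typed)); // true
    }
}
```

The same trick does not work if the wrong decoding step used a charset that cannot represent every byte value, because the original bytes are then already lost.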

2.2.2 Receiving GET requests from browsers

For example:

http://localhost:8888/webtest/EncodeServlet?name=你好

The browser URL-encodes the URL, using UTF-8.

Server decoding method:

String name = new String(request.getParameter("name").getBytes("ISO-8859-1"),"UTF-8");

Calling request.setCharacterEncoding("UTF-8") does not work here. In a GET request the parameters are placed after the URL and URL-encoded, and the web container decodes the URL before calling the servlet. The default decoding used there is ISO-8859-1 in older containers (for example Tomcat 7 and earlier); newer versions may default to UTF-8.
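
The GET case can be reproduced with the JDK's own URL encoder and decoder. This is a sketch under stated assumptions: the class QueryStringDemo and its methods are hypothetical, and containerDecode only imitates a container whose URL decoding defaults to ISO-8859-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the GET parameter round trip.
final class QueryStringDemo {
    // What the browser does: percent-encode the UTF-8 bytes of the value.
    static String browserEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Imitates a container that decodes the query string as ISO-8859-1.
    static String containerDecode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // ISO-8859-1 is always supported
        }
    }

    // The fix from the text: recover the raw bytes, then decode them as UTF-8.
    static String recover(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "\u4f60\u597d"; // the parameter value (U+4F60 U+597D)
        String encoded = browserEncode(name);
        System.out.println(encoded); // %E4%BD%A0%E5%A5%BD

        String garbled = containerDecode(encoded);
        System.out.println(recover(garbled).equals(name)); // true
    }
}
```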

2.2.3 Responding to the browser

The response character encoding is the byte encoding used when the response is sent to the client. The default is ISO-8859-1.

You can check the current value as follows:

response.getCharacterEncoding();

Set the encoding of the response stream:

response.setCharacterEncoding("UTF-8");

Set the encoding that the browser uses for encoding and decoding:

response.setContentType("text/html; charset=UTF-8");

JSP settings:

<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" language="java" %>

pageEncoding: sets the encoding in which the JSP file is stored.

charset in contentType: sets the encoding the browser uses for transmission, that is, for decoding when it parses the response and for encoding when it sends a request.

The encoding of the response stream must match the decoding used by the browser to avoid garbled characters.

2.2.4 HTTPClient Encoding

2.3 Struts encoding

Configure struts.xml as follows:

<constant name="struts.i18n.encoding" value="UTF-8"></constant>

2.4 Spring encoding control

The configuration in web.xml is as follows:

<filter>
    <filter-name>encodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

encoding: sets the encoding that the server uses for encoding and decoding.

forceEncoding: whether to force this encoding, that is, to apply it even when the request already specifies one.

3. Converting a string to bytes

String s = "s汉";
byte[] bytes1 = s.getBytes("ISO-8859-1"); // the Chinese character is lost (replaced by '?')
byte[] bytes2 = s.getBytes("GBK");
byte[] bytes3 = s.getBytes("UTF-8");

4. Converting bytes to a string

String s1 = new String(bytes1, "UTF-8"); // the character stays lost: "s?"
String s2 = new String(bytes2, "GBK");
String s3 = new String(bytes3, "UTF-8");
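
To see exactly what is lost, it helps to print the bytes that each conversion produces. The following is a minimal sketch that combines sections 3 and 4. The class LossyConversion and its helpers hexOf and roundTrip are hypothetical names, and the GBK charset is assumed to be available in the JDK.

```java
import java.nio.charset.Charset;

// Hypothetical helper class for inspecting string/byte conversions.
final class LossyConversion {
    // Encodes the text with the named charset and formats the bytes as hex.
    static String hexOf(String text, String charsetName) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(Charset.forName(charsetName))) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes and then decodes the text with the same charset.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String s = "s\u6c49"; // the letter s followed by one Chinese character
        System.out.println(hexOf(s, "ISO-8859-1")); // 73 3F  (3F is '?')
        System.out.println(hexOf(s, "GBK"));        // 73 BA BA
        System.out.println(hexOf(s, "UTF-8"));      // 73 E6 B1 89

        System.out.println(roundTrip(s, "ISO-8859-1")); // s?
        System.out.println(roundTrip(s, "GBK").equals(s));   // true
        System.out.println(roundTrip(s, "UTF-8").equals(s)); // true
    }
}
```

Once the character has been replaced by 0x3F, no later decoding step can bring it back, whichever charset is used.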
