A Web application often needs to pass parameters to the server, typically by submitting a form as a POST request. These parameters may contain Chinese text, for example user registration details or the delivery address of a shopping order. Parameter strings are generally encoded in a local character set, such as GB2312 or GBK for Chinese, or ISO-8859-1 for English and Western European text. Java programs, however, process strings as Unicode, so an encoding conversion is required. Unfortunately, most existing Java application servers were developed in English-speaking countries, without an application environment for large character sets (Chinese, Japanese, Korean, and so on). As a result, these servers have problems handling Chinese in HTTP request parameters, and this is one of the most troublesome issues for JSP and servlet developers.
The root cause of this problem is that the HTTP request does not carry enough information to indicate the character set used by the client. In a JSP page, we can use the following page directive to declare the character set of the output page:
<%@ page contentType="text/html; charset=gb2312" %>
The JSP engine converts this directive into a header of the HTTP response:
Content-Type: text/html; charset=gb2312
The output is then a Chinese page encoded in GB2312, and the browser displays the Chinese text correctly. However, the browser does not include a charset when it posts the form content back to the server, and the Chinese content is encoded in the form %xx (where xx is a hexadecimal number). For example, the GB2312 internal code of the Chinese character "中" is 0xD6D0, which becomes %D6%D0 in the HTTP request. According to RFC 2616, if no character set is specified in the HTTP request, ISO-8859-1 is assumed, so "中" is treated as two characters, '\u00d6' and '\u00d0'. When these are sent back to the client they become two characters that cannot be displayed, which browsers generally show as '??'.
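The byte-level effect described above can be reproduced with the JDK alone. The following is a minimal sketch, not server code: the class name `FormEncodingDemo` and its two helper methods are hypothetical, and it assumes Java 10 or later (for the `Charset` overloads of `URLEncoder` and `URLDecoder`) and a JDK that ships the GB2312 charset.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; assumes Java 10+ and an available GB2312 charset.
final class FormEncodingDemo {
    static final Charset GB2312 = Charset.forName("GB2312");

    // Imitates what a browser sends for a form field on a GB2312 page.
    static String browserEncode(String value) {
        return URLEncoder.encode(value, GB2312);
    }

    // Imitates a server that falls back to ISO-8859-1 when no charset is given.
    static String serverDecodeDefault(String wire) {
        return URLDecoder.decode(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String wire = browserEncode("\u4e2d");      // the character "中"
        System.out.println(wire);                   // prints %D6%D0
        String garbled = serverDecodeDefault(wire); // two Latin-1 characters
        System.out.println(garbled.length());       // prints 2
    }
}
```

The single character goes out as two bytes and comes back as two unrelated characters, which is exactly the mismatch the server has to repair.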
The traditional way to solve this problem is to write additional code that performs the character set conversion:
String strOut = new String(strIn.getBytes("8859_1"), "GB2312");
Here strIn is the unconverted string, and strOut is the converted string, obtained by reinterpreting the original bytes as GB2312.
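The one-line conversion above can be wrapped in a small helper. This is only a sketch under stated assumptions: `CharsetRepair` and `recode` are hypothetical names, and the trick works only because ISO-8859-1 maps every byte to exactly one character, so the original bytes can be recovered without loss.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the manual conversion.
final class CharsetRepair {
    // Recovers the raw bytes from an ISO-8859-1 mis-decoded string,
    // then decodes them again with the charset the client really used.
    static String recode(String misdecoded, String actualCharset) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName(actualCharset));
    }

    public static void main(String[] args) {
        String strIn = "\u00d6\u00d0";           // what the server saw
        String strOut = recode(strIn, "GB2312"); // the intended text
        System.out.println(strOut.equals("\u4e2d")); // prints true
    }
}
```

Every parameter that may contain Chinese text has to pass through such a call, which is why this approach becomes cumbersome in larger applications.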
Apusic 0.9.5 implements the Java Servlet 2.3 specification, which adds a new method, setCharacterEncoding(String enc), to the ServletRequest interface. With it, the charset information missing from the HTTP request can be supplied, and the cumbersome conversion is done automatically inside the servlet engine, which can also optimize the conversion to improve runtime efficiency. A simple example makes the comparison concrete.
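In a servlet, the call would typically be `request.setCharacterEncoding("GB2312")`, made before the first `getParameter` call. Since the Servlet API is not part of the JDK, the sketch below only imitates the effect with standard classes: it decodes the %xx bytes directly with the declared charset instead of defaulting to ISO-8859-1. The class `FormParser` and its `parse` method are hypothetical and are not how any particular servlet engine is implemented; Java 10 or later is assumed.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the parameter parsing a servlet engine performs.
final class FormParser {
    // Parses an application/x-www-form-urlencoded body using the given charset,
    // which plays the role of the value passed to setCharacterEncoding.
    static Map<String, String> parse(String body, String encoding) {
        Charset cs = Charset.forName(encoding);
        Map<String, String> params = new LinkedHashMap<>();
        if (body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, cs), URLDecoder.decode(value, cs));
        }
        return params;
    }

    public static void main(String[] args) {
        String body = "name=%D6%D0&city=Beijing";
        // Declared charset: the value is decoded correctly in one step.
        System.out.println(parse(body, "GB2312").get("name").equals("\u4e2d"));     // prints true
        // Default charset: the same bytes turn into two Latin-1 characters.
        System.out.println(parse(body, "ISO-8859-1").get("name").length());         // prints 2
    }
}
```

Compared with the manual conversion, the charset is stated once per request and no per-parameter recoding is needed.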