A complete solution to garbled Chinese characters in JSP

Source: Internet
Author: User

In brief:

For a form with method="get":

<%
String name = request.getParameter("name");
// Recover the raw bytes, then decode them as UTF-8.
String output = new String(name.getBytes("ISO-8859-1"), "UTF-8");
%>
<%=output%>

For a form with method="post":

<%
// Must be called before the first getParameter call.
request.setCharacterEncoding("UTF-8");
String name = request.getParameter("name");
%>
<%=name%>
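The GET workaround relies on ISO-8859-1 mapping every byte to exactly one character, so the original bytes can be recovered from the garbled string. The following is a minimal sketch outside any servlet container; the class and method names are hypothetical, and the "container" is only simulated by decoding UTF-8 bytes as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

public class GetParameterRoundTrip {
    // Simulates a container that decodes UTF-8 request bytes as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Same repair as the JSP snippet: recover the raw bytes, then decode as UTF-8.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u6c49\u5b57"; // two Chinese characters, written as escapes
        String garbled = misdecode(original);
        System.out.println(garbled.equals(original));         // prints false
        System.out.println(repair(garbled).equals(original)); // prints true
    }
}
```

The repair only works when the wrong decoding was ISO-8859-1; a lossy wrong decoding cannot be undone this way.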

Here is how it works:

First, the role of each encoding setting in JSP/Servlet.
In JSP/Servlet there are four places where an encoding can be set: pageEncoding="UTF-8", contentType="text/html;charset=UTF-8", request.setCharacterEncoding("UTF-8"), and response.setCharacterEncoding("UTF-8"). The first two can only be used in a JSP; the latter two can be used in both JSPs and servlets.

1. pageEncoding="UTF-8" sets the encoding used when the JSP is compiled into a servlet.

A JSP is first compiled into a servlet on the server. pageEncoding="UTF-8" tells the JSP compiler which encoding to use when reading the JSP file. When strings defined directly in the JSP (as opposed to data submitted from the browser) come out garbled, the cause is often a wrong value for this attribute. For example, if the JSP file is saved in GBK but declares pageEncoding="UTF-8", the string literals in the JSP will be garbled.
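The mismatch described above can be reproduced with plain byte arrays. This is only a sketch: the class and method names are hypothetical, the JSP compiler is modeled as a single decode step, and it assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageEncodingMismatch {
    // Decodes file bytes the way a compiler would for a declared page encoding.
    static String readAs(byte[] fileBytes, Charset declaredEncoding) {
        return new String(fileBytes, declaredEncoding);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        String literal = "\u6c49\u5b57";            // string literal in the page
        byte[] savedAsGbk = literal.getBytes(gbk);  // the file is stored in GBK

        // Declared encoding matches the stored encoding: the literal survives.
        System.out.println(readAs(savedAsGbk, gbk).equals(literal));                    // prints true
        // Declared encoding is UTF-8 but the bytes are GBK: the literal is garbled.
        System.out.println(readAs(savedAsGbk, StandardCharsets.UTF_8).equals(literal)); // prints false
    }
}
```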

In addition, if the JSP neither specifies contentType nor calls response.setCharacterEncoding, pageEncoding is also used as the encoding of the server response.

2. contentType="text/html;charset=UTF-8" specifies the encoding of the server response.

When response.setCharacterEncoding is not called, this attribute determines the encoding the server uses to encode data before sending it to the browser.

3. request.setCharacterEncoding("UTF-8") sets the encoding used to decode the client request.

This method specifies the encoding with which the data sent by the browser is decoded.

4. response.setCharacterEncoding("UTF-8") specifies the encoding of the server response.

The server uses this encoding to encode data before sending it to the browser.

Second, how browsers encode the data they receive and send.

response.setCharacterEncoding("UTF-8") specifies the encoding of the server response. The browser also uses this value to decode the data it receives. So whether you set response.setCharacterEncoding("UTF-8") or response.setCharacterEncoding("GBK") in the JSP, the browser can display Chinese correctly (provided the data sent to the browser is encoded correctly, for example with pageEncoding set correctly). Readers can try an experiment: set response.setCharacterEncoding("UTF-8") in the JSP, open the page in IE, and choose "View (V)" → "Encoding (D)" from the IE menu; the selected encoding is "Unicode (UTF-8)". Set response.setCharacterEncoding("GBK") instead, and the same menu shows "Simplified Chinese (GB2312)".

When the browser sends data, it encodes the URL and its parameters. For Chinese in the parameters, the browser again uses the response.setCharacterEncoding value of the page. Take Baidu and Google as examples: if you search for "汉字" ("Chinese characters") on Baidu, the browser encodes it as "%ba%ba%d7%d6"; on Google it is encoded as "%e6%b1%89%e5%ad%97". This is because Baidu's response encoding is GBK, while Google's is UTF-8.
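The two percent-encoded forms quoted above can be reproduced with the standard java.net.URLEncoder, which emits uppercase hex digits. This is a sketch, not what either search engine runs; the class and helper names are hypothetical, and it assumes the JDK in use ships the GBK charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryEncodingDemo {
    // Percent-encodes text using the named charset, as a browser would for a query parameter.
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String hanzi = "\u6c49\u5b57"; // the two characters of the search term
        System.out.println(encode(hanzi, "GBK"));   // prints %BA%BA%D7%D6
        System.out.println(encode(hanzi, "UTF-8")); // prints %E6%B1%89%E5%AD%97
    }
}
```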

The browser uses the same encoding to receive data from the server and to send data to it. By default this is the response.setCharacterEncoding value (or the contentType or pageEncoding value) of the JSP page; we will call it the browser encoding. You can change the browser encoding in IE (choose "View (V)" → "Encoding (D)" from the menu), but doing so usually turns a correctly displayed page into garbled text. An interesting example: open Google's homepage in IE and change the browser encoding to "Simplified Chinese (GB2312)". The Chinese on the page becomes garbled; ignore that, type "汉字" in the text box, and submit. The query is now encoded as "%BA%BA%D7%D6", which shows that the browser encodes Chinese in the URL using the browser encoding.

Now that we know how the browser encodes data when receiving and sending, let's look at how the server does it.

For sending data, the server picks the encoding in this order of precedence: response.setCharacterEncoding, then contentType, then pageEncoding.
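The order of precedence can be written down as a small lookup. This is only a model of the rule stated above, not container code: the class, method, and parameter names are hypothetical, null stands for "not set", and the ISO-8859-1 fallback is an assumption based on the servlet default.

```java
public class ResponseEncodingPrecedence {
    // Returns the first setting that is present, in the order given in the text.
    static String effectiveEncoding(String responseCharacterEncoding,
                                    String contentTypeCharset,
                                    String pageEncoding) {
        if (responseCharacterEncoding != null) {
            return responseCharacterEncoding;
        }
        if (contentTypeCharset != null) {
            return contentTypeCharset;
        }
        if (pageEncoding != null) {
            return pageEncoding;
        }
        return "ISO-8859-1"; // assumed fallback when nothing is set
    }

    public static void main(String[] args) {
        System.out.println(effectiveEncoding("GBK", "UTF-8", "UTF-8")); // prints GBK
        System.out.println(effectiveEncoding(null, null, "UTF-8"));     // prints UTF-8
    }
}
```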

For receiving data, there are three cases: data the browser submits directly in the URL, data submitted by a form with the GET method, and data submitted by a form with the POST method.

Because different web servers handle these three cases differently, we take Tomcat 5.0 as the example.

In all three cases, if a parameter contains Chinese, the browser encodes it using the current browser encoding.

For data submitted by a form with POST, it is enough to call request.setCharacterEncoding correctly in the JSP that receives the data, that is, to set the request encoding to the browser encoding; the parameters are then decoded correctly. Some readers may ask how to find out the browser encoding. As mentioned above, by default the browser encoding is the value set with response.setCharacterEncoding in the JSP page that produced the form. So, in the JSP page that receives POST form data, set request.setCharacterEncoding to the same value as the response.setCharacterEncoding of the JSP page that generated the form.
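Why the request encoding must match the encoding of the page that produced the form can be seen by decoding one percent-encoded value with two charsets. This is a sketch using java.net.URLDecoder as a stand-in for the container's parameter parsing; the class and helper names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormBodyDecoding {
    // Decodes one percent-encoded form value using the given request encoding.
    static String decodeValue(String encodedValue, String requestEncoding) {
        try {
            return URLDecoder.decode(encodedValue, requestEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + requestEncoding, e);
        }
    }

    public static void main(String[] args) {
        String sent = "%E6%B1%89%E5%AD%97"; // value posted from a form page served as UTF-8
        String expected = "\u6c49\u5b57";

        // Request encoding matches the form page: the value is recovered.
        System.out.println(decodeValue(sent, "UTF-8").equals(expected));      // prints true
        // Request encoding left at ISO-8859-1: the value is garbled.
        System.out.println(decodeValue(sent, "ISO-8859-1").equals(expected)); // prints false
    }
}
```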

For data submitted in the URL and data submitted by a form with GET, setting request.setCharacterEncoding in the receiving JSP has no effect, because Tomcat 5.0 decodes such data as ISO-8859-1 by default instead of using that setting. To resolve this, set the useBodyEncodingForURI or URIEncoding attribute on the Connector element in Tomcat's configuration file.

useBodyEncodingForURI indicates whether data submitted in the URL is decoded with the request.setCharacterEncoding value; it defaults to false (Tomcat 4.0 behaved as if it were true). URIEncoding specifies a single encoding used to decode all GET requests, including data submitted in the URL and data submitted by GET forms. The difference is that URIEncoding applies one encoding to all GET request data, whereas useBodyEncodingForURI decodes according to the request.setCharacterEncoding value of the requested page, so different pages can use different encodings.

So for URL data and GET form data, either set URIEncoding to the browser encoding, or set useBodyEncodingForURI to true and set request.setCharacterEncoding to the browser encoding in the JSP page that receives the data.
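As a rough illustration, the two alternatives might look like this on the Connector element in conf/server.xml. The port value is a placeholder, other attributes are omitted, and only one of the two variants would be used for a given connector.

```xml
<!-- Variant 1: decode query strings with the encoding set by request.setCharacterEncoding -->
<Connector port="8080" useBodyEncodingForURI="true" />

<!-- Variant 2: decode every query string on this connector as UTF-8 -->
<Connector port="8080" URIEncoding="UTF-8" />
```

Tomcat reads server.xml at startup, so the server has to be restarted for either change to take effect.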

To summarize, here is how to prevent garbled Chinese with Tomcat as the web server.

1. Use one encoding throughout an application. UTF-8 is recommended, though GBK also works.

2. Set the pageEncoding attribute of each JSP correctly.

3. In every JSP/servlet, set contentType="text/html;charset=UTF-8" or call response.setCharacterEncoding("UTF-8"); this indirectly sets the browser encoding.

4. For requests, use a filter or call request.setCharacterEncoding("UTF-8") in each JSP/servlet. Also change Tomcat's default configuration: setting useBodyEncodingForURI to true is recommended; alternatively, set URIEncoding to UTF-8 (this may affect other applications, so it is not recommended).
