Reposted from: http://janwer.iteye.com/blog/150226
First, the role of each encoding setting in JSP/Servlet
There are four ways to set an encoding in JSP/Servlet:
- pageEncoding="UTF-8" (JSP)
- contentType="text/html;charset=UTF-8" (JSP)
- request.setCharacterEncoding("UTF-8") (JSP, Servlet)
- response.setCharacterEncoding("UTF-8") (JSP, Servlet)
The first two can only be used in a JSP; the last two can be used in both JSPs and Servlets.
1. pageEncoding="UTF-8" sets the encoding used when the JSP is compiled into a Servlet
As is well known, a JSP is first compiled into a Servlet on the server. pageEncoding="UTF-8" tells the JSP compiler which encoding to use when it reads the JSP file and compiles it into a Servlet. When strings defined inside the JSP itself (written directly in the JSP, not data submitted by the browser) come out garbled, the cause is usually a wrong value for this parameter. For example, if your JSP file is saved in GBK but declares pageEncoding="UTF-8", the strings defined inside the JSP will be garbled.
This parameter has one more function: if the JSP does not specify contentType and does not call response.setCharacterEncoding, pageEncoding also determines the encoding used to re-encode the server response.
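The GBK/UTF-8 mismatch described above can be reproduced outside a container. The following is a minimal sketch, not the JSP compiler's actual code: the class and method names are hypothetical, and the string literal is written with Unicode escapes so the example does not depend on the encoding of the Java source file itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class PageEncodingMismatch {
    // Stand-in for the JSP compiler reading the file: decode the bytes on disk
    // with the charset declared in pageEncoding.
    static String readAs(byte[] fileBytes, Charset declaredPageEncoding) {
        return new String(fileBytes, declaredPageEncoding);
    }

    public static void main(String[] args) {
        String literal = "\u6c49\u5b57"; // the two characters of "汉字"
        Charset gbk = Charset.forName("GBK");

        // The editor saved the JSP file in GBK.
        byte[] savedAsGbk = literal.getBytes(gbk);

        // pageEncoding matches the file: the string survives.
        System.out.println(literal.equals(readAs(savedAsGbk, gbk)));
        // pageEncoding="UTF-8" on a GBK file: the string is garbled.
        System.out.println(literal.equals(readAs(savedAsGbk, StandardCharsets.UTF_8)));
    }
}
```

The first comparison prints true and the second prints false, because the GBK bytes are not valid UTF-8 and are replaced during decoding.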
2. contentType="text/html;charset=UTF-8" specifies the encoding used to re-encode the server response
If response.setCharacterEncoding is not called, this parameter determines the encoding used to re-encode the server response.
3. request.setCharacterEncoding("UTF-8") sets the encoding used to re-encode the client request
This method specifies the encoding used to re-encode (that is, decode) the data sent by the browser.
4. response.setCharacterEncoding("UTF-8") specifies the encoding used to re-encode the server response
The server uses this encoding to re-encode data before sending it to the browser.
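As a rough illustration of where the four settings go, a JSP might look like the sketch below. The parameter name "name" is hypothetical, and the sketch assumes the two method calls run before any request parameter is read and before the response has been committed. In practice you would normally not need all four at once.

```jsp
<%-- Settings 1 and 2: directives, available only in a JSP --%>
<%@ page pageEncoding="UTF-8" contentType="text/html;charset=UTF-8" %>
<%
    // Setting 3: encoding used to decode the data sent by the browser.
    request.setCharacterEncoding("UTF-8");

    // Setting 4: encoding used to encode the response sent to the browser.
    response.setCharacterEncoding("UTF-8");

    // "name" is a hypothetical form field used only for illustration.
    String name = request.getParameter("name");
%>
```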
Second, how the browser encodes the data it receives and sends
response.setCharacterEncoding("UTF-8") specifies the encoding used to re-encode the server response. The browser also uses this value to re-encode (decode) the data it receives. So whether you set response.setCharacterEncoding("UTF-8") or response.setCharacterEncoding("GBK") in the JSP, the browser displays Chinese correctly, provided the data sent to the browser is itself correctly encoded (for example, pageEncoding is set correctly). You can try an experiment: set response.setCharacterEncoding("UTF-8") in a JSP, open the page in IE, and choose "View (V)" -> "Encoding (D)" from the IE menu; "Unicode (UTF-8)" is selected. Set response.setCharacterEncoding("GBK") instead, and the same menu shows "Simplified Chinese (GB2312)".
When the browser sends data, the URL and its parameters are URL-encoded, and for Chinese text in the parameters the browser uses that same response.setCharacterEncoding value to do the URL encoding. Take Baidu and Google as examples: search for "汉字" on Baidu and it is encoded as "%ba%ba%d7%d6"; search for "汉字" on Google and it is encoded as "%e6%b1%89%e5%ad%97". This is because Baidu's response.setCharacterEncoding value is GBK, while Google's is UTF-8.
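The two encoded forms above can be reproduced with the JDK's own URL encoder. This is a small sketch that stands in for what the browser does; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class BrowserUrlEncoding {
    // URL-encode a parameter value the way a browser would for a page
    // whose browser encoding is the given charset name.
    static String urlEncode(String text, String browserEncoding) {
        try {
            return URLEncoder.encode(text, browserEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + browserEncoding, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u6c49\u5b57"; // the two characters of "汉字"
        System.out.println(urlEncode(text, "GBK"));   // prints %BA%BA%D7%D6
        System.out.println(urlEncode(text, "UTF-8")); // prints %E6%B1%89%E5%AD%97
    }
}
```

URLEncoder emits uppercase hex digits; percent-escapes are case-insensitive, so these are the same values as the lowercase forms quoted above.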
The encoding the browser uses to receive data from the server and to send data to the server is the same. By default it is the response.setCharacterEncoding value (or the contentType or pageEncoding value) of the JSP page. We call this the browser encoding. You can change the browser encoding in IE (via "View (V)" -> "Encoding (D)" in the IE menu), but doing so usually makes a page that was displayed correctly turn garbled. An interesting example: open the Google home page in IE and change the browser encoding to "Simplified Chinese (GB2312)". The Chinese on the page becomes garbled, but ignore that; enter "汉字" in the text box and submit, and the text is URL-encoded using the browser encoding.
Now that we know how the browser encodes data when it receives and sends it, let's look at how the server encodes data when it receives and sends it.
When sending data, the server encodes it using the first setting available in this order of precedence: response.setCharacterEncoding -> contentType -> pageEncoding.
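The order of precedence can be written down as a small lookup. This is only a sketch of the rule as stated above, not container code: the class and method names are hypothetical, null means "this setting was not specified", and falling back to ISO-8859-1 when nothing is set is an assumption of the sketch.

```java
class ResponseEncodingResolver {
    // Assumed fallback when none of the three settings is present.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Returns the first setting that is present, checked in the order
    // response.setCharacterEncoding -> contentType charset -> pageEncoding.
    static String resolve(String responseCharset, String contentTypeCharset, String pageEncoding) {
        if (responseCharset != null) {
            return responseCharset;
        }
        if (contentTypeCharset != null) {
            return contentTypeCharset;
        }
        if (pageEncoding != null) {
            return pageEncoding;
        }
        return DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        System.out.println(resolve("GBK", "UTF-8", "UTF-8")); // prints GBK
        System.out.println(resolve(null, null, "UTF-8"));     // prints UTF-8
    }
}
```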
Receiving data has three cases. One is data the browser submits directly in the URL; the other two are data submitted by a form using the GET and POST methods.
Because different web servers handle these three cases differently, we use Tomcat 5.0 as the example.
Whichever way the data is submitted, if a parameter contains Chinese, the browser URL-encodes it using the current browser encoding.
For data submitted by a form with POST, it is enough to set request.setCharacterEncoding correctly in the JSP that receives the data, that is, to set the encoding of the client request to the browser encoding; the parameters obtained are then decoded correctly. You may ask how to find out the browser encoding. As mentioned above, by default the browser encoding is the value set with response.setCharacterEncoding in the JSP page that produced the response, which is the page containing the form. So for data submitted by a POST form, set request.setCharacterEncoding in the JSP that receives the data to the same value as response.setCharacterEncoding in the JSP that generates the form.
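The POST rule above amounts to a round trip: the browser encodes with the form page's encoding, and the receiving page must decode with the same one. Here is a minimal sketch using the JDK's URL encoder and decoder as stand-ins for the browser and the container; all names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormPostRoundTrip {
    // Browser side: encode a form field with the browser encoding, i.e. the
    // response encoding of the page that generated the form.
    static String browserSubmit(String value, String browserEncoding) {
        try {
            return URLEncoder.encode(value, browserEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Server side: decode the field with the value passed to
    // request.setCharacterEncoding in the receiving page.
    static String serverRead(String encodedValue, String requestEncoding) {
        try {
            return URLDecoder.decode(encodedValue, requestEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String value = "\u6c49\u5b57"; // the two characters of "汉字"
        String body = browserSubmit(value, "GBK"); // the form page responded in GBK

        System.out.println(value.equals(serverRead(body, "GBK")));   // same encoding: true
        System.out.println(value.equals(serverRead(body, "UTF-8"))); // different encoding: false
    }
}
```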
For data submitted in the URL and data submitted by a form with GET, setting request.setCharacterEncoding in the JSP that receives the data has no effect, because Tomcat 5.0 by default uses ISO-8859-1 to re-encode (decode) such data and does not use this parameter.
To solve this, set the useBodyEncodingForURI or URIEncoding attribute on the Connector element in Tomcat's configuration file. useBodyEncodingForURI indicates whether the request.setCharacterEncoding value is used to re-encode data submitted in the URL and by GET forms; it defaults to false (in Tomcat 4.0 this parameter defaults to true). URIEncoding specifies a single encoding used to re-encode (decode) all GET requests, covering both data submitted in the URL and data submitted by GET forms. The difference between the two is that URIEncoding applies one encoding to the data of every GET request, while useBodyEncodingForURI re-encodes (decodes) the data according to the request.setCharacterEncoding value of the page that handles the request, so different pages can use different encodings.
So for data submitted in the URL and data submitted by GET forms, you can either set URIEncoding to the browser encoding, or set useBodyEncodingForURI to true and set request.setCharacterEncoding in the JSP page to the browser encoding.
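The effect of the two attributes on a GET parameter can be modeled as follows. This is a simplified sketch, not Tomcat's implementation: the class and method names are hypothetical, and the order in which the two attributes are checked when both are set is an assumption of the sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class GetParameterDecoding {
    // Simplified model of which encoding decodes the query string.
    // Assumption: useBodyEncodingForURI is consulted first, then URIEncoding,
    // and ISO-8859-1 is used when neither applies.
    static String effectiveEncoding(boolean useBodyEncodingForURI,
                                    String requestEncoding,
                                    String uriEncoding) {
        if (useBodyEncodingForURI && requestEncoding != null) {
            return requestEncoding;
        }
        if (uriEncoding != null) {
            return uriEncoding;
        }
        return "ISO-8859-1";
    }

    static String decode(String rawValue, String encoding) {
        try {
            return URLDecoder.decode(rawValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "%E6%B1%89%E5%AD%97"; // "汉字" URL-encoded by a UTF-8 page
        String expected = "\u6c49\u5b57";

        // Default configuration: request.setCharacterEncoding("UTF-8") is ignored.
        String byDefault = decode(raw, effectiveEncoding(false, "UTF-8", null));
        System.out.println(expected.equals(byDefault)); // false: six garbled characters

        // useBodyEncodingForURI="true": the request encoding is honored.
        String withBody = decode(raw, effectiveEncoding(true, "UTF-8", null));
        System.out.println(expected.equals(withBody)); // true

        // URIEncoding="UTF-8": one encoding for every GET request.
        String withUri = decode(raw, effectiveEncoding(false, null, "UTF-8"));
        System.out.println(expected.equals(withUri)); // true
    }
}
```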
To summarize, here is how to prevent garbled Chinese when Tomcat 5.0 is the web server:
- Use a single encoding throughout an application. UTF-8 is recommended, though GBK also works.
- Set the JSP's pageEncoding parameter correctly.
- Set contentType="text/html;charset=UTF-8" or response.setCharacterEncoding("UTF-8") in every JSP/Servlet. This indirectly sets the browser encoding.
- For requests, use a filter or call request.setCharacterEncoding("UTF-8") in every JSP/Servlet. Also change Tomcat's default configuration: setting useBodyEncodingForURI to true is recommended; alternatively, set URIEncoding to UTF-8 (this may affect other applications, so it is not recommended).
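For the Tomcat change in the last item, the Connector element in the configuration file might look roughly like the fragment below. The port value is only illustrative, and the other attributes of your existing Connector are assumed to stay as they are.

```xml
<!-- Recommended above: decode GET data with the value of request.setCharacterEncoding -->
<Connector port="8080" useBodyEncodingForURI="true" />

<!-- Alternative: one fixed encoding for every GET request on this Connector -->
<!-- <Connector port="8080" URIEncoding="UTF-8" /> -->
```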
JSP/Servlet encoding principles