JSP Chinese Encoding Problems (Part 1)

Source: Internet
Author: User

Summary:

This article first describes how a JSP source file is executed: it passes through three stages and is transcoded twice before its output is complete. Encoding issues run through the whole process. In JSP/Servlet there are four main ways to set an encoding: pageEncoding, contentType, request.setCharacterEncoding, and response.setCharacterEncoding. This article introduces and summarizes all four.

I. JSP execution process and overview of encoding settings

In JSP/Servlet there are four main ways to set an encoding. The first two apply only to JSPs; the last two can be used in both JSPs and Servlets: pageEncoding="UTF-8"; contentType="text/html;charset=UTF-8"; request.setCharacterEncoding("UTF-8"); response.setCharacterEncoding("UTF-8").

pageEncoding is the encoding of the JSP file itself; the charset in contentType is the encoding the server uses for the content it sends to the client.

A JSP is transcoded twice along the way: the first stage uses pageEncoding, the second goes from UTF-8 to UTF-8, and the third, in which Tomcat sends out the page, uses contentType.

In more detail, a JSP source file passes through three stages and is transcoded twice before its output is complete. The three stages are:

  Stage one: translation (.jsp -> .java; pageEncoding -> UTF-8). The container translates the JSP into a servlet source file (.java); the setting that applies is pageEncoding. Following the pageEncoding="xxx" directive, the server reads the JSP file as "xxx"-encoded text and translates it from that encoding into uniformly UTF-8 encoded Java source (the .java file).

  Stage two: compilation (.java -> .class; UTF-8 -> UTF-8). The servlet source (.java) is compiled into a Java bytecode file (.class). Whatever encoding the JSP was written in, the Java source that reaches this stage is UTF-8. The compiler reads that source as UTF-8 and produces a .class file whose string constants are stored in UTF-8 form, which is how the JVM specification represents constant strings in binary form. This step is fixed by the JVM's internal specification and is not subject to external control.

  Stage three: loading and execution (UTF-8 -> contentType). Tomcat (or another container) loads and executes the class file produced in stage two and sends the result from the server to the browser; the setting that applies is contentType. The output the client sees is encoded with the charset given in the contentType attribute: the text held in UTF-8 form is written out in that charset. If nothing is set explicitly, the default is ISO-8859-1.
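The three stages can be imitated outside a container with plain Java strings and byte arrays. The sketch below is only an analogy and not container code: the class name, `simulate`, and its parameter names are hypothetical, and the browser is modeled as a decode that uses the same charset the response declared.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class JspEncodingPipeline {

    // Imitates the three stages and returns the text a browser would show
    // if it decodes the response with responseCharset.
    static String simulate(byte[] jspFileBytes, Charset pageEncoding, Charset responseCharset) {
        // Stage one (translation): the .jsp bytes are read using pageEncoding.
        String templateText = new String(jspFileBytes, pageEncoding);

        // Stage two (compilation): UTF-8 source to .class. Modeled here as a
        // UTF-8 round trip, which leaves the text unchanged.
        String compiledText = new String(
                templateText.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);

        // Stage three (output): the text is encoded with the response charset.
        byte[] responseBytes = compiledText.getBytes(responseCharset);

        // Browser side (assumption): decode with the declared charset.
        return new String(responseBytes, responseCharset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters, written as escapes
        byte[] fileOnDisk = original.getBytes(StandardCharsets.UTF_8);

        String shown = simulate(fileOnDisk, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(shown.equals(original)); // true
    }
}
```

When the file's real encoding, pageEncoding, and the response charset all agree on an encoding that can represent Chinese, the text survives every stage unchanged.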

Notably, the default value of pageEncoding is "ISO-8859-1", and the default value of contentType is "text/html; charset=ISO-8859-1".

Note: I find the transcoding in the first and third stages easier to understand by analogy with String transcoding, for example: new String(name.getBytes("ISO-8859-1"), "UTF-8").

II. pageEncoding="UTF-8"

  pageEncoding="UTF-8" sets the encoding used to read the JSP when it is translated into a servlet. When strings written directly in the JSP (as opposed to data submitted from the browser) come out garbled, the cause is often a wrong value for this setting. For example, if the JSP file contains Chinese characters and the JSP specifies pageEncoding="ISO-8859-1", the Chinese characters will display incorrectly. Look at the following example:

<%@ page language="java" pageEncoding="ISO-8859-1" import="java.util.*"%>


     
     

After it has been compiled into a servlet, its source code (fragment) looks like this:

public void _jspService(HttpServletRequest request, HttpServletResponse response)
        throws java.io.IOException, ServletException {

      //...

      out.write("
     
     

Visit this page, and the page appears as follows:
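The garbling can be reproduced without a container. This is a minimal sketch, assuming the JSP file was saved as UTF-8 (the source does not say how it was saved) and is then read as ISO-8859-1; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class WrongPageEncodingDemo {

    // Decodes UTF-8 file bytes as ISO-8859-1, as a wrong pageEncoding would.
    static String readWithWrongEncoding(String textSavedAsUtf8) {
        byte[] fileBytes = textSavedAsUtf8.getBytes(StandardCharsets.UTF_8);
        return new String(fileBytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        String garbled = readWithWrongEncoding(original);

        System.out.println(garbled.length());                 // 6: each UTF-8 byte became one char
        System.out.println(garbled.equals(original));         // false
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

Because ISO-8859-1 maps every byte to exactly one character, the wrong decode loses no information, which is why the new String(name.getBytes("ISO-8859-1"), "UTF-8") idiom can undo it.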

               

As we can see, because pageEncoding is set to "ISO-8859-1", the server reads the JSP as ISO-8859-1 when translating it into a .java file. In the resulting UTF-8 Java source all the Chinese characters are already garbled, so the response shown to the user is garbled too. In addition, pageEncoding also determines the encoding of the server's response when the JSP specifies no charset in contentType and response.setCharacterEncoding is not called.

III. contentType="text/html;charset=UTF-8"

  contentType="text/html;charset=UTF-8" tells the server to send the text, which stage two produced in UTF-8 form, to the client in the encoding named by charset. If it is set incorrectly, the output is garbled. Look at the following example:

<%@ page language="java" contentType="text/html; charset=ISO-8859-1" import="java.util.*"
    pageEncoding="UTF-8"%>


     
     

After it has been compiled into a servlet, its source code (fragment) looks like this:

public void _jspService(HttpServletRequest request, HttpServletResponse response)
        throws java.io.IOException, ServletException {

      //...

      out.write("
     
     

Visit this page, and the page appears as follows:
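What happens to Chinese text under an ISO-8859-1 response charset can be sketched in plain Java. The sketch assumes the container's writer substitutes unmappable characters the same way `String.getBytes(Charset)` does; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class WrongResponseCharsetDemo {

    // Encodes text the way a response declared as ISO-8859-1 would be written.
    static byte[] encodeAsLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String pageText = "\u4e2d\u6587"; // two Chinese characters

        // ISO-8859-1 has no mapping for these characters.
        System.out.println(StandardCharsets.ISO_8859_1.newEncoder().canEncode(pageText)); // false

        // Each unmappable character is replaced by '?' (byte 0x3F).
        byte[] sent = encodeAsLatin1(pageText);
        String shown = new String(sent, StandardCharsets.ISO_8859_1);
        System.out.println(shown); // ??
    }
}
```

Unlike a wrong decode, this substitution discards the original characters, so the text cannot be recovered on the client side.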

IV. request.setCharacterEncoding("UTF-8")

  request.setCharacterEncoding("UTF-8") specifies the character set with which data sent by the browser is decoded; it is most often used to decode POST request parameters. See the section on POST request parameters containing Chinese in the follow-up post, "JSP Chinese Encoding Problems (Part 2)".

V. response.setCharacterEncoding("UTF-8")

  response.setCharacterEncoding("UTF-8") makes the server encode the response with the specified character set before returning it to the browser. Once it is called, any charset given in the page's contentType no longer takes effect; that is, response.setCharacterEncoding() overrides the charset in contentType. Look at the following example:

<%@ page language="java" contentType="text/html; charset=ISO-8859-1" import="java.util.*"
    pageEncoding="UTF-8"%>


     
     

After it has been compiled into a servlet, its source code (fragment) looks like this:

public void _jspService(HttpServletRequest request, HttpServletResponse response)
        throws java.io.IOException, ServletException {
      response.setCharacterEncoding("UTF-8");
      //...

      out.write("
     
     

Visit this page, and the page appears as follows:
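The override can be modeled as a simple lookup order. The sketch below is an illustration of the behavior described in this article, not a container's actual implementation: `resolve` and its parameters are hypothetical, and a null argument stands for "not specified".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResponseCharsetPriority {

    // Picks the response charset: an explicit setCharacterEncoding value first,
    // then the charset in contentType, then pageEncoding, then ISO-8859-1.
    static Charset resolve(String setCharacterEncoding, String contentTypeCharset, String pageEncoding) {
        if (setCharacterEncoding != null) {
            return Charset.forName(setCharacterEncoding);
        }
        if (contentTypeCharset != null) {
            return Charset.forName(contentTypeCharset);
        }
        if (pageEncoding != null) {
            return Charset.forName(pageEncoding);
        }
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        String pageText = "\u4e2d\u6587"; // two Chinese characters

        // Same settings as the example page, plus the explicit call in the servlet.
        Charset chosen = resolve("UTF-8", "ISO-8859-1", "UTF-8");
        System.out.println(chosen.name()); // UTF-8

        byte[] sent = pageText.getBytes(chosen);
        System.out.println(new String(sent, chosen).equals(pageText)); // true
    }
}
```

With the explicit call present, the ISO-8859-1 charset in contentType is never consulted, so the Chinese text is encoded as UTF-8 and survives.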

VI. Interaction and priority of the four encoding settings

Based on the above, we reach the following three conclusions:

When determining the encoding used to translate a JSP into a servlet, the priority is: pageEncoding="UTF-8" > contentType="text/html;charset=UTF-8"

When determining the encoding the server uses for the response content, the priority is: response.setCharacterEncoding("UTF-8") > contentType="text/html;charset=UTF-8" > pageEncoding="UTF-8"

request.setCharacterEncoding("UTF-8") is used only to specify how request data sent by the browser is decoded.
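The third point can be illustrated with `java.net.URLDecoder`, used here as a stand-in for a container's parameter parsing, which this article does not show. The sketch assumes the browser submitted the form value as UTF-8; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class RequestDecodingDemo {

    // Decodes one URL-encoded form value with the given charset name.
    static String decodeParam(String rawValue, String charsetName)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(rawValue, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String typed = "\u4e2d\u6587"; // what the user entered

        // What travels in the POST body when the form is submitted as UTF-8.
        String onTheWire = URLEncoder.encode(typed, "UTF-8");
        System.out.println(onTheWire); // %E4%B8%AD%E6%96%87

        // Decoding with the matching charset restores the text.
        System.out.println(decodeParam(onTheWire, "UTF-8").equals(typed));      // true

        // Decoding with ISO-8859-1 produces garbled text.
        System.out.println(decodeParam(onTheWire, "ISO-8859-1").equals(typed)); // false
    }
}
```

The request setting therefore only has to match whatever encoding the browser used when submitting the data; it plays no part in how the response is encoded.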

This article is from: The Ultimate Solution to JSP Chinese Garbled-Text Problems (Part 1)
