A Study of Garbled Chinese Characters in JSP

Source: Internet
Author: User

The Java core and its class files are Unicode-based, which gives Java programs good cross-platform behavior but also causes some trouble with garbled Chinese characters. There are two main sources: garbling introduced when Java and JSP files are compiled, and garbling introduced when a Java program exchanges data with other media.

First, Java (including JSP) source files may contain Chinese characters, and these source files are stored as byte streams. When Java and JSP sources are compiled into class files, garbled characters can appear if the encoding the compiler assumes differs from the encoding the source file was saved in. For JSP, adding a page directive that declares the encoding (pageEncoding or contentType) to the file header basically solves this type of garbling. This article focuses on what happens when a JSP page is deployed to Tomcat.
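The mismatch described above can be reproduced in a few lines without a compiler or a server. The sketch below is only an illustration: the class and method names are made up for this example, and it assumes the JDK in use ships the GBK charset (standard desktop and server JDKs do). The Chinese text is written with Unicode escapes so that the example does not depend on the encoding of its own source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: saves text with one charset and reads it back with another.
class EncodingMismatchDemo {

    /** Encodes text as it would be stored on disk, then decodes the bytes with readAs. */
    static String reinterpret(String text, Charset savedAs, Charset readAs) {
        byte[] onDisk = text.getBytes(savedAs);
        return new String(onDisk, readAs);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters of the word "Chinese"
        Charset utf8 = StandardCharsets.UTF_8;
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK

        // Same charset on both sides: the text survives.
        System.out.println(original.equals(reinterpret(original, utf8, utf8))); // true

        // Saved as UTF-8 but read as GBK: the text is garbled.
        System.out.println(original.equals(reinterpret(original, utf8, gbk))); // false
    }
}
```

The bytes on disk are identical in both calls; only the charset used to interpret them changes, which is why declaring the right encoding in the file header is enough to fix this class of problem.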

First, open the IDE; here I use Eclipse. Create a web project named test, and then create a JSP page as follows:

 
<%@ page language="java" pageEncoding="GBK" %>
<%@ page contentType="text/html; charset=iso8859-1" %>
<html>
<body>JSP Chinese garbled characters</body>
</html>

Note that the JSP file itself is also stored in some encoding when it is saved. Eclipse and other IDEs have a dedicated setting for file encoding, which is not covered here. In this setup, JSP files are saved in UTF-8 by default.

Then deploy the project to Tomcat and access http://localhost:8080/test/index.jsp. The page (index.jsp) shows garbled characters.

When a JSP is deployed and run in Tomcat, it passes through three stages, and encoding matters in each: the first stage uses pageEncoding, the second goes from UTF-8 to UTF-8, and the third is the web page sent out by Tomcat, which uses contentType.

In the first stage, the JSP is translated into a .java file. Tomcat reads the JSP according to the pageEncoding setting and translates it from the specified encoding into Java source code (.java) that is uniformly UTF-8.

In the second stage, javac compiles the Java source code into a .class file. No matter what encoding the JSP was written in, the input to this stage is always UTF-8 encoded Java source code.

javac reads the Java source code as UTF-8 and compiles it into binary code (.class) whose string constants are also stored in UTF-8 (strictly, a modified form of UTF-8), as the JVM specification requires for constant strings in class files.

In the third stage, Tomcat (or another application container) loads and executes the Java binary code produced in stage 2. The output is sent to the client, and this is where the contentType parameter takes effect.
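The string format used inside class files can be observed from plain Java, because DataOutputStream.writeUTF writes the same modified UTF-8 that the class-file format uses for string constants. The sketch below is an illustration under that assumption; the class and method names are invented for this example. It shows that a Chinese character is stored with the same three bytes as in standard UTF-8, while the NUL character is one of the few cases where the two formats differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class comparing modified UTF-8 with standard UTF-8.
class ModifiedUtf8Demo {

    /** Returns the modified UTF-8 bytes of s, without the 2-byte length prefix that writeUTF adds. */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        byte[] all = buffer.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d";
        byte[] standard = chinese.getBytes(StandardCharsets.UTF_8);

        // A BMP character such as this one is encoded identically in both formats.
        System.out.println(Arrays.equals(standard, modifiedUtf8(chinese))); // true

        // NUL takes one byte in standard UTF-8 but two bytes in modified UTF-8.
        System.out.println("\u0000".getBytes(StandardCharsets.UTF_8).length); // 1
        System.out.println(modifiedUtf8("\u0000").length);                    // 2
    }
}
```

For ordinary Chinese text the difference does not matter, which is why this stage can be treated as a plain UTF-8 to UTF-8 step.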
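The three stages can be modeled in memory as three charset conversions. The following is a simplified sketch rather than a description of Tomcat's internals: the class, the method names, and the idea of a browser that decodes exactly as the response charset says are all assumptions made for illustration, and GBK is assumed to be available in the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical model of the three stages as pure charset conversions.
class JspEncodingPipeline {

    /** Stage 1: the container decodes the JSP file's bytes using the page encoding. */
    static String translate(byte[] jspBytes, Charset pageEncoding) {
        return new String(jspBytes, pageEncoding);
    }

    /** Stage 2: the generated source is written and read back as UTF-8. */
    static String compile(String javaSource) {
        byte[] sourceFile = javaSource.getBytes(StandardCharsets.UTF_8);
        return new String(sourceFile, StandardCharsets.UTF_8);
    }

    /** Stage 3: the response text is encoded with the charset named in contentType. */
    static byte[] respond(String text, Charset responseCharset) {
        return text.getBytes(responseCharset);
    }

    /** Runs all stages and decodes the result as a browser honoring the response charset would. */
    static String render(String saved, Charset fileEncoding, Charset pageEncoding, Charset responseCharset) {
        byte[] jspBytes = saved.getBytes(fileEncoding);
        String text = compile(translate(jspBytes, pageEncoding));
        return new String(respond(text, responseCharset), responseCharset);
    }

    public static void main(String[] args) {
        String page = "\u4e2d\u6587";
        Charset utf8 = StandardCharsets.UTF_8;
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK

        // File encoding, pageEncoding and response charset all agree: the text survives.
        System.out.println(page.equals(render(page, utf8, utf8, utf8))); // true

        // The settings used in this article: saved as UTF-8, pageEncoding GBK, response iso8859-1.
        System.out.println(page.equals(render(page, utf8, gbk, StandardCharsets.ISO_8859_1))); // false
    }
}
```

In this model stage 2 never changes the text; damage can only enter at stage 1 (wrong pageEncoding for the saved file) or stage 3 (a response charset that cannot represent the characters).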

Now let's take a closer look at the three phases above:

In the first stage, the JSP is translated into a .java file: the source JSP page is read into memory, and the servlet source file is then written out in some encoding. Which encoding is used for reading? If pageEncoding="XX" is set on the page, the server reads the JSP page as XX. If it is not set, the server checks whether contentType="text/html; charset=XXX" is set and, if so, reads the page with the encoding named in charset. If neither is set, the default ISO8859-1 is used. Once the JSP page has been read, it is held in memory as Unicode and is then written out as a UTF-8 encoded .java file.

We can verify this with the following code:
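The lookup order just described (pageEncoding first, then the charset in contentType, then ISO8859-1) can be written down as a small function. This is a deliberately naive sketch based on regular expressions, not the way Tomcat actually parses directives; the class name, method name, and patterns are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mimics the fallback order for choosing the read encoding.
class PageEncodingResolver {

    private static final Pattern PAGE_ENCODING =
            Pattern.compile("pageEncoding\\s*=\\s*\"([^\"]+)\"");

    private static final Pattern CONTENT_TYPE_CHARSET =
            Pattern.compile("contentType\\s*=\\s*\"[^\"]*charset\\s*=\\s*([^\";\\s]+)");

    /** Returns the name of the encoding that would be used to read the given JSP text. */
    static String resolve(String jspText) {
        Matcher m = PAGE_ENCODING.matcher(jspText);
        if (m.find()) {
            return m.group(1); // 1. an explicit pageEncoding wins
        }
        m = CONTENT_TYPE_CHARSET.matcher(jspText);
        if (m.find()) {
            return m.group(1); // 2. otherwise the charset from contentType
        }
        return "ISO-8859-1";   // 3. otherwise the default
    }

    public static void main(String[] args) {
        System.out.println(resolve(
                "<%@ page pageEncoding=\"GBK\" contentType=\"text/html; charset=iso8859-1\" %>")); // GBK
        System.out.println(resolve(
                "<%@ page contentType=\"text/html; charset=UTF-8\" %>")); // UTF-8
        System.out.println(resolve("<html></html>")); // ISO-8859-1
    }
}
```

Applied to the example page in this article, the function returns GBK, which is why a file saved as UTF-8 is misread in the first stage.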

import java.io.*;

public class Stage1Check {
    public static void main(String[] args) throws IOException {
        // Location of the JSP after it is deployed to Tomcat
        String filename = "D:/program files/development/Apache Software Foundation/tomcat 6.0/webapps/servlettest/index.jsp";
        try (PrintWriter pw = new PrintWriter(new OutputStreamWriter(new FileOutputStream("D:/index.java"), "UTF-8"));
             BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(filename), "GBK"))) {
            // The JSP file is stored in UTF-8, so reading it as GBK produces garbled text
            String readContent;
            while ((readContent = br.readLine()) != null) {
                pw.write(readContent + "\r\n");
                pw.flush();
            }
        }
    }
}

Run the code above to generate an index.java file, then compare it with the real .java file, located under %TOMCAT_HOME%\work\Catalina\localhost\test\org\apache\jsp. Opening both, you will find the same garbled Chinese characters. This should make the general process of the first stage and its encoding conversion clear.

The second stage needs little introduction: it compiles the Java source code into a .class file, going from UTF-8 to UTF-8, so the encoding does not change in the conversion.

In the third stage, <%@ page contentType="text/html; charset=iso8859-1" %> comes into play. If we look at the servlet code generated in the second stage, we find the statement response.setContentType("text/html; charset=iso8859-1"); this is what the <%@ page contentType="text/html; charset=iso8859-1" %> directive is compiled into. When the user's browser requests the servlet corresponding to the JSP, the web container starts a thread to execute the servlet, and the data is returned to the client browser encoded as iso8859-1. This process is similar to the following code:

import java.io.*;

public class Stage3Check {
    public static void main(String[] args) throws IOException {
        // File generated in the first stage
        String filename = "D:/index.java";
        try (PrintWriter pw = new PrintWriter(new OutputStreamWriter(new FileOutputStream("D:/index1.java"), "iso8859-1"));
             BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(filename), "UTF-8"))) {
            String readContent;
            while ((readContent = br.readLine()) != null) {
                pw.write(readContent + "\r\n");
                pw.flush();
            }
        }
    }
}
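What happens to Chinese text when it is written out as iso8859-1 can also be checked without any files. The sketch below uses only standard JDK calls; the class and method names are invented for this example. It relies on the documented behavior of String.getBytes(Charset), which replaces characters the charset cannot represent with the charset's replacement byte, a question mark in the case of ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing what ISO-8859-1 does to characters it cannot represent.
class Latin1LossDemo {

    /** Reports whether every character of text exists in ISO-8859-1. */
    static boolean representable(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    /** Encodes text as ISO-8859-1 and decodes it again, as a browser honoring the charset would. */
    static String roundTrip(String text) {
        byte[] wire = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";
        System.out.println(representable("JSP"));   // true
        System.out.println(representable(chinese)); // false
        System.out.println(roundTrip(chinese));     // ??
    }
}
```

Unlike the stage-1 mismatch, this loss cannot be undone on the client: the original bytes are gone once the question marks have been written.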

That is the whole path a JSP page takes before it is displayed. This is only my personal understanding, drawn from online materials; if anything is wrong, please point it out. Thank you!
