Differences between contentType, charset, and pageEncoding

Source: Internet
Author: User

============================================================

The contentType attribute specifies the HTTP content type of the response. If contentType is not specified, the default value is text/html. Syntax: response.contentType [= contentType]. Parameter: contentType.

 

pageEncoding is the encoding of the JSP file itself.

The charset in contentType is the encoding of the content that the server sends to the client.

A JSP is encoded twice across three stages: the first stage uses pageEncoding, the second goes from UTF-8 to UTF-8, and the third produces the web page sent out by Tomcat, using contentType.

In the first stage the JSP is compiled into a .java file. The container reads the JSP according to the pageEncoding setting and translates it from that encoding into unified UTF-8 Java source code (the .java file). If pageEncoding is set incorrectly or not set at all, Chinese text comes out garbled.

In the second stage javac compiles the Java source code into Java bytecode. Whatever encoding the JSP was written in, the Java source code entering this stage is always UTF-8 encoded.

 

 

 

pageEncoding: sets the character encoding of the JSP source file and of the response body.
contentType: sets the character encoding and the MIME type of the JSP source file and of the response body.

So both pageEncoding and contentType can set the character encoding of the JSP source file and of the response body, but with different priorities:
For the character encoding of the JSP source file, the priority is pageEncoding > contentType. If neither is set, the default is ISO-8859-1.
For the character encoding of the response output, the priority is contentType > pageEncoding. If neither is set, the default is ISO-8859-1.
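
The two priority rules can be written down as a small lookup. The following is only a sketch in plain Java, not container code: the class JspEncodingRules and its methods are hypothetical names made up for illustration, and a null argument stands for "attribute not set".

```java
class JspEncodingRules {
    static final String DEFAULT = "ISO-8859-1";

    // Returns the charset parameter of a contentType value, or null if there is none.
    static String charsetOf(String contentType) {
        if (contentType == null) return null;
        for (String part : contentType.split(";")) {
            int eq = part.indexOf('=');
            if (eq < 0) continue; // e.g. the "text/html" part
            String name = part.substring(0, eq).trim();
            if (name.equalsIgnoreCase("charset")) {
                return part.substring(eq + 1).trim();
            }
        }
        return null;
    }

    // Encoding used to READ the JSP source: pageEncoding > contentType charset > default.
    static String sourceEncoding(String pageEncoding, String contentType) {
        if (pageEncoding != null) return pageEncoding;
        String cs = charsetOf(contentType);
        return cs != null ? cs : DEFAULT;
    }

    // Encoding used to WRITE the response: contentType charset > pageEncoding > default.
    static String responseEncoding(String pageEncoding, String contentType) {
        String cs = charsetOf(contentType);
        if (cs != null) return cs;
        return pageEncoding != null ? pageEncoding : DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(sourceEncoding("GBK", "text/html; charset=UTF-8"));   // GBK
        System.out.println(responseEncoding("GBK", "text/html; charset=UTF-8")); // UTF-8
        System.out.println(responseEncoding("GBK", "text/html"));                // GBK
        System.out.println(sourceEncoding(null, "text/html"));                   // ISO-8859-1
    }
}
```

With both attributes set to different values, each one wins in its own domain; with only one set, it is used for both.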

Put simply, pageEncoding is the encoding of the JSP file itself, and the charset in contentType is the encoding of the content that the server sends to the client. For example, pageEncoding="GBK" tells the JSP container that the JSP file is encoded in GBK. When the JSP is translated into a servlet, the container reads the JSP source file as GBK and translates it into unified UTF-8 Java source code. If nothing is set, the container reads the file as ISO-8859-1 by default. charset=GBK in contentType means that the page is output to the browser in GBK. Along the way, the JSP source goes through three stages and is encoded twice before the output is complete.

Stage 1: the JSP is translated into a servlet source file (.java). The attribute used is pageEncoding. Given pageEncoding="XXX", the server reads the JSP file using the encoding rules of "XXX" and translates it into unified UTF-8-encoded Java source code (the .java file).
Stage 2: the servlet source file (.java) is compiled into a Java bytecode file (.class), from UTF-8 to UTF-8. Whatever encoding the JSP was written in, the Java source code entering this stage is UTF-8 encoded. javac reads the Java source code as UTF-8 and compiles it into binary code (.class) in which constant strings are stored in a UTF-8-based form, as laid down by the JVM specification for representing constant strings in binary code. This process is fixed by the JVM's internal specification and cannot be controlled from outside.
Stage 3: the response goes from the server to the browser. The attribute used is contentType. The server loads and executes the Java binary code produced in stage 2, and its output is what the client sees. During output, the strings, which are Unicode in memory, are encoded into the charset given in the contentType attribute. If it is not set explicitly, ISO-8859-1 is used by default.
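
The decode-then-encode round trip of stages 1 and 3 can be imitated in plain Java. This is only a sketch: the class and method names are hypothetical, and a real container does far more than this. It assumes the JVM provides the GBK charset, which standard JDK builds normally do.

```java
import java.nio.charset.Charset;

class JspEncodingStages {
    // Stage 1: the container decodes the bytes of the JSP file using pageEncoding.
    static String readSource(byte[] fileBytes, String pageEncoding) {
        return new String(fileBytes, Charset.forName(pageEncoding));
    }

    // Stage 3: the container encodes the in-memory string using the response charset.
    static byte[] writeResponse(String text, String charset) {
        return text.getBytes(Charset.forName(charset));
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so that the encoding of
        // this source file itself does not matter.
        String original = "\u4e2d\u6587";
        byte[] jspFile = original.getBytes(Charset.forName("GBK")); // a JSP saved as GBK

        String ok = readSource(jspFile, "GBK");             // correct pageEncoding
        String garbled = readSource(jspFile, "ISO-8859-1"); // pageEncoding left at the default

        System.out.println(ok.equals(original));      // true
        System.out.println(garbled.equals(original)); // false

        // The same text produces different bytes depending on the response charset.
        System.out.println(writeResponse(ok, "GBK").length);   // 4
        System.out.println(writeResponse(ok, "UTF-8").length); // 6
    }
}
```

Once stage 1 has decoded the file with the wrong encoding, no contentType setting in stage 3 can repair the text, which is why a wrong or missing pageEncoding shows up as garbled Chinese.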

 

 

============================================================

 
"contentType" is a string that describes the content type. It is usually formatted as type/subtype, where the type is the general content category and the subtype is the specific content type.

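As a small illustration of the type/subtype format, here is a sketch that splits a content-type string into its two parts and ignores any parameters such as charset. The class ContentTypeParts is a hypothetical helper written for this example, not part of any JSP or servlet API.

```java
class ContentTypeParts {
    // Splits "type/subtype; param=value" into {type, subtype}; parameters are ignored.
    static String[] typeAndSubtype(String contentType) {
        String mediaType = contentType.split(";", 2)[0].trim();
        int slash = mediaType.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("not in type/subtype form: " + contentType);
        }
        return new String[] { mediaType.substring(0, slash), mediaType.substring(slash + 1) };
    }

    public static void main(String[] args) {
        String[] parts = typeAndSubtype("text/html; charset=UTF-8");
        System.out.println(parts[0]); // text
        System.out.println(parts[1]); // html
    }
}
```
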
In short, contentType is the type of the response that the server sends to the client. That much is easy to understand, but while reading about it on Baidu Encyclopedia I ran into a question: contentType has a charset attribute that specifies an encoding, and pageEncoding also specifies an encoding, so what is the difference between the two?

After reading up on it, I now understand it much better!


javac reads the Java source code as UTF-8 and compiles it into binary code (.class) whose constant strings are stored in a UTF-8-based form; this is how the JVM specification represents constant strings in binary code.
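
The UTF-8-based form used for string constants in .class files is what the Java platform calls modified UTF-8. DataOutputStream.writeUTF writes strings in that same format, so it can serve as a rough stand-in to look at the bytes. The sketch below uses a hypothetical class name and is not how javac itself is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ClassFileStringDemo {
    // Encodes a string as writeUTF does: a 2-byte length followed by the
    // characters in modified UTF-8.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return buf.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // U+4E2D, written as an escape so the source file encoding does not matter.
        System.out.println(hex(modifiedUtf8("\u4e2d"))); // 00 03 E4 B8 AD
    }
}
```

The bytes are the same whether the character originally came from a GBK or a UTF-8 JSP file, which is the sense in which this stage is independent of the encoding the JSP was written in.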

In the third stage, Tomcat (or another application container) loads and executes the Java binary code produced in stage 2, and the output is what the client sees. This is where the contentType parameter, which plays no visible part in stages 1 and 2, takes effect.

The contentType setting

The defaults of pageEncoding and contentType are both ISO-8859-1, and if you set only one of them, the other takes the same value (this is how Tomcat 4.1.27 behaves). This is not absolute, though: it depends on how the JSP compiler (jspc) handles it, and pageEncoding is not the same thing as contentType.

<%@ page contentType="text/html; charset=UTF-8" %>

I remember my teacher running into a problem with this directive in class. His solution was to change UTF-8 to GBK:

<%@ page contentType="text/html; charset=GBK" %>

This appears to rely on the rule that setting one of the two also sets the other. Strictly speaking, the proper way should be:

<%@ page contentType="text/html; charset=UTF-8" pageEncoding="GBK" %>

With this change, however, the Chinese text read by the server is no longer garbled, but the page opened on the client is: because charset=UTF-8 is specified, the output sent to the client is UTF-8 encoded. So the directive should instead be written in full as:

<%@ page contentType="text/html; charset=GBK" pageEncoding="GBK" %>

But writing it this way is not as simple as

<%@ page contentType="text/html; charset=GBK" %>

so it looks like I will use the simple form in the future!

This is purely my own self-taught understanding. If anything here is wrong, please point it out.

 

============================================================

Glossary and functions

1. contentType: <%@ page contentType="text/html; charset=UTF-8" %>

2. pageEncoding: <%@ page pageEncoding="UTF-8" %>

3. HTML page charset: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

4. setCharacterEncoding: request.setCharacterEncoding(), response.setCharacterEncoding()

5. setContentType: response.setContentType()

6. setHeader: response.setHeader()

7. JSP page encoding: the encoding of the JSP file itself

8. Web page display encoding: the encoding the browser uses to display the JSP output stream

9. Web page input encoding: the encoding of the text typed into input boxes

10. Web server request stream: the request data the browser sends to the web server

11. Web server response stream: the response data the web server sends to the browser

Their mutual influence, scope, and order of action

1. pageEncoding: indicates only the encoding of the JSP page itself and has nothing to do with the encoding in which the page is displayed.

When the container reads a file, a database, or a string constant, it converts the data to the Unicode it uses internally.

When the page content is displayed, the internal Unicode is converted to the encoding specified by contentType.

If the pageEncoding attribute is present, the character encoding of the JSP page is determined by pageEncoding; otherwise it is determined by the charset in the contentType attribute. If that charset is absent too, the character encoding of the JSP page defaults to ISO-8859-1.

2. contentType: specifies the MIME type and the character encoding of the JSP page's response. The default MIME type is "text/html" and the default character encoding is "ISO-8859-1". The MIME type and the character encoding are separated by a semicolon.

Relationship between pageencoding and contenttype:

1. The value of pageEncoding is used only when the container processes the JSP and is not sent as a header. It tells the web server the encoding of the JSP page; when contentType specifies no charset, it is also the encoding of the response stream output by the web server.

2. In the first stage the JSP is compiled into a .java file: the container reads the JSP according to the pageEncoding setting and translates it from that encoding into Java source code (.java).

3. In the second stage javac compiles the Java source code into Java bytecode. Whatever encoding the JSP was written in, the Java source code entering this stage is UTF-8 encoded. javac reads it as UTF-8 and compiles it into binary code (.class) whose constant strings are stored in a UTF-8-based form, which is how the JVM represents constant strings in binary code.

4. In the third stage Tomcat (or another application container) loads and executes the Java binary code from stage 2, and the output is what the client sees. This is where the contentType parameter, which plays no visible part in stages 1 and 2, takes effect.

The same setting as the charset in contentType can also be made through the HTML page charset, response.setCharacterEncoding(), response.setContentType(), and response.setHeader(). response.setContentType() and response.setHeader() have the highest priority, followed by response.setCharacterEncoding(), and then by <%@ page contentType="text/html; charset=GBK" %> and <meta http-equiv="Content-Type" content="text/html; charset=gb2312"/>.

5. Web page input encoding: setting the page encoding with <%@ page contentType="text/html; charset=GBK" %> also sets the encoding of the page's input. If the page is displayed as UTF-8, everything the user types into the page is encoded as UTF-8, and the server-side program must set the input encoding before reading the form.

When the form is submitted, the browser converts each form field value into the bytes of the specified character set and then encodes those bytes using the standard HTTP URL encoding scheme, but the page must tell the server which encoding it uses.

request.setCharacterEncoding() changes the encoding the servlet uses to read the request, and response.setCharacterEncoding() changes the encoding of the result the servlet returns.
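
The form round trip can be imitated with java.net.URLEncoder and java.net.URLDecoder. This is only a sketch: browserEncode and serverDecode are hypothetical names standing in for what the browser and the servlet container do, and the GBK charset is assumed to be available in the JVM.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormEncodingDemo {
    // What the browser does on submit: characters -> bytes in the page charset -> %XX escapes.
    static String browserEncode(String value, String pageCharset) {
        try {
            return URLEncoder.encode(value, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // What the server does when reading a parameter: %XX escapes -> bytes -> characters.
    static String serverDecode(String raw, String requestCharset) {
        try {
            return URLDecoder.decode(raw, requestCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "\u4e2d"; // one Chinese character, written as an escape

        String sent = browserEncode(typed, "UTF-8");
        System.out.println(sent);                                    // %E4%B8%AD
        System.out.println(browserEncode(typed, "GBK"));             // %D6%D0

        System.out.println(serverDecode(sent, "UTF-8").equals(typed)); // true
        System.out.println(serverDecode(sent, "GBK").equals(typed));   // false
    }
}
```

The request itself carries only the escaped bytes, so the server recovers the original text only when it decodes with the charset the page was submitted in. That is the reason for setting the request encoding before the first parameter is read.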

Or use the following description:

  • pageEncoding is the character encoding of the JSP page's source code. If its value is an encoding that cannot represent Chinese, such as ISO-8859-1, Chinese characters cannot be written in the JSP source code: Eclipse and similar tools report an error when saving. Changing it to GBK fixes this.
  • charset is the character encoding of the content the server returns for a request. Even with pageEncoding set to GBK, you may find after saving and running the program that the Chinese characters you just wrote are not displayed properly; changing charset to GBK as well fixes this.
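
Whether a given encoding can represent Chinese text at all can be checked with a CharsetEncoder. The sketch below uses a hypothetical class name and assumes the GBK charset is available in the JVM.

```java
import java.nio.charset.Charset;

class CanEncodeDemo {
    // True if every character of the text can be represented in the named charset.
    static boolean canRepresent(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // written as escapes so the source file encoding does not matter
        System.out.println(canRepresent(chinese, "ISO-8859-1")); // false
        System.out.println(canRepresent(chinese, "GBK"));        // true
        System.out.println(canRepresent(chinese, "UTF-8"));      // true
    }
}
```

An editor that refuses to save a file is making essentially this check against the encoding declared for the page.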


Note: for the character encoding of the JSP page's source code, pageEncoding is used if present; otherwise the charset value is used; if neither is present, ISO-8859-1 is used. The defaults of pageEncoding and contentType are both ISO-8859-1, and if you set only one of them, the other takes the same value (this is how Tomcat 4.1.27 behaves). This is not absolute, though: it depends on how each JSP container handles it.

For example, in Tomcat, if pageEncoding is set in a JSP, contentType is set to the same encoding as well, but in Resin it is not, and the default value is still used. You can see this by looking at the Java servlet file generated from the JSP, and this is exactly where the problem lies. So under Resin it is recommended to set both attributes explicitly in the JSP.

Summary: we usually put <%@ page contentType="text/html; charset=gb2312" %> on the JSP page.
