JSP page encoding
Page character encoding is the encoding in which the JSP file or tag file itself is stored.
- If <page-encoding> is configured in the <jsp-config> element of web.xml, the pageEncoding attribute of the page directive on the matching pages must name the same encoding as that <page-encoding> element; otherwise a translation error occurs. In other words, the <page-encoding> configuration and the pageEncoding attribute are effectively equivalent.
- If a page has neither the pageEncoding attribute nor a <page-encoding> configuration, but its page directive has a contentType attribute, the charset in contentType is used. If contentType has no charset either, ISO-8859-1 is used by default.
- If pageEncoding or <page-encoding> is present, it takes precedence over the charset in the contentType attribute.
- The exception is a file that starts with a byte order mark (BOM): the BOM is then treated as equivalent to <page-encoding>. If the encoding indicated by the BOM differs from the charset in pageEncoding or contentType, a translation error results.
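As an illustration of what "discovering a BOM" means at the byte level, the following is a minimal Java sketch. The class name BomSniffer and its method are hypothetical and not part of any JSP container; the sketch only recognizes the UTF-8 and UTF-16 marks and ignores UTF-32.

```java
import java.util.Optional;

/** Hypothetical helper: maps a leading byte order mark to a charset name. */
final class BomSniffer {
    private BomSniffer() {}

    /** Returns the charset implied by a BOM at the start of head, if one is present. */
    static Optional<String> detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return Optional.of("UTF-8");
        if (startsWith(head, 0xFE, 0xFF)) return Optional.of("UTF-16BE");
        if (startsWith(head, 0xFF, 0xFE)) return Optional.of("UTF-16LE");
        return Optional.empty(); // no BOM: fall back to the declared encodings
    }

    /** Compares the first bytes of data (as unsigned values) with the given prefix. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

A caller would pass the first few bytes of the JSP file and compare the result with the declared pageEncoding before deciding whether the two conflict.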
Summary: when determining the page encoding, the BOM has the highest priority, followed by pageEncoding and <page-encoding>, then the charset in the contentType of the page directive, and finally ISO-8859-1.
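The precedence described in this summary can be sketched as a small Java function. This is only a model of the rules as stated here, not container code; the class name PageEncodingResolver, its parameters, and the use of IllegalStateException to stand in for a translation error are all assumptions made for illustration.

```java
/** Hypothetical model of the page-encoding precedence rules described above. */
final class PageEncodingResolver {
    static final String DEFAULT = "ISO-8859-1";

    private PageEncodingResolver() {}

    /**
     * @param bomCharset         charset implied by a BOM, or null if the file has none
     * @param declared           value of pageEncoding or <page-encoding>, or null
     * @param contentTypeCharset charset taken from the contentType attribute, or null
     */
    static String resolve(String bomCharset, String declared, String contentTypeCharset) {
        if (bomCharset != null) {
            // Per the text, a BOM that disagrees with a declared charset is an error.
            requireSame(bomCharset, declared);
            requireSame(bomCharset, contentTypeCharset);
            return bomCharset;
        }
        if (declared != null) return declared;                 // pageEncoding / <page-encoding>
        if (contentTypeCharset != null) return contentTypeCharset;
        return DEFAULT;                                        // nothing declared anywhere
    }

    private static void requireSame(String bomCharset, String other) {
        if (other != null && !other.equalsIgnoreCase(bomCharset)) {
            throw new IllegalStateException(
                "BOM indicates " + bomCharset + " but the page declares " + other);
        }
    }
}
```

For example, resolve(null, "GBK", "UTF-8") yields "GBK", because the declared page encoding outranks the contentType charset.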
Response encoding
The encoding of the response is ultimately determined by the characterEncoding property of the ServletResponse object.
The contentType attribute of the page directive sets the character encoding of the ServletResponse. If contentType does not specify a charset, there are two cases:
- For documents in XML syntax, the default is UTF-8;
- For files in standard JSP syntax, it depends on the BOM, pageEncoding, or <page-encoding>.
The encoding of the response is determined only by the requested page, not by the page included with the include directive.
Summary: when determining the response encoding, the charset in the contentType of the page directive has the highest priority, followed by the BOM, then pageEncoding and <page-encoding>, and finally ISO-8859-1.
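A rough Java sketch of this order for pages in standard JSP syntax follows. The class ResponseEncodingResolver and its method names are hypothetical, and the contentType parsing is deliberately simple: it assumes a value shaped like "text/html; charset=UTF-8".

```java
import java.util.Locale;

/** Hypothetical model of the response-encoding order summarized above. */
final class ResponseEncodingResolver {
    private ResponseEncodingResolver() {}

    /** Extracts the charset parameter from a contentType value, or returns null if absent. */
    static String charsetOf(String contentType) {
        if (contentType == null) return null;
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** contentType charset first, then the BOM, then the declared page encoding, then the default. */
    static String resolve(String contentType, String bomCharset, String declaredPageEncoding) {
        String fromContentType = charsetOf(contentType);
        if (fromContentType != null) return fromContentType;
        if (bomCharset != null) return bomCharset;
        if (declaredPageEncoding != null) return declaredPageEncoding;
        return "ISO-8859-1";
    }
}
```

Note that the order differs from the page-encoding case: here the contentType charset comes first rather than third.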
In addition, the response encoding is affected by three ServletResponse methods: setCharacterEncoding, setContentType, and setLocale. setLocale has the lowest priority.
If the MIME type in the contentType attribute is not specified, it defaults to text/html.
Summary: if only the contentType of the page directive is set, it is easy to keep the page encoding and the response encoding consistent.
The difference between the contentType of the page directive and the Content-Type of the <meta> element
In the response, the contentType of the page directive sets the HTTP response header Content-Type. The Content-Type in the <meta> element is just a piece of text in the page body, and the browser consults it only if the HTTP response headers carry no Content-Type. In other words, the browser gives the Content-Type response header priority over the Content-Type declared in the <meta> element.
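The browser-side priority can be modeled roughly as below. This is not how any particular browser is implemented; the class BrowserCharsetChooser is hypothetical, and the sketch additionally assumes that the <meta> value is consulted when the header is present but names no charset, which goes slightly beyond what the text states.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical model: the HTTP header charset wins over the <meta> charset. */
final class BrowserCharsetChooser {
    // Matches charset=VALUE, with optional whitespace and an optional opening quote.
    private static final Pattern CHARSET =
        Pattern.compile("charset\\s*=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    private BrowserCharsetChooser() {}

    /** Returns the charset named in a Content-Type value, or null if there is none. */
    static String charsetIn(String contentType) {
        if (contentType == null) return null;
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    /** Header first, then the <meta> content; null if neither names a charset. */
    static String choose(String httpHeaderContentType, String metaContentType) {
        String fromHeader = charsetIn(httpHeaderContentType);
        return fromHeader != null ? fromHeader : charsetIn(metaContentType);
    }
}
```

With a header of "text/html; charset=UTF-8" and a <meta> content of "text/html; charset=GBK", choose returns "UTF-8", so a wrong <meta> declaration has no effect as long as the header is set.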
GET/POST request encoding
So far only the page encoding and the response encoding have been covered. What determines the encoding of request data?
Different browsers may use different character encodings for URLs; Chinese-language browsers generally use GBK. To stay consistent, many websites URL-encode the Chinese and special characters in their URLs with JavaScript.
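To show why the charset used for URL encoding matters, here is a small Java sketch using java.net.URLEncoder and URLDecoder (the Charset overloads require Java 10 or later). The class UrlEncodingDemo is hypothetical. URLEncoder produces application/x-www-form-urlencoded output, so it is an analogue of the JavaScript approach mentioned above rather than an exact equivalent.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: the same text percent-encodes differently per charset. */
final class UrlEncodingDemo {
    private UrlEncodingDemo() {}

    /** Percent-encodes text using the bytes of the given charset. */
    static String encode(String text, Charset charset) {
        return URLEncoder.encode(text, charset);
    }

    /** Decodes a percent-encoded value, interpreting the bytes in the given charset. */
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters of the word "Chinese"
        String encoded = encode(text, StandardCharsets.UTF_8);
        System.out.println(encoded);
        // Decoding with the matching charset restores the text; any other charset does not.
        System.out.println(decode(encoded, StandardCharsets.UTF_8).equals(text));
        System.out.println(decode(encoded, StandardCharsets.ISO_8859_1).equals(text));
    }
}
```

A browser that encodes the same characters as GBK would send different bytes, which is why fixing one encoding on the client side keeps the server side predictable.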
In testing, the encoding of GET and POST request data was affected only by the Content-Type of the HTML <meta> element of the page currently shown in the browser.
Reading GET/POST request data:
- Data passed to the servlet via POST is read correctly only if the correct character encoding has been set on the ServletRequest.
- Data passed to the servlet via GET is read correctly only if the URIEncoding attribute of the Connector element in the Tomcat configuration file server.xml matches the encoding of the data; otherwise the value must be transcoded.
Why GET and POST differ: every request is processed by the server first, and the server must read the URL before it knows which servlet to hand the request to. The URLs of both POST and GET requests are subject to the server's URIEncoding setting, but once the URL of a POST request has been read, its data is not in the URL, so setting the character encoding on the ServletRequest is enough to read it correctly.
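The transcoding mentioned for GET data can be sketched in plain Java. The class ParameterTranscoder is hypothetical; the example assumes the server decoded the bytes as ISO-8859-1 (the URIEncoding default in older Tomcat versions) while the browser actually sent UTF-8. The round trip is lossless only because ISO-8859-1 maps every byte to exactly one character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: repairs a value that was decoded with the wrong charset. */
final class ParameterTranscoder {
    private ParameterTranscoder() {}

    /**
     * Turns the misread string back into its original bytes using the charset
     * the server wrongly applied, then decodes those bytes with the real charset.
     */
    static String transcode(String misread, Charset wronglyUsed, Charset actual) {
        return new String(misread.getBytes(wronglyUsed), actual);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";
        // Simulate the server decoding UTF-8 bytes as ISO-8859-1.
        String misread = new String(original.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        String repaired = transcode(misread, StandardCharsets.ISO_8859_1,
                                    StandardCharsets.UTF_8);
        System.out.println(repaired.equals(original));
    }
}
```

In a servlet the misread value would come from request.getParameter; setting URIEncoding to match the data avoids the need for this step altogether.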
JSP in XML syntax
For JSP pages written in XML syntax, both the page character encoding and the response character encoding are always UTF-8.
JSP page, response and request encoding full solution