Java Web garbled text: analysis and solutions

Source: Internet
Author: User

1. What is URL encoding?

URL encoding is the format the browser uses to package form input. The browser collects every name and its corresponding value from the form and sends them to the server as name/value pairs, either as part of the URL or separately in the request body.

2. URL encoding rules.

Name/value pairs are separated from each other by &, and within each pair the name and the value are separated by =. If the user does not enter a value, the name still appears, but with an empty value.

A character is URL-encoded as % followed by the hexadecimal value of its ASCII code. For example, \ (whose hexadecimal code is 5C) is URL-encoded as %5C.
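The two rules above can be tried out with the JDK's own java.net.URLEncoder. This is only a minimal sketch, assuming Java 10 or later (for the Charset overload); the class name UrlEncodingRules and the helper pair are made up for illustration. Note that URLEncoder follows the form-encoding variant of the rules, so a space becomes + rather than %20.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class UrlEncodingRules {
    // Percent-encode one name or value; non-ASCII characters are encoded as UTF-8 bytes.
    static String encode(String text) {
        return URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: build one name=value pair the way a form submission does.
    static String pair(String name, String value) {
        return encode(name) + "=" + encode(value);
    }

    public static void main(String[] args) {
        // A backslash has ASCII code 0x5C, so it is sent as %5C.
        System.out.println(encode("\\")); // %5C

        // Pairs are joined with &; an empty value still sends the name.
        System.out.println(pair("user", "a\\b") + "&" + pair("note", "")); // user=a%5Cb&note=
    }
}
```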

3. A brief description of garbled text and HTTP requests

Garbled text comes up often in web development. With the encoding basics above in place, we can look at how it arises.

1) Garbled text is frequently encountered during web development. The main cause is that non-ASCII characters in the URL are decoded incorrectly by the server-side program.

2) Within the URL, garbled text is most likely to appear in the query string parameter values and in the servlet path.

3) The flow of an HTTP request, in brief:

Step one: the browser encodes the URL and sends it to the server;

Step two: the server decodes the request, processes it, and sends the encoded response content back to the client browser;

Step three: the browser displays the web page using the specified encoding.

POST request

A detailed look at how POST submissions are encoded, how the server decodes them, and how to fix garbled text.

For POST, the name/value pairs in the form are sent to the server in the request body. The browser encodes the form data according to the page's ContentType ("text/html; charset=GBK") and then sends it to the server.

In the server-side program, we can call request.setCharacterEncoding() to set the encoding before reading any parameters, and then obtain the correct data with request.getParameter().

Garbled text here can be solved directly with request.setCharacterEncoding().
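To make the effect of the chosen encoding concrete, the sketch below parses a form body by hand with plain JDK classes. It is not the servlet container's implementation, only an illustration of what changes when the body is decoded with a different charset. The class name PostBodySketch, the method parse, and the sample body are assumptions, and UTF-8 is used in place of the GBK page from the example above because every JDK is guaranteed to ship it. Java 10 or later is assumed.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class PostBodySketch {
    // Parse an application/x-www-form-urlencoded body, decoding %XY bytes with the given charset.
    static Map<String, String> parse(String body, Charset charset) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, charset), URLDecoder.decode(value, charset));
        }
        return params;
    }

    public static void main(String[] args) {
        // Hypothetical body sent from a UTF-8 page; the value is two Chinese characters.
        String body = "city=%E4%B8%AD%E6%96%87&note=";

        // Decoding with the charset the browser used restores the original text.
        System.out.println(parse(body, StandardCharsets.UTF_8).get("city").equals("\u4E2D\u6587")); // true

        // Decoding with ISO-8859-1 turns each of the six bytes into a separate character.
        System.out.println(parse(body, StandardCharsets.ISO_8859_1).get("city").length()); // 6
    }
}
```

In a real servlet, the charset argument corresponds to whatever request.setCharacterEncoding() was given.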

GET request

For GET, the request data is appended to the URL as parameters, so garbled text appears easily, because the parameter names and values are likely to contain non-ASCII characters.

Once the URL has been assembled, the browser encodes it and sends it to the server. See the URL encoding rules above for the details.

This encoding step is where problems tend to arise. First, the characters that need URL encoding are generally non-ASCII characters, so garbled text is mainly caused by Chinese or special characters appended to the URL. Second, we need to know which character encoding is used to encode those characters. This is in fact determined by the browser: different browsers, and different settings of the same browser, affect the URL encoding. To avoid an encoding we do not want, we can control it uniformly through Java code or JavaScript code.
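The dependence on the chosen character encoding is easy to see in a small sketch. The example below, which assumes Java 10 or later, encodes the same character under two charsets and then shows one possible way to "control it uniformly": always encoding explicitly as UTF-8 when building a URL in Java. The class name EncodingChoice and the helper utf8Param are hypothetical, and ISO-8859-1 stands in for "some other browser setting" only because it is guaranteed to exist in every JDK.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingChoice {
    // Percent-encode text using the bytes that the given charset produces.
    static String encode(String text, Charset charset) {
        return URLEncoder.encode(text, charset);
    }

    // Hypothetical helper: always use UTF-8, regardless of any browser or platform default.
    static String utf8Param(String name, String value) {
        return encode(name, StandardCharsets.UTF_8) + "=" + encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // the letter e with an acute accent

        // Same character, different charsets, different %XY sequences.
        System.out.println(encode(eAcute, StandardCharsets.UTF_8));      // %C3%A9
        System.out.println(encode(eAcute, StandardCharsets.ISO_8859_1)); // %E9

        // Two Chinese characters, encoded explicitly as UTF-8.
        System.out.println(utf8Param("q", "\u4E2D\u6587")); // q=%E4%B8%AD%E6%96%87
    }
}
```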

After URL encoding is complete, the URL consists only of characters in the ASCII range. It is then converted to bytes with the ISO-8859-1 encoding and sent along with the request headers.

When the request reaches the server, the server first decodes it with ISO-8859-1. What the server gets is the request header characters in the ASCII range, including the request URL with its parameter data. If the parameters contain Chinese or special characters, they arrive as the encoded %XY sequences (XY being the hexadecimal digits from the encoding rules), and request.setCharacterEncoding() has no effect on them. At this point we can see the root cause of the garbled text: the client generally encodes the data with UTF-8 or GBK, but the server decodes it with ISO-8859-1, which obviously cannot work.
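The mismatch described above can be reproduced outside a container. The following is a rough sketch, assuming Java 10 or later and assuming the browser chose UTF-8; the class name GetMismatch and the method names clientEncode and serverDecode are invented for this illustration.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class GetMismatch {
    // Client side: the browser percent-encodes the value as UTF-8 (an assumption).
    static String clientEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Server side: the %XY sequences are decoded as ISO-8859-1.
    static String serverDecode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u4E2D\u6587";          // two Chinese characters
        String onTheWire = clientEncode(original); // pure ASCII
        String received = serverDecode(onTheWire);

        System.out.println(onTheWire); // %E4%B8%AD%E6%96%87

        // Each of the six UTF-8 bytes became its own Latin-1 character.
        System.out.println(original.length() + " -> " + received.length()); // 2 -> 6
    }
}
```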

There are two ways to solve this problem.

Typically, requests reach the web container first (Tomcat in the examples below), and the URL is decoded by the container. For Tomcat, a URL decoding parameter can be added to the Connector tag in conf/server.xml; by default the container decodes the URL with ISO-8859-1.

    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />

The above is Tomcat's default configuration. You can add the URIEncoding attribute to the tag to specify the URL decoding scheme, for example URIEncoding="UTF-8". (PS: the attribute is spelled URI, not URL.)

If you do not want to hard-code the decoding scheme this way, you can specify another attribute instead: useBodyEncodingForURI. It tells the web container that, if the request specifies an encoding, the URL should be decoded using the encoding set by request.setCharacterEncoding().

The second approach has not been tested here; try it if necessary. For more information, refer to the official Tomcat documentation:

http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q2

In addition, if you do not want to modify the container's global configuration (after all, the container may host more than just our application), the parameters can also be extracted manually as follows:

    String path = req.getServletPath(); // manual extraction; does not fit well with frameworks
    path = new String(path.getBytes("ISO-8859-1"), "UTF-8"); // turn the text back into bytes and decode them as UTF-8


With the approach above, we must be sure that the web container decodes the URL with ISO-8859-1, because we cannot rule out that someone has modified the container configuration, or that the container's own configuration is unusual.
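Both the manual fix and its caveat can be checked without a container. The sketch below is only an illustration under stated assumptions: the client sent UTF-8, and the class name RecoverSketch and the method recover are hypothetical. The first call shows the repair working when the text really was decoded as ISO-8859-1; the second shows what happens when the container had already decoded it correctly.

```java
import java.nio.charset.StandardCharsets;

class RecoverSketch {
    // Undo a wrong ISO-8859-1 decode: get the original bytes back, then decode them as UTF-8.
    static String recover(String decodedAsLatin1) {
        byte[] rawBytes = decodedAsLatin1.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What the program sees after UTF-8 bytes E4 B8 AD E6 96 87 were decoded as ISO-8859-1.
        String garbled = "\u00E4\u00B8\u00AD\u00E6\u0096\u0087";
        System.out.println(recover(garbled).equals("\u4E2D\u6587")); // true

        // If the text was already decoded as UTF-8, the same step destroys it:
        // neither character exists in ISO-8859-1, so each one is replaced by '?'.
        System.out.println(recover("\u4E2D\u6587")); // ??
    }
}
```

The repair is lossless in the first case because ISO-8859-1 maps every byte value to exactly one character, so the original bytes can always be recovered from the garbled string.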
