Solving garbled characters in parameters passed between JSP pages (1)

Source: Internet
Author: User

The computer was born in the United States. English is its mother tongue, and every other language is a foreign language to it. Like us, however well it has learned a foreign language, it never uses it as well as its mother tongue, and it often makes "spelling mistakes".

The root cause of garbled characters is that different encoding schemes are used for encoding and decoding. For example, if a GBK-encoded file is decoded as UTF-8, the result is certain to be gibberish. The central idea of every fix is therefore to use one encoding scheme consistently.
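The mismatch can be reproduced in a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it pairs UTF-8 with ISO-8859-1 rather than GBK with UTF-8, because those two charsets are guaranteed to exist on every Java platform. The effect is the same for any mismatched pair.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and sample text are illustrative only.
class MismatchedDecodingDemo {

    // Encode the text as UTF-8 bytes, then decode those bytes with the given charset.
    static String encodeUtf8ThenDecode(String text, Charset decodeWith) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Three Chinese characters written as Unicode escapes, so the demo
        // does not depend on the encoding of the source file itself.
        String original = "\u4e16\u754c\u676f";

        String same = encodeUtf8ThenDecode(original, StandardCharsets.UTF_8);
        String mixed = encodeUtf8ThenDecode(original, StandardCharsets.ISO_8859_1);

        System.out.println("same scheme round-trips:  " + same.equals(original));
        System.out.println("mixed scheme round-trips: " + mixed.equals(original));
        System.out.println("mixed scheme length:      " + mixed.length());
    }
}
```

With the same scheme on both sides the text survives. With different schemes each of the nine UTF-8 bytes is read as a separate character, so the three characters come back as nine unrelated ones.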

Parameters are passed between JSP pages in the following ways: 1. Form submission. 2. A URL followed directly by parameters (a hyperlink). 3. If two JSP pages are in two different windows and the windows have a parent-child relationship, the JSP in the child window can also use JavaScript and the DOM (window.opener.XXX.value) to read the value of an input element in the parent window's JSP. The following describes the garbled-character problem in the first two methods.

1. Passing parameters between pages by form submission

Before discussing parameters passed by a form, some background is needed: the form submission methods, and how Chinese characters are handled in the request message.

Form submission method:

A form is usually submitted with post or get. The difference is that post places the data in the body of the request, with no length limit, whereas get appends the data directly to the URL in the request line, where a length limit applies. The following are the request messages produced by the same page with each method.

requesttest.jsp code

 
 
    <%@ page language="java" contentType="text/html; charset=UTF-8"
        pageEncoding="UTF-8"%>
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Insert title here</title>
    </head>
    <body>
    <%-- Submit the form in post mode --%>
    <form action="http://localhost:8888/EncodingTest/requestresult.jsp" method="post">
    UserName: <input type="text" name="username"/>
    Password: <input type="password" name="password"/>
    <input type="submit" value="Submit">
    </form>
    </body>
    </html>

On the request page above, enter the three Chinese characters "世界杯" ("World Cup") in the username input box and "123" in the password input box, then press the Submit button to submit the request. The intercepted request message is as follows:

Request message in post mode

 
 
    POST /EncodingTest/requestresult.jsp HTTP/1.1
    Accept: image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
    Referer: http://localhost:8080/TomcatJndiTest/requesttest.jsp
    Accept-Language: zh-cn
    User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; CIBA; aff-kingsoft-ciba; .NET CLR 2.0.50727)
    Content-Type: application/x-www-form-urlencoded
    Accept-Encoding: gzip, deflate
    Host: localhost:8888
    Content-Length: 49
    Connection: Keep-Alive
    Cache-Control: no-cache

    username=%E4%B8%96%E7%95%8C%E6%9D%AF&password=123

The message above shows that a post request has a dedicated data section (the request body).

The following is the request message for a get submission from the same request page:

Request message in get mode

 
 
    GET /EncodingTest/requestresult.jsp?username=%E4%B8%96%E7%95%8C%E6%9D%AF&password=123 HTTP/1.1
    Accept: image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
    Referer: http://localhost:8080/TomcatJndiTest/requesttest.jsp
    Accept-Language: zh-cn
    User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; CIBA; aff-kingsoft-ciba; .NET CLR 2.0.50727)
    Accept-Encoding: gzip, deflate
    Host: localhost:8888
    Connection: Keep-Alive

The message above shows that a get request has no dedicated data section; the data directly follows the URL.
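The two captured messages differ only in where the same encoded data is placed. The following sketch rebuilds both shapes from one data string. The class and method names are hypothetical, and the post message carries only a reduced set of headers, not everything the browser sent.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; it only builds strings and sends nothing over the network.
class RequestShapeDemo {

    // Percent-encode one value as UTF-8, hiding the checked exception.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // The form data is identical for both submission methods.
    static String formData(String username, String password) {
        return "username=" + enc(username) + "&password=" + enc(password);
    }

    // get: the data follows the URL in the request line.
    static String getRequestLine(String path, String data) {
        return "GET " + path + "?" + data + " HTTP/1.1";
    }

    // post: the data goes into the body, after the blank line that ends the headers.
    static String postRequest(String path, String data) {
        // data is pure ASCII after percent-encoding, so its character
        // count equals its byte count and can serve as Content-Length.
        return "POST " + path + " HTTP/1.1\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + data.length() + "\r\n"
             + "\r\n"
             + data;
    }

    public static void main(String[] args) {
        String data = formData("\u4e16\u754c\u676f", "123");
        System.out.println(getRequestLine("/EncodingTest/requestresult.jsp", data));
        System.out.println();
        System.out.println(postRequest("/EncodingTest/requestresult.jsp", data));
    }
}
```

For the values used in the article, the data string is 49 characters long, which agrees with the Content-Length header in the captured post message.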

Processing of Chinese characters in the Request Message:

From the two messages above, we can see that the three Chinese characters "世界杯" ("World Cup") entered on the page were replaced with the string "%E4%B8%96%E7%95%8C%E6%9D%AF" before being sent to the server. This raises two questions. Question 1: What is this string? Question 2: Why is this replacement necessary?

This string is the UTF-8 encoding of "世界杯" ("World Cup"), E4B896E7958CE69DAF, with a "%" added before each byte. As for why this conversion is needed, my understanding is this: the request message is encoded in ISO-8859-1 and transmitted to the server as a network stream. ISO-8859-1 only supports digits, English letters, and some special characters, so it cannot represent characters such as Chinese. Characters that ISO-8859-1 does not support therefore need a "reshaping" operation, so that the information on the page reaches the server correctly.
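The construction of that string can be checked by hand. The sketch below takes the UTF-8 bytes of the three characters and writes each byte as "%" plus two hexadecimal digits. The class and method names are hypothetical, and unlike a browser this simplified version encodes every byte, including plain ASCII ones.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of the "% before each byte" rule described above.
class PercentEncodingDemo {

    // Write every UTF-8 byte of the text as %XX (upper-case hex).
    static String percentEncodeAll(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative byte values format as two hex digits.
            sb.append('%').append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The three characters from the article, as Unicode escapes.
        System.out.println(percentEncodeAll("\u4e16\u754c\u676f"));
    }
}
```

Each Chinese character in this example occupies three bytes in UTF-8, which is why three characters become nine "%XX" groups.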

Another question may arise here: why was UTF-8 chosen in the example above, and could another encoding scheme be used? The answer is yes. The JSP page begins with the directive "<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>". The value of charset is the character set the browser uses for the "reshaping" operation on the request message before submitting it, and it is also the character set the browser uses to interpret the server's response page.
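To see how the chosen character set changes what is sent, the sketch below encodes the same character under two charsets. It is an illustration under assumptions: the class name is invented, and it uses the character "é" with UTF-8 and ISO-8859-1 because ISO-8859-1 cannot represent Chinese and a charset such as GBK may not be present in every Java runtime.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo: the same character, percent-encoded under two charsets.
class CharsetChoiceDemo {

    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported charset: " + charset, e);
        }
    }

    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character é

        String asUtf8 = encode(text, "UTF-8");        // two bytes
        String asLatin1 = encode(text, "ISO-8859-1"); // one byte
        System.out.println(asUtf8 + " vs " + asLatin1);

        // Decoding only works when it uses the charset that did the encoding.
        System.out.println(decode(asUtf8, "UTF-8").equals(text));
        System.out.println(decode(asUtf8, "ISO-8859-1").equals(text));
    }
}
```

The receiver has no way to tell from the "%XX" groups alone which charset produced them, so both sides have to agree on it in advance.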

With this background, we can analyze why parameters passed by a form become garbled.

In the example above, after the "Submit" button is clicked, the browser sends the "reshaped" request message to the Servlet container on the web server. The container parses the request message, builds an HttpServletRequest object from it, and passes that object to the JSP or Servlet the page requested ("requestresult.jsp" in the example above). The requested JSP or Servlet then calls the getParameter("") method of the HttpServletRequest object to obtain the parameters from the previous page. By default this method decodes with ISO-8859-1, so parameter values made of English letters or digits are obtained correctly, but Chinese characters are not: they have been "reshaped" and can no longer be recognized. To recognize them again, you have to find the surgeon who performed the procedure and have a "restore" operation done. Different solutions apply in different situations.
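The failure described above can be simulated without a Servlet container. In the sketch below, URLDecoder stands in for the container's default ISO-8859-1 decoding, and the "restore" step re-reads the garbled string's bytes as UTF-8. This is one commonly used technique, shown under those assumptions with hypothetical class and method names; it is not necessarily the solution the article goes on to recommend.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical simulation; no real HttpServletRequest is involved.
class DefaultDecodingDemo {

    // Stand-in for a container that decodes parameters as ISO-8859-1.
    static String decodeLikeDefaultContainer(String encoded) {
        try {
            return URLDecoder.decode(encoded, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("ISO-8859-1 is always supported", e);
        }
    }

    // "Restore": recover the original bytes, then decode them as UTF-8,
    // the charset the browser actually used when submitting the form.
    static String restore(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "%E4%B8%96%E7%95%8C%E6%9D%AF"; // value from the captured request
        String garbled = decodeLikeDefaultContainer(sent);
        String restored = restore(garbled);

        System.out.println("garbled length:  " + garbled.length());
        System.out.println("restored equals: " + restored.equals("\u4e16\u754c\u676f"));
    }
}
```

The restore step works because ISO-8859-1 maps every byte to exactly one character, so no information is lost by the wrong decoding. In a real Servlet or JSP, calling request.setCharacterEncoding("UTF-8") before the first getParameter call is another option for post bodies.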

