When submitting a web form, you can choose between two submission methods: get and post. Most of us have been told that post should be used for form data and that get is not safe. But what actually makes the two different? This article looks for the root cause of the difference at the HTTP protocol level.
Let's start with the HTTP messages exchanged for a simple "hello world" page.
The HTTP request message the browser sends when requesting "http://www.2cto.com/test/hello.html":
- GET /test/hello.html HTTP/1.1
- Host: 127.0.0.1:80
- User-Agent: Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.8.1.11) Gecko/20071204 Ubuntu/7.10 (gutsy) Firefox/2.0.0.11
- Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
- Accept-Language: zh-cn,zh;q=0.5
- Accept-Encoding: gzip,deflate
- Accept-Charset: gb2312,UTF-8;q=0.7,*;q=0.7
- Keep-Alive: 300
- Connection: keep-alive
Capturing HTTP messages requires a tool (here I use a TCP proxy program I wrote myself to intercept them). In line 1 of the request message, "GET" is the submission method, "/test/hello.html" is the path of the requested resource, and "HTTP/1.1" is the HTTP protocol version in use. Line 2 gives the IP address and port of the requested server; note that "80" is the default port number. The remaining lines describe the client system and browser.
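The structure just described (method, path, and version in line 1, server address in line 2) can be illustrated with a minimal Python sketch. The function name `parse_request_head` and the shortened request are assumptions made for this example, not part of the capture tool used in the article.

```python
# Shortened copy of the captured request (only three of its lines are kept).
RAW_REQUEST = (
    b"GET /test/hello.html HTTP/1.1\r\n"
    b"Host: 127.0.0.1:80\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)


def parse_request_head(raw: bytes):
    """Hypothetical helper: return (method, target, version, headers)."""
    # Everything before the first blank line is the request head.
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Line 1: method, request target, protocol version, separated by spaces.
    method, target, version = lines[0].split(" ", 2)
    # Remaining lines: "Name: value" header fields.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


method, target, version, headers = parse_request_head(RAW_REQUEST)
print(method, target, version)  # GET /test/hello.html HTTP/1.1
print(headers["host"])          # 127.0.0.1:80
```

A real server has to handle many more cases (malformed lines, repeated headers, and so on); the sketch only shows where each piece of the message lives.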
Web server response message:
- HTTP/1.1 200 OK
- Server: Apache-Coyote/1.1
- Content-Type: text/html;charset=GBK
- Content-Length: 145
- Date: Wed, 26 Dec 2007 18:00:59 GMT
-
- <!-- HTML webpage file example -->
- <html>
- <head>
- <title>Hello, world!</title>
- </head>
- <body>
- <h3>Hello world!</h3>
- </body>
- </html>
"200 OK" is the response status code and indicates success; other common codes include "404" and "500". "Content-Length: 145" is the length of the response webpage. "Date: Wed, 26 Dec 2007 18:00:59 GMT" is the response time. The webpage body follows a blank line.
HTTP communication always follows this pattern: the client (browser) sends a request like the one above and the server (web server) replies. Form submission works the same way. The following analyzes a simple form.
Form submission
1. Submit in get mode:
- <form name="form1" method="get" action="result.jsp">
- Name: <input type="text" name="userName" value=""><br>
- Password: <input type="password" name="password" value=""><br>
- Gender: <input type="radio" name="sex" value="m">male <input type="radio" name="sex" value="f">female<br>
- Hobbies: <input type="checkbox" name="interest" value="dance">dancing
- <input type="checkbox" name="interest" value="sing">singing
- <input type="checkbox" name="interest" value="basketball">basketball<br>
- <br><input type="submit" name="submit" value="submit">
- </form>
Note that the <form> tag specifies method="get".
HTTP request message:
- GET /get_post/result.jsp?userName=lisi&password=1111&sex=f&interest=dance&interest=sing&submit=%CC%E1%BD%BB HTTP/1.1
- Host: 127.0.0.1:8090
- User-Agent: Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.8.1.11) Gecko/20071204 Ubuntu/7.10 (gutsy) Firefox/2.0.0.11
- Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
- Accept-Language: zh-cn,zh;q=0.5
- Accept-Encoding: gzip,deflate
- Accept-Charset: gb2312,UTF-8;q=0.7,*;q=0.7
- Keep-Alive: 300
- Connection: keep-alive
With get, the form data appears in the first line of the HTTP request, separated from the request address by "?" (this line is also what the browser shows in its address bar). The submitted form data can be read off from "userName=lisi&password=1111&sex=f&interest=dance&interest=sing&submit=%CC%E1%BD%BB":
■ userName=lisi: the name text box contains "lisi"
■ password=1111: the password box contains "1111"
■ sex=f: the gender radio button has "female" selected
■ interest=dance&interest=sing: two of the hobby checkboxes are ticked, dancing and singing
■ submit=%CC%E1%BD%BB: the submit button was clicked (the submit button is itself a form element, so it is sent to the server too); "%CC%E1%BD%BB" is the percent-encoded GB2312 form of the button's value attribute, the two Chinese characters for "Submit", each of which takes 16 bits.
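The field-by-field reading above can be reproduced with Python's standard `urllib.parse` module. This is only an illustrative sketch: the variable names are made up for the example, and it assumes the browser encoded the form as GB2312, as the article states.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# Request target copied from the first line of the get request.
TARGET = (
    "/get_post/result.jsp?userName=lisi&password=1111&sex=f"
    "&interest=dance&interest=sing&submit=%CC%E1%BD%BB"
)

parts = urlsplit(TARGET)
# Decode percent-escapes with the same charset the browser used.
fields = parse_qs(parts.query, encoding="gb2312")

print(parts.path)          # /get_post/result.jsp
print(fields["userName"])  # ['lisi']
print(fields["password"])  # ['1111']
print(fields["interest"])  # ['dance', 'sing']

# Four percent-encoded bytes decode to two GB2312 characters.
submit_label = unquote("%CC%E1%BD%BB", encoding="gb2312")
print(len(submit_label))   # 2
```

Repeated names such as `interest` come back as a list, which is how a multi-select checkbox group is represented after parsing.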
Next let's take a look at the difference between the post method and get method.
2. post submission:
Change method="get" in the <form> tag to method="post" and submit the same data. The HTTP request message is as follows:
- POST /get_post/result.jsp HTTP/1.1
- Host: 127.0.0.1:8090
- User-Agent: Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.8.1.11) Gecko/20071204 Ubuntu/7.10 (gutsy) Firefox/2.0.0.11
- Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
- Accept-Language: zh-cn,zh;q=0.5
- Accept-Encoding: gzip,deflate
- Accept-Charset: gb2312,UTF-8;q=0.7,*;q=0.7
- Keep-Alive: 300
- Connection: keep-alive
- Content-Type: application/x-www-form-urlencoded
- Content-Length: 82
-
- userName=lisi&password=1111&sex=f&interest=dance&interest=sing&submit=%CC%E1%BD%BB
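The Content-Length value in this request can be checked by rebuilding the body. The sketch below uses Python's standard `urllib.parse.urlencode`; the names `FORM_FIELDS` and `SUBMIT_LABEL` are invented for the example, and the button label is reconstructed from the four captured bytes rather than typed in.

```python
from urllib.parse import urlencode

# The submit button's label, rebuilt from its captured GB2312 bytes.
SUBMIT_LABEL = b"\xcc\xe1\xbd\xbb".decode("gb2312")

# Form fields in the order they appear in the form.
FORM_FIELDS = [
    ("userName", "lisi"),
    ("password", "1111"),
    ("sex", "f"),
    ("interest", "dance"),
    ("interest", "sing"),
    ("submit", SUBMIT_LABEL),
]

# Encode exactly as application/x-www-form-urlencoded with GB2312.
body = urlencode(FORM_FIELDS, encoding="gb2312").encode("ascii")

# Assemble a reduced post request; Content-Length is the body's byte count.
request = (
    b"POST /get_post/result.jsp HTTP/1.1\r\n"
    b"Host: 127.0.0.1:8090\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"\r\n" + body
)

print(body.decode("ascii"))
print(len(body))  # 82
```

The same encoded string placed after "?" in the first line would give the get request, which is why the two captures carry identical form data.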
Differences
Comparing the HTTP messages of the get and post requests, note the following points:
■ In the first line of the message, the get request declares the method as "GET" and the post request declares it as "POST".
■ The submitted data "userName=lisi&password=1111&sex=f&interest=dance&interest=sing&submit=%CC%E1%BD%BB" sits in the first line for get and in the last line for post; the form data is encoded identically in both.
■ The post request has three extra lines: "Content-Type: application/x-www-form-urlencoded", "Content-Length: 82", and the blank line before the data.
From these points alone it is hard to see any substantial difference between get and post, so why do they differ so much in practice? It is like the claim that spinach is unusually rich in iron, which was passed on by mistake for more than ten years until someone showed that the decimal point had been misplaced when the iron content was calculated.
There is only one substantive difference: in the post request, "Content-Length: 82" states the length of the submitted data "userName=lisi&password=1111&sex=f&interest=dance&interest=sing&submit=%CC%E1%BD%BB", while the get request carries no such length.
As a result, a server receiving data submitted by get is prone to one particular security vulnerability: buffer overflow.
Buffer Overflow
■ Glossary [Buffer Overflow]: writing more content into a program's buffer than the buffer can hold, so that the excess overflows it, corrupts the program's stack, and makes the program execute other instructions, which is how the attack achieves its goal.
When a server program receives data submitted from a client form, it first has to store the data in a block of memory and then parse and process it. This block of memory is usually called the receive buffer. Because post data is labeled with Content-Length, the server can allocate a buffer equal to or slightly larger than the submitted data. For get data, the amount of submitted data is not known in advance, so the server has to guess a buffer length. If the buffer is large and the received data is small, memory is wasted. If the buffer is smaller than the received data and the program does not check the length, the buffer may overflow.
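The two receiving strategies can be sketched in Python. Python itself cannot overflow a buffer, so the cap `MAX_LINE` below merely stands in for the fixed-size receive buffer a C server would allocate; the function name `read_request` and the 8192-byte limit are assumptions chosen for illustration.

```python
import io

MAX_LINE = 8192  # assumed cap, standing in for a fixed-size receive buffer


def read_request(stream, max_line=MAX_LINE):
    """Hypothetical helper: read one request as (request_line, headers, body)."""
    # The request line has no declared length, so read at most max_line bytes
    # and reject anything longer instead of writing past the buffer.
    line = stream.readline(max_line + 1)
    if len(line) > max_line or not line.endswith(b"\r\n"):
        raise ValueError("request line too long or malformed")
    request_line = line[:-2].decode("iso-8859-1")

    headers = {}
    while True:
        line = stream.readline(max_line + 1)
        if len(line) > max_line or not line.endswith(b"\r\n"):
            raise ValueError("header line too long or malformed")
        if line == b"\r\n":  # blank line ends the header section
            break
        name, _, value = line.decode("iso-8859-1").partition(":")
        headers[name.strip().lower()] = value.strip()

    # With post, Content-Length gives the body size before it is read,
    # so exactly that many bytes are requested.
    length = int(headers.get("content-length", "0"))
    body = stream.read(length)
    return request_line, headers, body


# post: the body size is known up front.
post = io.BytesIO(
    b"POST /get_post/result.jsp HTTP/1.1\r\nContent-Length: 5\r\n\r\nsex=f"
)
request_line, headers, body = read_request(post)
print(body)  # b'sex=f'

# get: an oversized request line is rejected by the explicit length check.
long_get = io.BytesIO(
    b"GET /get_post/result.jsp?x=" + b"A" * 10000 + b" HTTP/1.1\r\n\r\n"
)
try:
    read_request(long_get)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

The explicit length check on the request line is the step that, when missing in a C implementation, turns an oversized get request into an overflow.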
"Smart" hackers place special code in the overflow part to attack your server.
Modern WWW servers are not that fragile, but buffer overflow vulnerabilities are hard to eliminate completely. Operating systems and C programs provide fertile ground for the problem, and a considerable amount of WWW server software still has this vulnerability. You can search for "get buffer overflow".
This is why the get method is not recommended for submitting form data.
Learn with Deep Thinking
Introductory HTTP books describe get and post in various ways, but they often state only the what and not the why, and young programmers, who tend to have good memories, often neglect the more important part: thinking.
This article has a second purpose, which concerns how to learn: get into the habit of thinking deeply, and study with an attitude of doubt, exploration, and proof, even when something is widely regarded as a "principle".