HTTP is an object-oriented, application-layer protocol whose simplicity and speed make it well suited to distributed hypermedia information systems. It was proposed in 1990 and has been continuously improved and extended over years of use and development. At the time this description was written, the WWW used the sixth revision of HTTP/1.0, standardization of HTTP/1.1 was in progress, and HTTP-NG (Next Generation of HTTP) had been proposed.
The main features of the HTTP protocol can be summarized as follows:
1. Supports the client/server model.
2. Simple and fast: When a client requests a service from the server, it only needs to send the request method and path. Commonly used request methods are GET, HEAD, and POST. Each method specifies a different type of interaction between the client and the server. Because the protocol is simple, an HTTP server program can be small, and communication is fast.
3. Flexible: HTTP allows any type of data object to be transferred. The type being transferred is indicated by Content-Type.
4. Connectionless: Connectionless means that each connection handles only one request. After the server has processed the client's request and received the client's acknowledgment, the connection is closed. This saves transmission time.
5. Stateless: HTTP is a stateless protocol, meaning the protocol keeps no memory of earlier transactions. If later processing needs earlier information, that information must be retransmitted, which may increase the amount of data sent per connection. On the other hand, the server responds faster when it does not need earlier information.
HTTP (Hypertext Transfer Protocol) is a stateless, application-level protocol based on a request/response model. It usually runs over a TCP connection, and HTTP/1.1 provides a persistent connection mechanism. The vast majority of web development consists of web applications built on top of HTTP.
I. A simple process for a browser to access a server
As a simple example, when we enter http://www.baidu.com in the browser's address bar, Baidu's homepage appears. Press F12 to view the request and the response.
II. Capturing HTTP packets
To capture HTTP packets, you can use HttpWatch or Fiddler.
HttpWatch supports IE and Firefox.
Fiddler supports a variety of browsers because it captures all HTTP traffic.
We use Fiddler to capture HTTP requests and responses.
III. Using Fiddler to capture requests to a simple page we wrote ourselves
[Figure: request message captured with Fiddler]
[Figure: response message captured with Fiddler]
IV. Explanation of the HTTP protocol
1. A typical request:
POST http://localhost:8080/Servlet02/login HTTP/1.1 ---> Request first line
Host: localhost:8080 ---> Request header
Connection: keep-alive
Content-Length: 28
Cache-Control: max-age=0
Origin: http://localhost:8080
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: http://localhost:8080/servlet02/login.html
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.8
 ---> Request blank line
username=shelley&userpwd=123 ---> Request body
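The annotated request above can be split mechanically into its parts. The following is a minimal Python sketch, not the parser of any particular server; the function name parse_request and the shortened sample message are assumptions made for illustration.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.x request into first line, headers, and body."""
    # The first blank line (CRLF CRLF) marks the end of the headers.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Request first line: method, request URL, protocol/version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# Shortened copy of the captured request (only three headers kept).
SAMPLE = (
    b"POST http://localhost:8080/Servlet02/login HTTP/1.1\r\n"
    b"Host: localhost:8080\r\n"
    b"Content-Length: 28\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"\r\n"
    b"username=shelley&userpwd=123"
)

method, target, version, headers, body = parse_request(SAMPLE)
print(method, target, version)
print(headers["host"], headers["content-length"], len(body))
```

Note that the Content-Length value matches the byte length of the body, which is how a receiver knows where the body ends.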
2. Parts of a request:
Request first line (required): POST http://localhost:8080/Servlet02/login HTTP/1.1
|--Method: GET, POST, HEAD, DELETE, PUT, TRACE, OPTIONS
|--Request URL
|--Protocol name/version number
Request headers (usually present):
Host: localhost:8080 // The server to access, as server name:port number
Connection: keep-alive // Keep the connection open. Persistent connections are the default in HTTP/1.1, whereas HTTP/1.0 by default establishes a new connection for each request; 1.1 is an optimization of 1.0
Content-Length: 28 // Length of the request body, in bytes
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36 // Client information
Content-Type: application/x-www-form-urlencoded // Type of the request body
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 // Response body types the client can accept, as MIME types
Referer: http://localhost:8080/servlet02/login.html // The page the request originated from (used for traffic-source statistics and hotlink prevention)
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.8
Request blank line: Separates the request headers from the request body. Because the number of headers can differ from request to request, a blank line is needed to mark the end of the headers.
Request body: Of the two methods discussed here, only POST carries a request body. With the GET method, the parameters are placed in the query string instead.
username=shelley&userpwd=123
The difference between GET and POST: GET parameters are submitted as the query string of the URL, and POST parameters are submitted in the request body.
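To make the GET/POST difference concrete, here is a small Python sketch using only the standard library. The helper names build_get_url and build_post are hypothetical; the URL and parameters are the ones from the captured request.

```python
from urllib.parse import urlencode

PARAMS = {"username": "shelley", "userpwd": "123"}
LOGIN_URL = "http://localhost:8080/Servlet02/login"


def build_get_url(url, params):
    """GET: parameters travel in the query string of the URL."""
    return f"{url}?{urlencode(params)}"


def build_post(params):
    """POST: parameters travel in the request body, described by headers."""
    body = urlencode(params).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),
    }
    return headers, body


get_url = build_get_url(LOGIN_URL, PARAMS)
post_headers, post_body = build_post(PARAMS)
print(get_url)
print(post_headers["Content-Length"], post_body)
```

The encoded parameters are identical in both cases; only their position in the message changes.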
V. Deeper interpretation of the HTTP protocol
Request first line, request blank line, and request body
GET: Request method
http://localhost ... : Request path
HTTP/1.1: Request protocol and version. The difference between 1.1 and 1.0 is that 1.0 opens a new connection for each request, whereas 1.1 can send multiple requests over one connection; the keep-alive time is given as 3000 ms by default, though in practice it depends on the server configuration.
Request blank line: The empty line between the request headers and the request body.
Request body: A POST request has a request body that contains the request parameters. A GET request has no body; its parameters are passed in the URL.
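The connection reuse described for HTTP/1.1 can be observed with a local experiment. The sketch below is an illustration under assumptions, not a benchmark: it starts a throwaway server on 127.0.0.1 using Python's standard http.server, and the names PortEchoHandler and client_ports_for_requests are made up for this example. Each response body is the client's source port as seen by the server, so identical ports across requests mean the same TCP connection was reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets http.server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's source port for this TCP connection.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def client_ports_for_requests(count=2):
    """Send `count` GET requests over one HTTPConnection; return the ports seen."""
    server = HTTPServer(("127.0.0.1", 0), PortEchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5
    )
    try:
        ports = []
        for _ in range(count):
            conn.request("GET", "/")
            ports.append(conn.getresponse().read().decode("ascii"))
        return ports
    finally:
        conn.close()  # close the client side first so the handler can finish
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(client_ports_for_requests(3))
```

Setting protocol_version to "HTTP/1.0" in the handler would make the server close the connection after each response, and the reported ports would then normally differ.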
Request headers in detail:
Host: The requested host name and port number.
Connection: Whether to keep the connection open. This depends on the HTTP version; HTTP/1.0 does not keep the connection open by default.
Content-Length: Length of the request body.
User-Agent: The user agent, that is, information about the user's machine and browser. If you have seen images that display your system and browser information, or even your location and the weather forecast, they are generated from this header together with the visiting IP address.
Content-Type: Type of the request body, here form data.
Accept: Resource types the client accepts. There can be several, each with a priority.
Referer: The site the request came from. It can be used for search statistics, hotlink protection, and so on.
Accept-Encoding: Compression types the client accepts. Text compresses well, which mattered most when transmission was slow, so responses are usually compressed.
Accept-Language: Languages the client accepts.
Cookie: Information from the web server that is stored on the client. It is a potential source of information disclosure.
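The priority carried by the Accept header is expressed with q values (a missing q means 1.0). The following Python sketch orders the media types from the captured request by q value; parse_accept is a hypothetical helper, and real content negotiation also takes the specificity of each media range into account, which is omitted here.

```python
def parse_accept(value):
    """Return (media_type, q) pairs sorted by descending q, ties in header order."""
    entries = []
    for index, item in enumerate(value.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_type, q = parts[0], 1.0  # q defaults to 1.0 when absent
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        entries.append((media_type, q, index))
    entries.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(media_type, q) for media_type, q, _ in entries]


ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
for media_type, q in parse_accept(ACCEPT):
    print(q, media_type)
```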
Response first line, response blank line, and response body
HTTP/1.1: Protocol name/version
OK: Status code description.
Common response status codes (code: description):
200: Success
404: Resource not found
500: Server error
302: Redirect
304: Not modified (the cached copy is used)
Response blank line: Separates the response headers from the response body.
Response body: The HTML document of the web page.
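As a small illustration of the response first line, the sketch below splits a status line and looks up the standard reason phrases with Python's http.HTTPStatus. The helpers parse_status_line and status_class are assumed names for this example.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line):
    """Split e.g. 'HTTP/1.1 200 OK' into version, numeric code, and description."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def status_class(code):
    """The first digit of the status code gives its general class."""
    return STATUS_CLASSES.get(code // 100, "unknown")


print(parse_status_line("HTTP/1.1 200 OK"))
for code in (200, 302, 304, 404, 500):
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```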
Response headers in detail:
Server: Server information.
Accept-Ranges: Related to resumable downloads. A value of bytes means the client can request the download starting from a given byte position.
ETag: The entity tag of the URL, used to indicate whether the object at the URL has changed; it can also distinguish variants such as different languages or sessions. Its internal meaning is controlled by the server, much like a cookie.
Last-Modified: Time of last modification. Related to caching and to status code 304.
Content-Type: Type of the response body.
Content-Length: Length of the response body.
Date: Time the response was returned.
Expires: -1 / Cache-Control: no-cache / Pragma: no-cache: Do not cache. Several headers are used because different browsers are configured differently.
Refresh: Automatically refreshes the page; used by various ads.
Set-Cookie: Writes a cookie to the client.
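The link between ETag, caching, and status code 304 can be sketched as follows. This is a simplified illustration, not how any specific server computes its tags: make_etag (a truncated SHA-256 of the body) and respond are hypothetical helpers, and only an exact match of the If-None-Match request header is handled.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption for this sketch: derive the tag from a hash of the content.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, body); send 304 when the client's copy is current."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The cached copy is still valid, so no body is sent again.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


page_v1 = b"<html>version 1</html>"
page_v2 = b"<html>version 2</html>"

first_status, first_headers, _ = respond({}, page_v1)
cached = {"If-None-Match": first_headers["ETag"]}
second_status, _, second_body = respond(cached, page_v1)
third_status, _, _ = respond(cached, page_v2)
print(first_status, second_status, third_status)
```

The first request has no cached tag, the second revalidates an unchanged page, and the third revalidates after the page has changed.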
Supplement: technologies related to the HTTP protocol
1. Basics:
High-level protocols include the File Transfer Protocol (FTP), the Simple Mail Transfer Protocol (SMTP), the Domain Name System (DNS), the Network News Transfer Protocol (NNTP), and HTTP.
There are three types of intermediary: proxy, gateway, and tunnel. A proxy accepts requests in absolute URI form, rewrites all or part of the message, and forwards the reformatted request toward the server identified by the URI. A gateway is a receiving agent that acts as a layer above some other server or servers and, if necessary, translates requests into the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages. Tunnels are used when the communication needs to pass through an intermediary (for example, a firewall) even when the intermediary cannot understand the contents of the messages.
Proxy: An intermediary program that acts as both a server and a client in order to make requests on behalf of other clients. Requests are serviced internally or passed on, possibly after translation, to other servers. A proxy must interpret a request and, if necessary, rewrite it before forwarding it. Proxies are often used as client-side portals through firewalls, and as helper applications that handle requests over protocols the user agent does not implement.
Gateway: A server that acts as an intermediary for other servers. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource, and the requesting client may not be aware that it is communicating with a gateway.
Gateways are often used as server-side portals through firewalls, and as protocol translators for access to resources stored on non-HTTP systems.
Tunnel: An intermediary program that acts as a blind relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, although it may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are used when a portal is necessary and the intermediary cannot, or should not, interpret the relayed traffic.
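One concrete rewriting step a proxy performs is turning the absolute URI it receives into the path-only form sent to the target server. The sketch below shows just that step on the request line captured earlier (Fiddler itself works as a proxy, which is why the capture shows an absolute URL). The function name to_origin_form is an assumption, and header rewriting is left out.

```python
from urllib.parse import urlsplit


def to_origin_form(request_line):
    """Return (host, rewritten request line) for a request line with an absolute URI."""
    method, target, version = request_line.split(" ", 2)
    parts = urlsplit(target)
    path = parts.path or "/"  # an empty path means the root resource
    if parts.query:
        path += "?" + parts.query
    # The host part tells the proxy where to connect; the server only needs the path.
    return parts.netloc, f"{method} {path} {version}"


host, rewritten = to_origin_form("POST http://localhost:8080/Servlet02/login HTTP/1.1")
print(host)
print(rewritten)
```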
2. Advantages of protocol analysis: an HTTP analyzer detects network attacks
Analyzing and processing high-level protocols in a modular way will be the direction of future intrusion detection.
Common ports 80, 3128, and 8080 for HTTP and its proxies are specified in the network section with the port tag.
3. The HTTP Content-Length limit vulnerability enables denial-of-service attacks
When the POST method is used, the Content-Length header defines the length of the data to be transferred, for example Content-Length: 999999999. The server does not release the memory it holds for the request until the transfer is complete, so an attacker can exploit this by continuously sending junk data to the Web server until it runs out of memory. This kind of attack leaves little trace.
Http://www.cnpaf.net/Class/HTTP/0532918532667330.html
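On the defensive side, a server can refuse oversized declarations before reading any body data. The following Python sketch shows one possible check; the limit MAX_BODY_BYTES and the function check_content_length are assumptions for illustration, and a real server would also enforce read timeouts and count the bytes actually received.

```python
MAX_BODY_BYTES = 1_000_000  # hypothetical limit; choose one that fits the application


def check_content_length(headers: dict, limit: int = MAX_BODY_BYTES):
    """Return an HTTP error status to reject the request, or None to accept it."""
    raw = headers.get("Content-Length")
    if raw is None:
        return 411  # Length Required
    raw = raw.strip()
    if not (raw.isascii() and raw.isdigit()):
        return 400  # Bad Request: not a non-negative decimal number
    if int(raw) > limit:
        return 413  # Content Too Large: refuse before buffering anything
    return None


print(check_content_length({"Content-Length": "28"}))
print(check_content_length({"Content-Length": "999999999"}))
```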
4. Some ideas for denial-of-service attacks that exploit features of the HTTP protocol
In a SYN flood attack, the server is busy processing the attacker's bogus TCP connection requests and ignores legitimate clients' requests (which make up only a tiny share of the total). From a legitimate client's point of view, the server is unresponsive.
Smurf, Teardrop, and similar attacks rely on ICMP floods and IP fragments. This article instead uses "normal connections" to produce a denial of service.
Port 19 (chargen) was used for chargen attacks early on, known as Chargen_denial_of_service. That method, however, sets up a UDP connection between two chargen servers so that they are overwhelmed with traffic and go down. Taking down a Web server this way requires two conditions: 1. a chargen service; 2. an HTTP service.
Method: The attacker forges the source IP and sends connection requests (connect) to N chargen servers. After accepting the connection, each chargen server returns a character stream of 72 bytes per second (in practice the rate is higher, depending on network conditions) to the server.
5. HTTP fingerprinting
The principle of HTTP fingerprinting is basically the same as for other fingerprinting: record the small differences in how different servers implement the HTTP protocol, and use them for identification. HTTP fingerprinting is much more complex than TCP/IP stack fingerprinting, because customizing an HTTP server's configuration files or adding plug-ins or components easily changes its HTTP responses, which makes identification difficult. Customizing the behavior of the TCP/IP stack, by contrast, requires modifying the kernel, so the stack is easy to identify.
Making a server return different banner information is simple. With an open-source HTTP server such as Apache, the user can change the banner in the source code and restart the HTTP service for it to take effect. For HTTP servers whose source code is not public, such as Microsoft IIS or Netscape, the banner can be changed in the DLL file that stores it; other articles discuss this, so it is not described here. Such modifications work well. Another way to obscure the banner is to use a plug-in.
Common Test requests:
1: HEAD / HTTP/1.0  Send a basic HTTP request
2: DELETE / HTTP/1.0  Send a request that is not allowed, such as a DELETE request
3: GET / HTTP/3.0  Send a request with an illegal HTTP protocol version
4: GET / JUNK/1.0  Send a request with an incorrectly specified protocol
The HTTP fingerprinting tool httprint determines the type of an HTTP server effectively by combining statistical principles with fuzzy logic. It can be used to collect and analyze the signatures generated by different HTTP servers.
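The "minor differences" a fingerprinting tool records include the status line returned for each test request and the order in which headers appear. The Python sketch below extracts such a signature from a raw response. It is not httprint's algorithm; build_probe and response_signature are hypothetical helpers, and the two sample responses are made-up illustrations rather than output captured from real servers.

```python
PROBES = ["HEAD / HTTP/1.0", "DELETE / HTTP/1.0", "GET / HTTP/3.0", "GET / JUNK/1.0"]


def build_probe(request_line):
    """Turn one of the test request lines into the bytes sent on the wire."""
    return (request_line + "\r\n\r\n").encode("ascii")


def response_signature(raw_response: bytes):
    """Return (status line, header names in the order the server sent them)."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    header_order = tuple(line.split(":", 1)[0] for line in lines[1:] if ":" in line)
    return lines[0], header_order


# Made-up responses: same status and headers, different header order.
SAMPLE_A = (
    b"HTTP/1.1 200 OK\r\nDate: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    b"Server: example-a\r\nContent-Type: text/html\r\n\r\n"
)
SAMPLE_B = (
    b"HTTP/1.1 200 OK\r\nServer: example-b\r\n"
    b"Date: Mon, 01 Jan 2024 00:00:00 GMT\r\nContent-Type: text/html\r\n\r\n"
)

print(build_probe(PROBES[0]))
print(response_signature(SAMPLE_A))
print(response_signature(SAMPLE_B))
```

Because the signature ignores the Server header's value, it still differs between the two samples even if the banner has been changed.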
6. Other: To improve the user's browsing performance, modern browsers also support concurrent access: while loading a web page they open multiple connections to fetch the page's many icons quickly, so the whole page finishes transferring sooner.
HTTP/1.1 provides persistent connections, while the next-generation protocol, HTTP-NG, adds support for session control, rich content negotiation, and more in order to provide more efficient connections.