HTTP protocol details

Source: Internet
Author: User

HTTP is an object-oriented, application-layer protocol. Because it is simple and fast, it is well suited to distributed hypermedia information systems. It was proposed in 1990 and has been continuously improved and extended through years of use and development. Currently, the sixth revision of HTTP/1.0 is used on the WWW, standardization of HTTP/1.1 is in progress, and proposals for HTTP-NG (Next Generation of HTTP) have been put forward.

The main features of HTTP are as follows:

1. Supports the client/server model.

2. Simple and fast: when a client requests a service from the server, it only needs to send the request method and path. Common request methods are GET, HEAD, and POST. Each method specifies a different type of contact between the client and the server. Because the protocol is simple, HTTP server programs are small, so communication is fast.

3. Flexible: HTTP allows any type of data object to be transmitted. The type being transferred is marked by Content-Type.

4. Connectionless: each connection handles only one request. Once the server has processed the client's request and the client has received the response, the connection is closed. This saves transmission time.

5. Stateless: the protocol has no memory of earlier transactions. If later processing needs earlier information, that information must be retransmitted, which may increase the amount of data transferred per connection. On the other hand, when the server does not need earlier information, it responds faster.

I. HTTP protocol details: the URL

HTTP (Hypertext Transfer Protocol) is a stateless, application-layer protocol based on the request/response model, usually carried over TCP connections. HTTP/1.1 provides a persistent connection mechanism. The vast majority of Web development produces Web applications built on top of HTTP.

The format of an HTTP URL (a URL is a special type of URI that contains enough information to locate a resource) is as follows:

http://host[":"port][abs_path]

"http" indicates that the network resource is to be located through the HTTP protocol; host is a valid Internet host domain name or IP address; port specifies a port number, and if it is empty the default port 80 is used; abs_path specifies the URI of the requested resource.

If the URL gives no abs_path, it must be given as "/" when the URL is used as a request URI. The browser usually does this automatically.

Eg:

1. Enter: www.guet.edu.cn

The browser automatically converts it to: http://www.guet.edu.cn/

2. http://192.168.0.116:8080/index.jsp
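The URL rules above (default port 80, abs_path falling back to "/") can be sketched in a few lines of Python. This is only an illustration built on the standard library's urllib.parse; the helper name split_http_url is hypothetical, and prepending "http://" to a bare host name is an assumption about what a browser does.

```python
from urllib.parse import urlsplit


def split_http_url(url):
    """Return (host, port, abs_path) for an http URL."""
    # A bare host name such as "www.guet.edu.cn" has no scheme,
    # so add one first (assumed browser behaviour).
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # An empty port means the default port 80.
    port = parts.port if parts.port is not None else 80
    # An empty abs_path must be sent as "/".
    abs_path = parts.path or "/"
    return parts.hostname, port, abs_path


print(split_http_url("www.guet.edu.cn"))
print(split_http_url("http://192.168.0.116:8080/index.jsp"))
```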

II. HTTP protocol details: requests

An HTTP request consists of three parts: the request line, the message headers, and the request body.

1. The request line starts with a method token, followed by the request URI and the protocol version, separated by spaces. The format is as follows: Method Request-URI HTTP-Version CRLF

Method is the request method, Request-URI is a uniform resource identifier, and HTTP-Version is the HTTP protocol version of the request. CRLF is a carriage return followed by a line feed (apart from the terminating CRLF, no separate CR or LF characters are allowed).

There are several request methods (all method names are uppercase):

GET: requests the resource identified by Request-URI

POST: appends new data to the resource identified by Request-URI

HEAD: requests the response message headers of the resource identified by Request-URI

PUT: asks the server to store a resource and use Request-URI as its identifier

DELETE: asks the server to delete the resource identified by Request-URI

TRACE: asks the server to echo back the request it received, mainly for testing or diagnosis

CONNECT: reserved for future use

OPTIONS: queries the server's capabilities, or the options and requirements related to a resource

Examples:

GET method: when you enter a URL in the browser's address bar to visit a page, the browser uses the GET method to fetch the resource from the server.

Eg: GET /form.html HTTP/1.1 (CRLF)

POST method: asks the server to accept the data attached to the request. It is often used to submit forms.

Eg: POST /reg.jsp HTTP/ (CRLF)

Accept: image/gif, image/x-xbit,... (CRLF)

...

Host: www.guet.edu.cn (CRLF)

Content-Length: 22 (CRLF)

Connection: Keep-Alive (CRLF)

Cache-Control: no-cache (CRLF)

(CRLF) // this CRLF marks the end of the message headers; everything before it is headers

user=jeffrey&pwd=1234 // everything from this line on is the submitted data

HEAD method: almost the same as GET. The HTTP headers in the response to a HEAD request contain the same information as those returned for a GET request. With this method you can obtain information about the resource identified by Request-URI without transferring the whole resource. It is often used to test whether a hyperlink is valid, whether it is reachable, and whether it has been updated recently.

2. Request headers are described later

3. Request body (omitted)
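As a rough illustration of the request layout described in this section (request line, headers, an empty line, then the body), the following Python sketch assembles a POST message as text. The function name build_request is hypothetical, and HTTP/1.1 is assumed as the version because the example above leaves it unspecified.

```python
CRLF = "\r\n"


def build_request(method, request_uri, version, headers, body=""):
    """Assemble an HTTP request message as text."""
    # Request line: Method SP Request-URI SP HTTP-Version
    lines = [f"{method} {request_uri} {version}"]
    # One "Name: value" line per header field.
    lines += [f"{name}: {value}" for name, value in headers]
    # Every line ends with CRLF; an extra CRLF closes the header section.
    return CRLF.join(lines) + CRLF + CRLF + body


body = "user=jeffrey&pwd=1234"
message = build_request(
    "POST",
    "/reg.jsp",
    "HTTP/1.1",  # assumed version
    [
        ("Host", "www.guet.edu.cn"),
        # Content-Length is computed from the body rather than typed by hand.
        ("Content-Length", str(len(body.encode("ascii")))),
        ("Connection", "Keep-Alive"),
    ],
    body,
)
print(message)
```

Computing Content-Length from the encoded body avoids a mismatch between the declared and actual length.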

III. HTTP protocol details: responses

After receiving and interpreting a request message, the server returns an HTTP response message.

An HTTP response also consists of three parts: the status line, the message headers, and the response body.

1. The status line format is as follows:

HTTP-Version Status-Code Reason-Phrase CRLF

HTTP-Version is the HTTP protocol version of the server, Status-Code is the response status code sent back by the server, and Reason-Phrase is the text description of the status code.

The status code consists of three digits. The first digit defines the class of response and has five possible values:

1xx: Informational - the request has been received and processing continues

2xx: Success - the request has been successfully received, understood, and accepted

3xx: Redirection - further action is required to complete the request

4xx: Client Error - the request has a syntax error or cannot be fulfilled

5xx: Server Error - the server failed to fulfill a valid request

Common status codes, reason phrases, and their meanings:

200 OK // the client request succeeded

400 Bad Request // the client request has a syntax error and cannot be understood by the server

401 Unauthorized // the request is not authorized; this status code must be used together with the WWW-Authenticate header field

403 Forbidden // the server received the request but refuses to serve it

404 Not Found // the requested resource does not exist, eg: a wrong URL was entered

500 Internal Server Error // the server encountered an unexpected error

503 Service Unavailable // the server cannot handle the client's request at the moment; it may return to normal after a while

Eg: HTTP/1.1 200 OK (CRLF)

2. Response headers are described later

3. The response body is the content of the resource returned by the server.
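To make the status line format concrete, here is a minimal Python sketch that splits a status line into its three parts and maps the first digit of the status code to its class. The names parse_status_line and STATUS_CLASSES are illustrative, not part of any library.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split 'HTTP-Version Status-Code Reason-Phrase' into its parts."""
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"not a status line: {line!r}")
    version, code = parts[0], parts[1]
    # The reason phrase may contain spaces, or be missing entirely.
    reason = parts[2] if len(parts) == 3 else ""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"status code must be three digits: {code!r}")
    # The first digit selects the response class.
    status_class = STATUS_CLASSES.get(int(code[0]), "Unknown")
    return version, int(code), reason, status_class


print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.0 404 Not Found\r\n"))
```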

IV. HTTP protocol details: message headers

HTTP messages consist of requests from client to server and responses from server to client. Both request and response messages consist of a start line (the request line for a request, the status line for a response), message headers (optional), an empty line (a line containing only CRLF), and a message body (optional).

HTTP message headers include general headers, request headers, response headers, and entity headers.

Each header field consists of a name, ":", a space, and a value. Header field names are case-insensitive.
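The message layout just described can be sketched as a small Python parser. This is a simplified illustration: parse_message is a hypothetical helper, it lower-cases header names to reflect their case-insensitivity, and it keeps only the last value when a header is repeated.

```python
def parse_message(raw):
    """Split an HTTP message into (start_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "CONTENT-LENGTH: 5\r\n"
    "\r\n"
    "hello"
)
start_line, headers, body = parse_message(raw)
print(start_line, headers, body)
```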

1. General headers

A few general header fields apply to all request and response messages but not to the entity being transferred; they apply only to the message being transmitted.

Eg:

Cache-Control specifies cache directives. Cache directives are unidirectional (a directive that appears in a response need not appear in the request) and independent (a directive in one message does not affect the caching of another message). HTTP/1.0 uses the similar header field Pragma.

Cache directives for requests include: no-cache (indicates that the request or response message must not be cached), no-store, max-age, max-stale, min-fresh, only-if-cached;

Cache directives for responses include: public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, max-age, s-maxage.

Eg:

To tell the IE browser (the client) not to cache a page, the JSP program on the server can be written as follows:

response.setHeader("Cache-Control", "no-cache");

// response.setHeader("Pragma", "no-cache"); has the same effect as the line above; usually the two are used together

This code sets the general header field Cache-Control: no-cache in the response message that is sent.

The Date general header field gives the date and time at which the message was generated.

The Connection general header field lets the sender specify connection options, for example that the connection is persistent, or "close" to tell the server to close the connection once the response is complete.
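A Cache-Control value is a comma-separated list of directives, some of which carry an argument (such as max-age=60). The Python sketch below shows one simple way to read such a value. The helper parse_cache_control is hypothetical, and it deliberately ignores the case of quoted arguments that themselves contain commas.

```python
def parse_cache_control(value):
    """Turn a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are compared case-insensitively.
        key = name.strip().lower()
        # Directives such as no-cache have no argument; max-age=60 has one.
        directives[key] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("no-cache, max-age=0"))
print(parse_cache_control("private, must-revalidate, s-maxage=600"))
```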

2. Request headers

Request headers let the client pass additional information about the request and about the client itself to the server.

Common request headers

Accept

The Accept request header field specifies which types of information the client accepts. Eg: Accept: image/gif indicates that the client wants to receive resources in GIF image format; Accept: text/html indicates that the client wants to receive HTML text.

Accept-Charset

The Accept-Charset request header field specifies the character sets the client accepts. Eg: Accept-Charset: iso-8859-1, gb2312. If this field is not set in the request message, any character set is acceptable by default.

Accept-Encoding

The Accept-Encoding request header field is similar to Accept, but specifies the acceptable content encodings. Eg: Accept-Encoding: gzip, deflate. If this field is not set in the request message, the server assumes that the client accepts all content encodings.

Accept-Language

The Accept-Language request header field is similar to Accept, but specifies a natural language. Eg: Accept-Language: zh-cn. If this field is not set in the request message, the server assumes that the client accepts all languages.

Authorization

The Authorization request header field is mainly used to prove that the client has the right to view a resource. If a browser receives response code 401 (Unauthorized) when accessing a page, it can send a request containing the Authorization request header field and ask the server to verify it.

Host (this header field is required when sending a request)

The Host request header field specifies the Internet host and port number of the requested resource. It is usually extracted from the HTTP URL.

Eg:

We enter http://www.guet.edu.cn/index.html in the browser.

The request message sent by the browser then contains the Host request header field, as follows:

Host: www.guet.edu.cn

Here the default port 80 is used. If a port number is specified, the field becomes Host: www.guet.edu.cn:<specified port number>

User-Agent

When we log in to an online forum, we often see a welcome message listing the name and version of our operating system and browser, which many people find surprising. In fact, the server application obtains this information from the User-Agent request header field, which lets the client tell the server about its operating system, browser, and other attributes. This header field is not required, however: if we write our own browser and do not send the User-Agent request header field, the server cannot learn this information about us.

Example of request headers:

GET /form.html HTTP/1.1 (CRLF)

Accept: image/gif, image/x-xbitmap, image/jpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */* (CRLF)

Accept-Language: zh-cn (CRLF)

Accept-Encoding: gzip, deflate (CRLF)

If-Modified-Since: Wed, 05 Jan 2007 11:21:25 GMT (CRLF)

If-None-Match: W/"80b1a4c018f3c41:8317" (CRLF)

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) (CRLF)

Host: www.guet.edu.cn (CRLF)

Connection: Keep-Alive (CRLF)

(CRLF)
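The Host rule given earlier in this section (omit the port when it is the default 80, otherwise append it) can be expressed in a few lines of Python. The helper host_header is a hypothetical name, and the sketch covers http URLs only.

```python
from urllib.parse import urlsplit


def host_header(url):
    """Derive the Host request header line from an http URL."""
    parts = urlsplit(url)
    # The default port 80 is left out of the Host value.
    if parts.port is None or parts.port == 80:
        return f"Host: {parts.hostname}"
    # Any other port is appended after a colon.
    return f"Host: {parts.hostname}:{parts.port}"


print(host_header("http://www.guet.edu.cn/index.html"))
print(host_header("http://192.168.0.116:8080/index.jsp"))
```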

3. Response headers

Response headers let the server pass additional response information that cannot be placed in the status line, as well as information about the server and about further access to the resource identified by Request-URI.

Common response headers

Location

The Location response header field redirects the recipient to a new location. It is often used when a domain name has changed.

Server

The Server response header field contains information about the software the server uses to handle requests. It corresponds to the User-Agent request header field. Below is an example of the Server response header field:

Server: Apache-Coyote/1.1

WWW-Authenticate

The WWW-Authenticate response header field must be included in 401 (Unauthorized) response messages. When the client receives a 401 response and sends a request containing the Authorization header field to ask the server to verify it, the server's response headers contain this field.

Eg: WWW-Authenticate: Basic realm="Basic Auth Test!" // shows that the server uses the basic authentication scheme for the requested resource
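When a server asks for the basic scheme as in the example above, the client answers with an Authorization header whose credentials are the base64 encoding of "username:password". A minimal Python sketch, with a hypothetical helper name and made-up credentials:

```python
import base64


def basic_authorization(username, password):
    """Build the Authorization header line for the basic scheme."""
    # The credentials are "username:password", base64-encoded.
    # This is an encoding, not encryption.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


# "user" and "pass" are placeholder credentials for illustration only.
print(basic_authorization("user", "pass"))
```

Because base64 is trivially reversible, the basic scheme protects credentials only when the connection itself is protected.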

4. Entity headers

Both request and response messages can carry an entity. An entity consists of entity header fields and an entity body, but this does not mean that the two must be sent together; the entity header fields can be sent on their own. Entity headers define metadata about the entity body (eg: whether there is an entity body at all) and about the resource identified by the request.

Common entity headers

Content-Encoding

The Content-Encoding entity header field is used as a modifier of the media type. Its value indicates which additional content encodings have been applied to the entity body, and therefore which decoding mechanisms must be applied to obtain the media type referenced by the Content-Type header field. Content-Encoding is thus used to record how the document was compressed, eg: Content-Encoding: gzip

Content-Language

The Content-Language entity header field describes the natural language used by the resource. If this field is not set, the entity content is assumed to be intended for readers of all languages. Eg: Content-Language: da

Content-Length

The Content-Length entity header field gives the length of the entity body as a decimal number of bytes.

Content-Type

The Content-Type entity header field gives the media type of the entity body sent to the recipient. Eg:

Content-Type: text/html; charset=ISO-8859-1

Content-Type: text/html; charset=GB2312

Last-Modified

The Last-Modified entity header field gives the date and time at which the resource was last modified.

Expires

The Expires entity header field gives the date and time at which the response expires. So that proxy servers and browsers refresh their caches after a while (when a previously visited page is visited again, it is loaded directly from the cache, which shortens response time and reduces server load), we can use the Expires entity header field to specify when a page expires. Eg: Expires: Thu, 15 Sep 2006 16:23:12 GMT

HTTP/1.1 clients and caches must treat other invalid date formats (including 0) as already expired. Eg: to keep the browser from caching a page, we can also set the Expires entity header field to 0. In JSP the code is as follows:

response.setDateHeader("Expires", 0);
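The rule that an invalid Expires value (including 0) counts as already expired can be sketched from the client's side in Python. The function is_expired is a hypothetical helper; it relies on the standard library's email.utils.parsedate_to_datetime to read HTTP dates and takes the current time as a parameter so the behaviour is easy to check.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_value, now):
    """Return True if a response with this Expires value is stale at `now`."""
    try:
        expires = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Invalid date formats, including "0", are treated as already expired.
        return True
    if expires.tzinfo is None:
        # Treat a date without zone information as GMT (an assumption).
        expires = expires.replace(tzinfo=timezone.utc)
    return expires <= now


start_of_2006 = datetime(2006, 1, 1, tzinfo=timezone.utc)
print(is_expired("Thu, 15 Sep 2006 16:23:12 GMT", start_of_2006))
print(is_expired("0", start_of_2006))
```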

V. Using telnet to observe the HTTP communication process

Purpose and principle of the experiment:

Using the MS telnet tool, we type an HTTP request by hand and send it to the server. After receiving, interpreting, and accepting the request, the server returns a response, which is displayed in the telnet window. This gives a more concrete understanding of how HTTP communication works.

Steps:

1. Start telnet

1.1 Start telnet

Run --> cmd --> telnet

1.2 Turn on telnet local echo

set localecho

2. Connect to the server and send a request

2.1 open www.guet.edu.cn 80 // note that the port number cannot be omitted

HEAD /index.asp HTTP/1.0

Host: www.guet.edu.cn

/* You can change the request method to request the content of the home page by entering the following message */

open www.guet.edu.cn 80

GET /index.asp HTTP/1.0 // request the resource content

Host: www.guet.edu.cn

2.2 open www.sina.com.cn 80 // or enter telnet www.sina.com.cn 80 directly at the command prompt

HEAD /index.asp HTTP/1.0

Host: www.sina.com.cn
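The same experiment can be reproduced without telnet by writing the request bytes to a TCP socket. The Python sketch below is only an illustration: so that it runs without depending on an external site, it starts a small local server from the standard library (DemoHandler and demo_local_head are hypothetical names) and sends it a hand-written HEAD request. Pointing raw_head at a real host and port 80 would mirror step 2.1.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Tiny stand-in server that answers HEAD requests."""

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def raw_head(host, port, path="/"):
    """Send a hand-written HEAD request and return the raw response text."""
    request = f"HEAD {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


def demo_local_head():
    # Port 0 lets the operating system pick a free port.
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return raw_head("127.0.0.1", server.server_address[1], "/index.asp")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo_local_head())
```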

3. Experiment results:

3.1 The response to request 2.1 is:

HTTP/1.1 200 OK // request succeeded

Server: Microsoft-IIS/5.0 // web server

Date: Thu, 08 Mar 2007 07:17:51 GMT

Connection: Keep-Alive

Content-Length: 23330

Content-Type: text/html

Expires: Thu, 08 Mar 2007 07:16:51 GMT

Set-Cookie: aspsessionidqaqbqqqb=bejcdgkadedjklkkajeoimmh; path=/

Cache-Control: private

// resource content omitted

3.2 The response to request 2.2 is:

HTTP/1.0 404 Not Found // request failed

Date: Thu, 08 Mar 2007 07:50:50 GMT

Server: Apache/2.0.54 (Unix)

Last-Modified: Thu, 30 Nov 2006 11:35:41 GMT

ETag: "6277a-415-e7c76980"

Accept-Ranges: bytes

X-Powered-By: mod_xlayout_tables/0.0.1vhs.markii.remix

Vary: Accept-Encoding

Content-Type: text/html

X-Cache: MISS from zjm152-78.sina.com.cn

Via: 1.0 zjm152-78.sina.com.cn:80 (squid/2.6.stables-20061207)

X-Cache: MISS from th-143.sina.com.cn

Connection: close

Lost connection to the host

Press any key to continue...

4. Notes

If the input contains an error, the request will not succeed.

Header field names are case-insensitive.

For more information about HTTP, see RFC 2616, available at http://www.ietf.org/rfc

Anyone developing back-end programs must master the HTTP protocol.

VI. Technical supplements related to the HTTP protocol

1. Basics

High-level protocols include the File Transfer Protocol (FTP), the Simple Mail Transfer Protocol (SMTP), the Domain Name System (DNS), the Network News Transfer Protocol (NNTP), and HTTP.

There are three kinds of intermediary: proxy, gateway, and tunnel. A proxy accepts requests in absolute-URI form, rewrites all or part of the message, and forwards the reformatted request to the server identified by the URI. A gateway is a receiving agent that acts as a layer above some other server and, if necessary, translates requests into the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages. Tunnels are used when communication has to pass through an intermediary (such as a firewall), or when the intermediary cannot understand the content of the messages.

Proxy: an intermediary program that acts as both a server and a client in order to make requests on behalf of other clients. Requests are handled internally or passed on, possibly after translation, to other servers. A proxy must interpret a request message and, if necessary, rewrite it before forwarding it. Proxies are often used as client-side portals through firewalls. A proxy can also act as a helper application that handles requests for protocols the user agent does not implement.

Gateway: a server that acts as an intermediary for other servers. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the client sending the request is not aware that it is dealing with a gateway. Gateways are often used as server-side portals through firewalls. A gateway can also act as a protocol translator that gives access to resources stored on non-HTTP systems.

Tunnel: an intermediary program that relays between two connections. Once active, a tunnel is not considered part of the HTTP communication, although it may have been set up by an HTTP request. The tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are often used when a portal must exist and the intermediary cannot interpret the relayed communication.

2. Advantages of protocol analysis: an HTTP analyzer detects network attacks

Analyzing and processing high-level protocols in a modular way is the direction in which intrusion detection is heading.

The common ports of HTTP and its proxies, 80, 3128, and 8080, are specified with the port tag in the network section.

3. DoS attacks caused by the HTTP Content-Length limit vulnerability

With the POST method, the Content-Length header can be set to declare the length of the data to be sent, for example Content-Length: 999999999. Until the transfer is complete, the memory is not released. An attacker can exploit this flaw to keep sending junk data to the Web server until its memory is exhausted. This kind of attack leaves almost no trace.

http://www.cnpaf.net/Class/HTTP/0532918532667330.html
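One common defence against this kind of abuse is for the server to check the declared Content-Length against an upper bound before reading any body data. The Python sketch below illustrates the idea only; the limit MAX_BODY_BYTES and the helper check_content_length are hypothetical, and the returned numbers are the standard status codes 411 (Length Required), 400 (Bad Request), and 413 (Request Entity Too Large).

```python
MAX_BODY_BYTES = 1_000_000  # hypothetical limit chosen for illustration


def check_content_length(headers, limit=MAX_BODY_BYTES):
    """Return (status_code, declared_length) for a POST request's headers.

    `headers` maps lower-cased header names to their values.
    """
    raw = headers.get("content-length")
    if raw is None:
        return 411, None  # Length Required
    raw = raw.strip()
    if not (raw.isascii() and raw.isdigit()):
        return 400, None  # Bad Request: not a decimal number
    length = int(raw)
    if length > limit:
        # Reject before reading or buffering any of the body.
        return 413, length  # Request Entity Too Large
    return 200, length


print(check_content_length({"content-length": "999999999"}))
print(check_content_length({"content-length": "22"}))
```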

4. An idea for a DoS attack that exploits the characteristics of HTTP

In a SYN flood attack, the server is kept busy processing the attacker's forged TCP connection requests and has no time for legitimate client requests (which are, after all, only a very small share of the traffic). From a legitimate client's point of view, the server stops responding. Smurf and Teardrop attack with ICMP floods and IP fragments respectively. This article considers using "normal connections" to produce a DoS attack. Port 19 was used early on for chargen attacks (chargen_denial_of_service): a UDP connection is set up between two chargen servers so that they process too much data and go down. Two conditions are therefore needed to take down a Web server: 1. a chargen service; 2. an HTTP service. Method: the attacker forges the source IP address and sends connection requests to N chargen servers. After receiving the connection, each chargen server returns a character stream of 72 bytes per second (faster in practice, depending on network conditions) to the server.

5. HTTP fingerprinting

The principle of HTTP fingerprinting is broadly the same as that of other fingerprinting techniques: record the small differences in how different servers implement the HTTP protocol and use them for identification. HTTP fingerprinting is much more complex than TCP/IP stack fingerprinting, because customizing the HTTP server's configuration file or adding plug-ins or components makes it easy to change the HTTP response information, which makes identification difficult. Customizing TCP/IP stack behavior, by contrast, requires changes at the kernel level, so the stack is easier to identify.

Making the server return a different banner is easy. For an open-source HTTP server such as Apache, you can change the banner in the source code and restart the HTTP service for the change to take effect. For HTTP servers whose source code is not public, such as Microsoft IIS or Netscape, you can change it in the DLL file that stores the banner. Articles on this topic exist, so it is not repeated here. The result of such a change is quite good. Another way to obscure the banner is to use a plug-in.

Common test requests:

1: HEAD / HTTP/1.0 sends a basic HTTP request

2: DELETE / HTTP/1.0 sends a request that is not permitted, such as a DELETE request

3: GET / HTTP/3.0 sends a request with an invalid HTTP protocol version

4: GET / JUNK/1.0 sends a request with an incorrect protocol specification

httprint is an HTTP fingerprinting tool that applies statistical principles combined with fuzzy logic to identify the type of an HTTP server effectively. It can be used to collect and analyze the signatures produced by different HTTP servers.
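One of the small differences a fingerprint can record is the order in which a server emits its headers. In the two telnet responses shown in section V, the IIS server sends Server before Date while the Apache server sends Date before Server. The Python sketch below extracts such a signature from raw response text. It is a toy illustration with a hypothetical helper name, not how httprint itself works.

```python
def response_signature(raw_response):
    """Extract simple fingerprint features from raw HTTP response text."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    # Keep header names in the order the server sent them.
    names = [line.split(":", 1)[0].strip() for line in header_lines if ":" in line]
    return {"status_line": status_line, "header_order": names}


iis_like = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Microsoft-IIS/5.0\r\n"
    "Date: Thu, 08 Mar 2007 07:17:51 GMT\r\n"
    "\r\n"
)
apache_like = (
    "HTTP/1.0 404 Not Found\r\n"
    "Date: Thu, 08 Mar 2007 07:50:50 GMT\r\n"
    "Server: Apache/2.0.54 (Unix)\r\n"
    "\r\n"
)
print(response_signature(iis_like))
print(response_signature(apache_like))
```

Header order survives a changed banner, which is why features like this are harder to disguise than the Server string alone.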

6. Others

To improve performance, modern browsers also support concurrent access: when a Web page is loaded, several connections are opened at the same time to fetch the page's multiple images quickly, so the whole page is transferred faster.

HTTP/1.1 provides persistent connections, while the next-generation protocol, HTTP-NG, adds session control, rich content negotiation, and other features to provide more efficient connections.


This article is from the "7439523" blog, please be sure to keep this source http://7449523.blog.51cto.com/7439523/1547625
