HTTP Protocol in Detail

Source: Internet
Author: User
Tags response code domain server file transfer protocol

HTTP is an object-oriented, application-layer protocol. Its simplicity and speed make it well suited to distributed hypermedia information systems. It was proposed in 1990 and has been continuously improved and extended through years of use. When this text was written, the WWW used the sixth revision of HTTP/1.0, standardization of HTTP/1.1 was in progress, and HTTP-NG (Next Generation of HTTP) had been proposed.

The main features of the HTTP protocol can be summarized as follows:

1. Client/server model: HTTP supports the client/server model.
2. Simple and fast: when a client requests a service from the server, it only needs to send the request method and the path. Common request methods are GET, HEAD, and POST. Each method specifies a different kind of interaction between the client and the server. Because the protocol is simple, HTTP server programs are small and communication is fast.
3. Flexible: HTTP allows any type of data object to be transferred. The type being transferred is marked by Content-Type.
4. Connectionless: each connection handles only one request. After the server has processed the client's request and received the client's acknowledgement, the connection is closed. This saves transmission time.
5. Stateless: HTTP is a stateless protocol, meaning that the protocol keeps no memory of earlier transactions. If later processing needs earlier information, that information must be retransmitted, which may increase the amount of data sent per connection. On the other hand, the server responds faster when it does not need the earlier information.

I. HTTP in detail: URLs

HTTP (Hypertext Transfer Protocol) is a stateless, application-layer protocol based on a request/response model. It usually runs over TCP connections, and HTTP/1.1 provides a persistent connection mechanism. The vast majority of web development consists of web applications built on top of HTTP.

An HTTP URL (a URL is a special kind of URI that contains enough information to locate a resource) has the following format:

    http://host[":" port][abs_path]

Here, http means that the network resource is located through the HTTP protocol; host is a valid Internet host domain name or IP address; port specifies a port number, and if it is empty the default port 80 is used; abs_path specifies the URI of the requested resource. If the URL gives no abs_path, it must be given as "/" when the URL is used as a request URI. The browser usually does this for us automatically.

Examples:
1. Input: www.guet.edu.cn
   The browser automatically converts it to: http://www.guet.edu.cn/
2. http://192.168.0.116:8080/index.jsp

II. HTTP in detail: requests

An HTTP request consists of three parts: the request line, the message headers, and the request body.

1. The request line starts with a method token, followed by the Request-URI and the protocol version, separated by spaces, in the following format:

    Method Request-URI HTTP-Version CRLF

Method is the request method; Request-URI is a uniform resource identifier; HTTP-Version is the HTTP protocol version of the request; CRLF is a carriage return followed by a line feed (a bare CR or LF character is not allowed except as part of the terminating CRLF).
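The URL defaults and the request-line format described above can be sketched in a few lines of Python. This is only an illustration: the helper names normalize_http_url and request_line are hypothetical, and the sketch assumes the standard-library urllib.parse module is an acceptable stand-in for what a browser does internally.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # default port stated in the text


def normalize_http_url(url: str) -> tuple[str, int, str]:
    """Return (host, port, abs_path) for an HTTP URL, filling in defaults.

    Hypothetical helper for illustration; not part of any HTTP library.
    """
    # A browser adds the scheme when the user types only a host name.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = parts.port or DEFAULT_HTTP_PORT
    # An empty abs_path must be sent as "/" in the request line.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return host, port, path


def request_line(method: str, url: str, version: str = "HTTP/1.1") -> str:
    """Build 'Method SP Request-URI SP HTTP-Version CRLF'."""
    _, _, path = normalize_http_url(url)
    return f"{method} {path} {version}\r\n"


print(normalize_http_url("www.guet.edu.cn"))
print(normalize_http_url("http://192.168.0.116:8080/index.jsp"))
print(repr(request_line("GET", "www.guet.edu.cn")))
```

The host, port, and path are returned separately because the path goes into the request line while the host and port are needed to open the TCP connection.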

The request method is written in capital letters. The methods are:

GET      Request the resource identified by the Request-URI
POST     Append new data to the resource identified by the Request-URI
HEAD     Request the response message headers for the resource identified by the Request-URI
PUT      Ask the server to store a resource, using the Request-URI as its identifier
DELETE   Ask the server to delete the resource identified by the Request-URI
TRACE    Ask the server to send back the request it received; mainly used for testing or diagnosis
CONNECT  Reserved for future use
OPTIONS  Query the capabilities of the server, or the options and requirements related to a resource

Application examples:

GET method: when you access a web page by entering a URL in the browser's address bar, the browser uses the GET method to fetch the resource from the server, e.g.:

    GET /form.html HTTP/1.1 (CRLF)

POST method: asks the server to accept the data attached to the request; it is often used to submit forms, e.g.:

    POST /reg.jsp HTTP/ (CRLF)
    Accept: image/gif,image/x-xbit,... (CRLF)
    ...
    HOST: www.guet.edu.cn (CRLF)
    Content-Length: 21 (CRLF)
    Connection: Keep-Alive (CRLF)
    Cache-Control: no-cache (CRLF)
    (CRLF)                   // this CRLF marks the end of the message headers; everything before it is headers
    user=jeffrey&pwd=1234    // this line is the submitted data

HEAD method: almost the same as GET. The HTTP headers in the response to a HEAD request carry the same information as those returned for a GET request. With this method you can obtain information about the resource identified by the Request-URI without transferring the whole resource. It is commonly used to test whether a hyperlink is valid, whether it is reachable, and whether it has been updated recently.
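As a rough sketch of how a POST message like the one above is assembled, the following Python builds the bytes by hand and computes Content-Length from the encoded body instead of typing it in. The function name build_post_request is hypothetical, and two details are assumptions that the text does not state: the HTTP/1.1 version string and the application/x-www-form-urlencoded Content-Type.

```python
def build_post_request(path: str, host: str, form_body: str) -> bytes:
    """Assemble a POST request: request line, headers, empty line, body.

    Hypothetical helper; HTTP/1.1 and the Content-Type value are assumptions.
    """
    body = form_body.encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        # Content-Length counts the bytes of the body only, not the headers.
        f"Content-Length: {len(body)}\r\n"
        "Connection: Keep-Alive\r\n"
        "Cache-Control: no-cache\r\n"
        "\r\n"  # the empty line ends the message headers
    )
    return head.encode("ascii") + body


message = build_post_request("/reg.jsp", "www.guet.edu.cn", "user=jeffrey&pwd=1234")
print(message.decode("ascii"))
```

Computing the length from the encoded body avoids a mismatch between the header and the data actually sent, which a server may treat as a truncated or malformed request.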

2. Request headers are described later.
3. Request body (omitted).

III. HTTP in detail: responses

After receiving and interpreting a request message, the server returns an HTTP response message. An HTTP response also consists of three parts: the status line, the message headers, and the response body.

1. The status line has the following format:

    HTTP-Version Status-Code Reason-Phrase CRLF

HTTP-Version is the HTTP protocol version of the server; Status-Code is the response status code sent back by the server; Reason-Phrase is a textual description of the status code.

The status code consists of three digits. The first digit defines the class of the response and has five possible values:

1xx  Informational: the request has been received and processing continues
2xx  Success: the request has been successfully received, understood, and accepted
3xx  Redirection: further action is required to complete the request
4xx  Client error: the request has a syntax error or cannot be fulfilled
5xx  Server error: the server failed to fulfil a valid request

Common status codes, reason phrases, and descriptions:

200 OK                     // the client request succeeded
400 Bad Request            // the client request has a syntax error and cannot be understood by the server
401 Unauthorized           // the request is not authorized; this status code must be used together with the WWW-Authenticate header field
403 Forbidden              // the server received the request but refuses to provide the service
404 Not Found              // the requested resource does not exist, e.g. a wrong URL was entered
500 Internal Server Error  // an unexpected error occurred on the server
503 Service Unavailable    // the server currently cannot process the client's request and may recover after some time

E.g.: HTTP/1.1 200 OK (CRLF)

2. Response headers are described later.
3. The response body is the content of the resource returned by the server.
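The status-line format and the five classes above translate directly into a small parser. The sketch below is a minimal illustration in Python; the name parse_status_line and the class labels in STATUS_CLASSES are chosen here for the example and are not taken from any library.

```python
# Class of a response, keyed by the first digit of the status code.
STATUS_CLASSES = {
    "1": "informational",
    "2": "success",
    "3": "redirection",
    "4": "client error",
    "5": "server error",
}


def parse_status_line(line: str) -> tuple[str, int, str, str]:
    """Split 'HTTP-Version Status-Code Reason-Phrase CRLF' into its parts.

    Returns (version, code, reason, status_class). Hypothetical helper.
    """
    line = line.rstrip("\r\n")
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code, *reason = line.split(" ", 2)
    if len(code) != 3 or not code.isdigit() or code[0] not in STATUS_CLASSES:
        raise ValueError(f"invalid status code: {code!r}")
    return version, int(code), (reason[0] if reason else ""), STATUS_CLASSES[code[0]]


print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.0 404 Not Found\r\n"))
```

Classifying by the first digit lets a client handle a status code it has never seen before as long as it knows the class.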

IV. HTTP in detail: message headers

HTTP messages consist of requests from client to server and responses from server to client. Both request and response messages consist of a start line (the request line for a request, the status line for a response), message headers (optional), an empty line (a line containing only CRLF), and a message body (optional).

HTTP message headers comprise general headers, request headers, response headers, and entity headers. Each header field consists of name + ":" + space + value. Header field names are case-insensitive.

1. General headers

A few header fields apply to both request and response messages but not to the entity being transferred; they apply only to the message being transmitted.

Cache-Control
Cache-Control specifies caching directives. Caching directives are unidirectional (a directive that appears in a response need not appear in the request) and independent (a directive in one message does not affect how another message is cached). HTTP/1.0 uses a similar header field, Pragma.
Caching directives in a request include: no-cache (indicates that the request or response message must not be cached), no-store, max-age, max-stale, min-fresh, only-if-cached.
Caching directives in a response include: public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, max-age, s-maxage.
E.g.: to instruct the IE browser (the client) not to cache a page, a server-side JSP program can be written as follows:

    response.setHeader("Cache-Control", "no-cache");
    // response.setHeader("Pragma", "no-cache"); has the same effect as the line above; the two are usually used together

This code sets the following general header field in the response message: Cache-Control: no-cache

Date
The Date general header field gives the date and time at which the message was generated.

Connection
The Connection general header field allows the sender to specify options for the connection, for example that the connection is persistent, or a "close" option that tells the server to close the connection after the response is complete.

2. Request headers

Request headers allow the client to pass additional information about the request, and about the client itself, to the server.

Common request headers:

Accept
The Accept request header field specifies which types of information the client accepts. E.g.: Accept: image/gif indicates that the client wants to receive resources in GIF image format; Accept: text/html indicates that the client wants to receive HTML text.

Accept-Charset
The Accept-Charset request header field specifies the character sets the client accepts. E.g.: Accept-Charset: iso-8859-1,gb2312. If the field is not set in the request message, the default is that any character set is acceptable.

Accept-Encoding
The Accept-Encoding request header field is similar to Accept, but it specifies the acceptable content encodings. E.g.: Accept-Encoding: gzip, deflate. If the field is not set in the request message, the server assumes that the client accepts all content encodings.

Accept-Language
The Accept-Language request header field is similar to Accept, but it specifies a natural language. E.g.: Accept-Language: zh-cn. If the field is not set in the request message, the server assumes that the client accepts all languages.

Authorization
The Authorization request header field is mainly used to prove that the client has permission to view a resource.
When a browser accesses a page and receives a 401 (Unauthorized) response code from the server, it can send a request containing the Authorization request header field and ask the server to validate it.

Host (this header field is required when sending a request)
The Host request header field specifies the Internet host and port number of the requested resource. It is usually extracted from the HTTP URL. E.g., if we enter http://www.guet.edu.cn/index.html in the browser, the request message sent by the browser contains the Host request header field as follows:

    Host: www.guet.edu.cn

The default port number 80 is used here. If a port number is specified, the field becomes:

    Host: www.guet.edu.cn:<specified port number>

User-Agent
When we log in to a forum, we often see a welcome message that lists the name and version of our operating system and of the browser we are using, which many people find surprising. In fact, the server application obtains this information from the User-Agent request header field. The User-Agent request header field allows the client to tell the server about its operating system, browser, and other properties. This header field is not required, however: if we write a browser ourselves and do not send the User-Agent request header field, the server cannot learn this information about us.
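Two rules from this section lend themselves to a short Python sketch: header field names are case-insensitive, and the Host value omits the port when it is the default 80. The helper names parse_headers and host_header are hypothetical, and the parser is deliberately simplified (a repeated header name overwrites the earlier value, and folded multi-line headers are not handled).

```python
def parse_headers(block: str) -> dict[str, str]:
    """Parse 'Name: value' lines into a dict keyed by lower-cased name.

    Simplified, hypothetical parser: repeated names overwrite each other.
    """
    headers: dict[str, str] = {}
    for line in block.split("\r\n"):
        if not line:
            break  # the empty line ends the header section
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        # Names are case-insensitive, so normalise them before storing.
        headers[name.strip().lower()] = value.strip()
    return headers


def host_header(host: str, port: int = 80) -> str:
    """Value of the Host field: the port is omitted when it is the default."""
    return host if port == 80 else f"{host}:{port}"


raw = "Accept: image/gif\r\nACCEPT-LANGUAGE: zh-cn\r\nhost: www.guet.edu.cn\r\n\r\n"
headers = parse_headers(raw)
print(headers["accept-language"])
print(host_header("www.guet.edu.cn", 8080))
```

Lower-casing the names on the way in means lookups work regardless of how the sender capitalised them.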

Request header example:

    GET /form.html HTTP/1.1 (CRLF)
    Accept: image/gif,image/x-xbitmap,image/jpeg,application/x-shockwave-flash,application/vnd.ms-excel,application/vnd.ms-powerpoint,application/msword,*/* (CRLF)
    Accept-Language: zh-cn (CRLF)
    Accept-Encoding: gzip,deflate (CRLF)
    If-Modified-Since: Wed, 05 Jan 11:21:25 GMT (CRLF)
    If-None-Match: W/"80b1a4c018f3c41:8317" (CRLF)
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) (CRLF)
    Host: www.guet.edu.cn (CRLF)
    Connection: Keep-Alive (CRLF)
    (CRLF)

3. Response headers

Response headers allow the server to pass additional response information that cannot be placed in the status line, as well as information about the server and about further access to the resource identified by the Request-URI.

Common response headers:

Location
The Location response header field redirects the recipient to a new location. It is commonly used when a domain name changes.

Server
The Server response header field contains information about the software the server uses to process the request. It corresponds to the User-Agent request header field. An example:

    Server: Apache-Coyote/1.1

WWW-Authenticate
The WWW-Authenticate response header field must be included in a 401 (Unauthorized) response message. When the client receives a 401 response and then sends a request with the Authorization header field asking the server to validate it, the server's response headers contain this field. E.g.:

    WWW-Authenticate: Basic realm="Basic Auth test!"

This shows that the server uses the Basic authentication mechanism for the requested resource.

4. Entity headers

Both request and response messages can carry an entity. An entity consists of entity header fields and an entity body, but this does not mean that the two must be sent together; the entity header fields can be sent on their own.
Entity headers define metainformation about the entity body or, if there is no entity body, about the resource identified by the request.

Common entity headers:

Content-Encoding
The Content-Encoding entity header field is used as a modifier of the media type. Its value indicates which additional content encodings have been applied to the entity body, and therefore which decoding mechanisms must be applied to obtain the media type referenced in the Content-Type header field. Content-Encoding is used, for example, to record how a document was compressed. E.g.: Content-Encoding: gzip

Content-Language
The Content-Language entity header field describes the natural language used by the resource. If the field is not set, the entity content is assumed to be intended for readers of all languages. E.g.: Content-Language: da

Content-Length
The Content-Length entity header field indicates the length of the entity body, as a decimal number of bytes.

Content-Type
The Content-Type entity header field indicates the media type of the entity body sent to the recipient. E.g.:

    Content-Type: text/html;charset=ISO-8859-1
    Content-Type: text/html;charset=GB2312

Last-Modified
The Last-Modified entity header field indicates the date and time at which the resource was last modified.

Expires
The Expires entity header field gives the date and time at which the response expires. So that a proxy server or browser refreshes its cache after a certain time (pages that were visited before are loaded directly from the cache when visited again, which shortens response time and reduces server load), we can use the Expires entity header field to specify when the page expires. E.g.: Expires: Thu, 15 Sep 2006 16:23:12 GMT
HTTP/1.1 clients and caches must treat other, invalid date formats (including 0) as already expired.
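The Expires rule above (a valid HTTP date is compared with the current time, anything unparsable such as 0 counts as expired) can be sketched with Python's standard email.utils module, which reads and writes this date format. The function names http_date and is_expired are hypothetical, and treating a date without time zone information as GMT is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment: datetime) -> str:
    """Format an aware datetime as an HTTP date, e.g. for Date or Expires."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def is_expired(expires_value: str, now: datetime) -> bool:
    """Return True if an Expires value lies in the past or cannot be parsed."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Invalid formats, including "0", are treated as already expired.
        return True
    if expires_at.tzinfo is None:
        # Assumption: a date without zone information is taken as GMT.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return expires_at <= now


now = datetime(2007, 3, 8, 7, 17, 51, tzinfo=timezone.utc)
print(http_date(now))
print(is_expired("0", now))
print(is_expired("Thu, 15 Sep 2006 16:23:12 GMT", now))
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the check easy to test.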
E.g.: to keep the browser from caching a page, we can also use the Expires entity header field and set it to 0. In a JSP program:

    response.setDateHeader("Expires", 0);

V. Using Telnet to observe the HTTP communication process

Purpose and principle: using the MS Telnet tool, we type HTTP request messages by hand and send them to a server. The server receives, interprets, and accepts the request and returns a response, which is displayed in the Telnet window. This gives a more concrete feel for how HTTP communication works.

Steps:

1. Open Telnet
1.1 Open Telnet: Run --> cmd --> telnet
1.2 Turn on Telnet local echo: set localecho

2. Connect to the server and send a request
2.1 open www.guet.edu.cn 80    // note: the port number cannot be omitted
    HEAD /index.asp HTTP/1.0
    Host: www.guet.edu.cn

    /* We can change the request method to request the content of the Guilin Electronic home page; enter the following message */
    open www.guet.edu.cn 80
    GET /index.asp HTTP/1.0    // request the content of the resource
    Host: www.guet.edu.cn

2.2 open www.sina.com.cn 80    // or enter telnet www.sina.com.cn directly at the command prompt
    HEAD /index.asp HTTP/1.0
    Host: www.sina.com.cn

3. Results

3.1 The response to request 2.1 is:

    HTTP/1.1 200 OK                 // request succeeded
    Server: Microsoft-IIS/5.0       // web server
    Date: Thu, 08 Mar 2007 07:17:51 GMT
    Connection: Keep-Alive
    Content-Length: 23330
    Content-Type: text/html
    Expires: Thu, 08 Mar 2007 07:16:51 GMT
    Set-Cookie: ASPSESSIONIDQAQBQQQB=BEJCDGKADEDJKLKKAJEOIMMH; path=/
    Cache-control: private
    // resource content omitted

3.2 The response to request 2.2 is:

    HTTP/1.0 404 Not Found          // request failed
    Date: Thu, Mar 2007 07:50:50 GMT
    Server: Apache/2.0.54 <Unix>
    Last-Modified: Thu, 2006 11:35:41 GMT
    ETag: "6277a-415-e7c76980"
    Accept-Ranges: bytes
    X-Powered-By: mod_xlayout_jh/0.0.1vhs.markii.remix
    Vary: Accept-Encoding
    Content-Type: text/html
    X-Cache: MISS from zjm152-78.sina.com.cn
    Via: 1.0 zjm152-78.sina.com.cn:80<squid/2.6.stables-20061207>
    X-Cache: MISS from th-143.sina.com.cn
    Connection: close

    Connection to host lost.
    Press any key to continue...

4. Notes:
1. If there is a typing error, the request will not succeed.
2. Header field names are not case-sensitive.
3. For a deeper understanding of the HTTP protocol, see RFC 2616, which can be found at http://www.ietf.org/rfc.
4. Anyone developing server-side programs must master the HTTP protocol.

VI. HTTP-related technical supplement

1. Basics

High-level protocols include the File Transfer Protocol (FTP), the e-mail transfer protocol SMTP, the Domain Name System (DNS), the Network News Transfer Protocol (NNTP), HTTP, and others.

There are three kinds of intermediaries: proxy, gateway, and tunnel. A proxy accepts requests in the absolute form of the URI, rewrites all or part of the message, and forwards the reformatted request to the server identified by the URI. A gateway is a receiving agent that acts as a layer above some other servers and, if necessary, translates requests into the protocol of the underlying server. A tunnel acts as a relay point between two connections without changing the messages. Tunnels are often used when the communication has to pass through an intermediary (for example, a firewall) or when the intermediary cannot understand the content of the messages.

Proxy: An intermediary program that acts as both a server and a client in order to make requests on behalf of other clients. Requests are serviced internally or passed on, possibly after translation, to other servers. A proxy must interpret a request message and, if necessary, rewrite it before forwarding it. Proxies are often used as client-side portals through firewalls, and as helper applications for handling requests over protocols that the user agent does not implement.

Gateway: A server that acts as an intermediary for other servers.
Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource, and the requesting client may not be aware that it is communicating with a gateway. Gateways are often used as server-side portals through firewalls, and as protocol translators for access to resources stored on non-HTTP systems.

Tunnel: An intermediary program that acts as a relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, although it may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are often used when a portal is required or when the intermediary cannot interpret the relayed traffic.

2. Advantages of protocol analysis: detecting network attacks with an HTTP analyzer

Analysing and processing high-level protocols in a modular way will be the direction of future intrusion detection. The common ports for HTTP and its proxies (80, 3128, and 8080) are specified with the port tag in the network section.

3. Denial of service caused by the HTTP Content-Length limit vulnerability

When the POST method is used, Content-Length can be set to define the length of the data to be transferred, for example Content-Length: 999999999. The memory is not released until the transfer is complete, so an attacker can exploit this flaw by continuously sending junk data to the web server until its memory is exhausted. This kind of attack leaves almost no trace.
http://www.cnpaf.net/Class/HTTP/0532918532667330.html

4. Some ideas for denial-of-service attacks that exploit features of the HTTP protocol

The server is kept busy handling the attacker's forged TCP connection requests and has no time for the normal requests of its clients (the share of normal client requests is, after all, very small). From the point of view of a normal client, the server has stopped responding.
This is what we call a SYN flood attack on the server. Smurf, Teardrop, and similar attacks use ICMP packet floods and IP fragmentation. The approach described here instead uses "normal connections" to produce a denial of service.

Port 19 was already used early on for chargen attacks (Chargen_Denial_of_Service). Those attacks, however, created a UDP connection between two chargen servers so that the servers went down under the load of processing too much data. To bring down a web server this way, two conditions must therefore hold:
1. It runs a chargen service.
2. It runs an HTTP service.

Method: the attacker forges the source IP address and sends connection requests (connect) to N chargen servers. Once a chargen server accepts the connection, it returns a character stream of 72 bytes per second to the server (in practice the rate is higher, depending on network conditions).

5. HTTP fingerprinting

The principle of HTTP fingerprinting is largely the same as that of other fingerprinting: it records the small differences in how different servers implement the HTTP protocol and uses them for identification. HTTP fingerprinting is much more complex than TCP/IP stack fingerprinting. The reason is that customizing an HTTP server's configuration file or adding plug-ins or components can easily change the HTTP response information, which makes identification difficult, whereas customizing the behaviour of the TCP/IP stack requires changes to the kernel, so the stack is easy to identify.

It is very easy to make a server return different banner information. With an open-source HTTP server such as Apache, the user can change the banner in the source code and restart the HTTP service for it to take effect. With HTTP servers whose source code is not public, such as Microsoft's IIS or Netscape, the banner can be changed in the DLL file that stores it. Related articles discuss this, so it is not repeated here. Such modifications are, of course, quite effective. Another way to obscure the banner information is to use plug-ins.

Common test requests:
1: HEAD / HTTP/1.0      send a basic HTTP request
2: DELETE / HTTP/1.0    send a request that is not allowed, such as a DELETE request
3: GET / HTTP/3.0       send a request with an invalid HTTP protocol version
4: GET / JUNK/1.0       send a request with an incorrect protocol specification

The HTTP fingerprinting tool httprint effectively determines the type of an HTTP server by applying statistical principles combined with fuzzy logic. It can be used to collect and analyse the signatures generated by different HTTP servers.

6. Other

To improve the user's experience, modern browsers also support concurrent access: while browsing a page they open several connections at once to fetch the many icons on the page quickly, so that the whole page finishes transferring sooner. HTTP/1.1 provides such persistent connections, and the next-generation HTTP protocol, HTTP-NG, adds support for session control, rich content negotiation, and more, in order to provide more efficient connections.

Thank you, Mr. von Neumann. He built the world's first computer, which let us, his descendants, trade our shotguns for cannons: the "academic bandits" of "scissors and paste" have become the "academic pirates" of "mouse and clipboard".

Thanks to the teachers in charge of the defence. Although I did not really understand what I had written, they asked me only two questions: Do you know what you are writing? Do you know the references? And then I passed the defence. They are such amiable teachers, such considerate teachers, such approachable and great teachers.
