HTTP Protocol: An Analysis of How HTTP Works

Source: Internet
Author: User

Baidu Encyclopedia Description:

The Hypertext Transfer Protocol (HTTP) is one of the most widely used network protocols on the Internet. All WWW documents must comply with this standard. HTTP was originally designed to provide a way to publish and receive HTML pages. In the 1960s the American Ted Nelson conceived a method of processing text information by computer and called it hypertext, which became the foundation on which the HTTP standard was built. The World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) coordinated the work and eventually published a series of RFCs, of which the well-known RFC 2616 defines HTTP/1.1.

HTTP (Hypertext Transfer Protocol) is the protocol used to transfer hypertext from a WWW server to the local browser. It makes browsing more efficient and reduces network traffic. It not only ensures that hypertext documents are transferred correctly and quickly, but also determines which part of a document is transferred and which content is displayed first (for example, text before graphics).

HTTP is an application-layer protocol built on requests and responses, following the standard client/server model. HTTP is a stateless protocol.

Technical Architecture

HTTP is a standard for requests and responses between a client and a server, usually carried over TCP. The client is the end user and the server is the website. Using a web browser, a crawler, or another tool, the client initiates an HTTP request to a specified port on the server (port 80 by default). This client is called the user agent. The responding server stores resources such as HTML files and images, and is called the origin server. Between the user agent and the origin server there may be several intermediate layers, such as proxies, gateways, or tunnels.

Although TCP/IP is the most popular protocol suite on the Internet, HTTP does not require it or the layers it provides. In fact, HTTP can be implemented on top of any other Internet protocol, or on other networks. HTTP only assumes that its underlying protocol provides reliable transport, and any protocol that offers that guarantee can be used.

Typically, an HTTP client initiates a request by establishing a TCP connection to a specified port on the server (port 80 by default). The HTTP server listens on that port for requests from clients. Once it receives a request, the server sends back a status line, such as "HTTP/1.1 200 OK", and a response message whose body may be the requested file, an error message, or some other information.

HTTP uses TCP rather than UDP because a web page may require transferring a large amount of data, and TCP provides transmission control, delivers the data in order, and corrects errors. Resources requested through HTTP or HTTPS are identified by Uniform Resource Identifiers (URIs), or more specifically by URLs.
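The exchange described above (a TCP connection to the server's port, a request, and a status line in return) can be sketched in Python. This is only an illustration under stated assumptions: the handler class HelloHandler, the helper fetch_status_line, and the loopback test server are hypothetical names invented for this example, and the server listens on an OS-chosen port rather than port 80 so that the sketch runs without special privileges.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical origin server: answers every GET with a small HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request logging to stderr


def fetch_status_line(host: str, port: int, path: str = "/") -> str:
    """Open a TCP connection, send a minimal GET request, return the status line."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    response = b"".join(chunks)
    # The status line is everything before the first CRLF.
    return response.split(b"\r\n", 1)[0].decode("ascii")


def demo() -> str:
    # Port 0 asks the operating system for any free port (an assumption made
    # for this sketch; a real web server would normally listen on port 80).
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_status_line("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the raw status line sent by the test server, which shows that the first thing to come back over the TCP connection is the protocol version, the status code, and the reason phrase.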
Protocol Features

HTTP is the application-layer communication protocol between a client (a browser or another program) and a web server. Hypertext information is stored on web servers on the Internet, and the client retrieves the hypertext it wants to access over HTTP. HTTP carries commands and transfer information that can be used not only for web access but also for communication between other Internet/intranet application systems, which makes it possible to integrate hypermedia access to many kinds of application resources.

The website address entered in the browser's address bar is called a URL (Uniform Resource Locator). Just as every household has a street address, every web page has an Internet address. When you type a URL into the browser's address box or click a hyperlink, the URL determines the address to visit. The browser uses HTTP to fetch the page's code from the web server and renders it as a web page.

Protocol Basics

HTTP is the abbreviation of Hypertext Transfer Protocol, which is used to transfer data on the WWW; see RFC 2616 for the details of the protocol. HTTP uses the request/response model.
The client sends the server a request containing the request method, the URI, the protocol version, and a MIME-like message structure carrying request modifiers, client information, and content. The server responds with a status line containing the protocol version of the message and a success or error code, followed by server information, entity meta-information, and possibly the entity content.

HTTP messages are either request messages from client to server or response messages from server to client. Both types consist of a start line, zero or more header fields, an empty line that marks the end of the header fields, and an optional message body. HTTP header fields fall into four groups: general headers, request headers, response headers, and entity headers. Each header field consists of three parts: a field name, a colon (:), and a field value. Field names are case-insensitive, any amount of whitespace may precede the field value, and a header field may be extended over multiple lines by beginning each additional line with at least one space or tab.
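The message layout described above (start line, header fields, empty line, optional body) can be illustrated with a small parser. This is a minimal sketch, not a complete HTTP parser: the function name parse_message is hypothetical, repeated header fields simply overwrite each other, and no validation is performed.

```python
def parse_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body).

    Header names are lower-cased because field names are case-insensitive.
    A line that starts with a space or tab continues the previous field.
    """
    head, _, body = raw.partition(b"\r\n\r\n")  # the empty line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]

    headers = {}
    last_name = None
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and last_name is not None:
            # Continuation line: append it to the previous field value.
            headers[last_name] += " " + line.strip()
            continue
        name, _, value = line.partition(":")
        last_name = name.strip().lower()
        headers[last_name] = value.strip()  # leading whitespace is not significant
    return start_line, headers, body
```

For example, parsing a request whose header is written as HOST: gives a dictionary keyed by host, and a User-Agent value continued on an indented second line is joined into a single value.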

HTTP communication is always initiated by the client with a request, and the server returns a response.

This limits the use of HTTP: the server cannot push a message to the client unless the client has first sent a request.

HTTP is a stateless protocol: a request has no built-in association with the previous request from the same client.

The general header fields are those that both request and response messages support: Cache-Control, Connection, Date, Pragma, Transfer-Encoding, Upgrade, and Via. Extending the general header fields requires both communicating parties to support the extension; an unrecognized general header field is generally treated as an entity header field. Several general header fields commonly used in UPnP messages are introduced below.

1. Cache-Control header field

Cache-Control specifies the caching directives that requests and responses must obey. Directives set in a request or a response do not change the caching behavior applied to the other message. The request directives are no-cache, no-store, max-age, max-stale, min-fresh, and only-if-cached; the response directives are public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, and max-age. The directives have the following meanings:

public: the response may be stored by any cache.
private: all or part of the response is intended for a single user and must not be stored by a shared cache. This lets the server mark parts of a response that are meant for one user and are not valid for requests from other users.
no-cache: a cached copy must not be used to satisfy a request without first being revalidated with the origin server.
no-store: prevents sensitive information from being released or retained inadvertently. When it is sent in a request, neither the request nor its response may be stored by a cache.
max-age: the client accepts a response whose age is no greater than the specified time (in seconds).
min-fresh: the client wants a response that will remain fresh for at least the specified number of seconds.
max-stale: the client accepts a response that has exceeded its expiration time.
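To make the directive syntax concrete, the following sketch splits a Cache-Control field value into its directives. The function name parse_cache_control is hypothetical, and the sketch assumes simple values: it does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control field value into {directive: argument}.

    Directives without an argument map to True; numeric arguments
    (such as max-age=3600) are converted to int.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if sep:
            arg = arg.strip().strip('"')
            directives[name] = int(arg) if arg.isdigit() else arg
        else:
            directives[name] = True
    return directives
```

A response value of "public, max-age=3600" yields a dictionary with public set to True and max-age set to 3600, which a cache could then compare against the age of its stored copy.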
If max-stale is given a value, the client accepts a response that has exceeded its expiration time by no more than the specified number of seconds.

HTTP Keep-Alive

The Keep-Alive feature keeps the client-to-server connection open, so that subsequent requests to the server avoid establishing or re-establishing a connection. Most web servers on the market, including iPlanet, IIS, and Apache, support HTTP Keep-Alive. The feature is usually beneficial for websites that serve static content. For heavily loaded sites, however, there is a trade-off: keeping connections open for clients has benefits, but it also affects performance, because resources that could have been released during idle periods remain occupied. When the web server and the application server run on the same machine, the impact of Keep-Alive on resource utilization is particularly noticeable.

At the TCP/IP level, Windows provides related keep-alive settings; these govern TCP keep-alive probes rather than HTTP persistent connections themselves. The KeepAliveTime value controls how often TCP/IP attempts to verify that an idle connection is still intact. If there is no activity for this period, a keep-alive packet is sent. If the network is working and the receiver is active, the receiver responds. Consider lowering the value if you need to detect lost receivers more quickly. If long-idle connections are common and lost receivers are rare, consider raising the value to reduce overhead. By default, Windows sends a keep-alive packet when an idle connection has had no activity for 7,200,000 milliseconds (2 hours). A value of 1,800,000 milliseconds is typically preferred, so that half-closed connections are detected within 30 minutes. The KeepAliveInterval value defines how often TCP/IP repeats the keep-alive packet when no response to it has been received from the receiver.
The connection is dropped once the number of consecutive keep-alive packets sent without a response exceeds the value of TcpMaxDataRetransmissions. If you expect longer response times, consider raising the KeepAliveInterval value to reduce overhead. If you need to shorten the time taken to verify that a receiver has been lost, consider lowering this value or the TcpMaxDataRetransmissions value. By default, Windows waits 1,000 milliseconds (1 second) before resending a keep-alive packet that received no response. Set KeepAliveTime according to your needs, for example 10 minutes, and remember to convert the value to milliseconds.

2. Date header field

The Date header field gives the time at which the message was sent, in the format defined by RFC 822. For example: Date: Mon, 31 Dec 2001 04:25:57 GMT. The time is expressed in Greenwich Mean Time; converting it to local time requires knowing the user's time zone.

3. Pragma header field

The Pragma header field carries implementation-specific directives, the most common being Pragma: no-cache. In HTTP/1.1 it has the same meaning as Cache-Control: no-cache.

Request Message

The first line of a request message has the following format:

Method SP Request-URI SP HTTP-Version CRLF

Method is the case-sensitive method to be applied to the Request-URI: OPTIONS, GET, HEAD, POST, PUT, DELETE, or TRACE. GET and HEAD should be supported by all general-purpose web servers; implementing the other methods is optional. The GET method retrieves the information identified by the Request-URI. The HEAD method also retrieves the information identified by the Request-URI, but the server does not return a message body in the response.
The POST method asks the server to accept the entity enclosed in the request; it can be used to submit forms and to send messages to newsgroups, bulletin boards, mailing lists, and databases.

SP denotes a space. Request-URI follows the URI format; when it is an asterisk (*), the request applies not to a particular resource but to the server itself. HTTP-Version is the supported HTTP version, for example HTTP/1.1. CRLF denotes a carriage return followed by a line feed.

The request header fields allow the client to pass the server additional information about the request or about the client itself. The request header fields are Accept, Accept-Charset, Accept-Encoding, Accept-Language, Authorization, From, Host, If-Modified-Since, If-Match, If-None-Match, If-Range, If-Unmodified-Since, Max-Forwards, Proxy-Authorization, Range, Referer, and User-Agent. Extending the request header fields requires support from both parties; an unrecognized request header field is generally treated as an entity header field.

A typical request message (header fields only):

Host: download.*******.de
Accept: */*
Pragma: no-cache
Cache-Control: no-cache
User-Agent: Mozilla/4.04 [en] (Win95; I; Nav)
Range: bytes=554554-

The request line of such a message indicates that the HTTP client (possibly a browser or a download tool) retrieves the file at the specified URL with the GET method. Host, Accept, User-Agent, and Range are request header fields; Pragma and Cache-Control are general header fields.

1. Host header field

The Host header field specifies the Internet host and port number of the requested resource, and must represent the location of the origin server or gateway given by the requested URL. An HTTP/1.1 request must include a Host header field; otherwise the server responds with status code 400.
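The request-line format and the Host requirement described above can be sketched as a small message builder. The function build_request and the host name download.example.com are hypothetical, chosen only for illustration; the sketch performs no escaping or validation beyond the Host check.

```python
def build_request(method, target, host, headers=None, version="HTTP/1.1"):
    """Assemble a request message: request line, header fields, empty line.

    Raises ValueError when an HTTP/1.1 request would lack a Host field,
    mirroring the rule that servers answer such requests with status 400.
    """
    if version == "HTTP/1.1" and not host:
        raise ValueError("an HTTP/1.1 request must carry a Host header field")

    # Request line: Method SP Request-URI SP HTTP-Version
    lines = [f"{method} {target} {version}"]
    if host:
        lines.append(f"Host: {host}")
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

For instance, build_request("GET", "/file.zip", "download.example.com", {"Range": "bytes=554554-"}) produces a request that asks the hypothetical server for the file starting at byte 554554.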
2. Referer header field

The Referer header field allows the client to specify the address of the resource from which the Request-URI was obtained. This lets the server generate lists of back-links for logging, optimized caching, and so on. It also allows obsolete or mistyped links to be traced for maintenance. If the Request-URI was obtained from a source that has no URI of its own, the Referer field must not be sent. If the field value is a partial URI, it should be interpreted relative to the Request-URI.

3. Range header field

The Range header field requests one or more sub-ranges of an entity. For example:

The first 500 bytes: bytes=0-499
The second 500 bytes: bytes=500-999
The last 500 bytes: bytes=-500
Everything from byte 500 onward: bytes=500-
The first and last bytes only: bytes=0-0,-1
Several ranges at once: bytes=500-600,601-999

The server may ignore this request header. If an unconditional GET includes a Range header and the server honors it, the response is returned with status code 206 (Partial Content) instead of 200 (OK).

4. User-Agent header field

The User-Agent header field contains information about the user agent that originated the request.

Response Message

The first line of a response message has the following format:

HTTP-Version SP Status-Code SP Reason-Phrase CRLF

HTTP-Version is the supported HTTP version, for example HTTP/1.1. Status-Code is a three-digit result code. Reason-Phrase is a short textual description of the Status-Code. The Status-Code is intended for automated processing, while the Reason-Phrase is intended for human readers. The first digit of the Status-Code defines the class of the response; the last two digits have no categorization role.
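The status-line format can be illustrated with a short parser that also derives the response class from the first digit of the status code. The names StatusLine, STATUS_CLASSES, and parse_status_line are hypothetical; the class labels follow the five classes defined for HTTP/1.1.

```python
from typing import NamedTuple

# The first digit of the status code selects one of five classes.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


class StatusLine(NamedTuple):
    version: str
    code: int
    reason: str
    status_class: str


def parse_status_line(line: str) -> StatusLine:
    """Parse 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF'."""
    # Split on the first two spaces only; the reason phrase may contain spaces.
    version, code_text, *rest = line.rstrip("\r\n").split(" ", 2)
    if len(code_text) != 3 or not code_text.isdigit():
        raise ValueError(f"status code must be three digits: {code_text!r}")
    code = int(code_text)
    reason = rest[0] if rest else ""
    return StatusLine(version, code, reason, STATUS_CLASSES.get(code // 100, "unknown"))
```

A program would branch on the code or the class, while the reason phrase is only passed along for display.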
The first digit can take five values:

1xx: Informational. The request was received and processing continues.
2xx: Success. The action was successfully received, understood, and accepted.
3xx: Redirection. Further action must be taken to complete the request.
4xx: Client Error. The request contains bad syntax or cannot be fulfilled.
5xx: Server Error. The server failed to fulfill an apparently valid request.

The response header fields allow the server to pass additional information that cannot be placed in the status line; they mainly describe the server and give further information about access to the Request-URI. The response header fields are Age, Location, Proxy-Authenticate, Public, Retry-After, Server, Vary, Warning, and WWW-Authenticate. Extending the response header fields requires support from both communicating parties; an unrecognized response header field is generally treated as an entity header field.

A typical response message:

HTTP/1.0 200 OK
Date: Mon, 31 Dec 2001 04:25:57 GMT
Server: Apache/1.3.14 (Unix)
Content-Type: text/html
Last-Modified: Tue, 17 Apr 2001 06:46:28 GMT
ETag: "a030f020ac7c01:1e9f"
Content-Length: 39725426
Content-Range: bytes 55******/40279980

The first line of this example is the server's status line in response to a GET request. Date is a general header field, Server is a response header field, and fields such as Content-Type, Content-Length, and Content-Range are entity header fields.

1. Location response header

The Location response header redirects the recipient to a new URI.

2. Server response header

The Server response header contains information about the software the origin server uses to handle the request. The field may contain multiple product tokens and comments, with the product tokens generally listed in order of importance.
Entity Information

Both request and response messages may contain an entity, which generally consists of entity header fields and an entity body. The entity header fields carry meta-information about the entity and include Allow, Content-Base, Content-Encoding, Content-Language, Content-Length, Content-Location, Content-MD5, Content-Range, Content-Type, ETag, Expires, Last-Modified, and extension-header. The extension-header mechanism allows new entity header fields to be defined, though recipients may not recognize them. An entity body is a stream of bytes whose encoding is described by Content-Encoding and Content-Type and whose length is given by Content-Length or Content-Range.

1. Content-Type entity header

The Content-Type entity header indicates the media type of the entity sent to the recipient or, for the HEAD method, the media type that would have been sent had the request been a GET.

2. Content-Range entity header

The Content-Range entity header specifies where in the complete entity a partial entity should be inserted, and also indicates the total length of the complete entity. When the server returns a partial response to the client, it must describe both the range covered by the response and the length of the entire entity. General format:

Content-Range: bytes-unit SP first-byte-pos-last-byte-pos/entity-length

For example, the first 500 bytes of a 1234-byte entity are sent with the field Content-Range: bytes 0-499/1234. When an HTTP message contains this field (for example, in a response to a range request or to a request for a set of overlapping ranges), Content-Range gives the range transferred and Content-Length gives the number of bytes actually transferred.

3. Last-Modified entity header

The Last-Modified entity header gives the time at which the content stored on the server was last modified.
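The relationship between a Range request header and the Content-Range and Content-Length of the partial response can be worked through in code. This is a sketch under simplifying assumptions: the function names resolve_range and content_range are hypothetical, and only a single byte range is handled (a multi-range value such as bytes=0-0,-1 is rejected).

```python
def resolve_range(spec: str, entity_length: int):
    """Resolve a single-range spec such as 'bytes=0-499' against an entity.

    Returns (first, last) as inclusive byte positions, or None when the
    range cannot be satisfied for an entity of the given length.
    """
    unit, _, range_spec = spec.partition("=")
    if unit.strip() != "bytes" or "," in range_spec:
        raise ValueError("only a single 'bytes' range is supported in this sketch")
    first_text, sep, last_text = range_spec.strip().partition("-")
    if not sep:
        raise ValueError(f"malformed range: {spec!r}")

    if first_text == "":
        # Suffix form 'bytes=-N': the last N bytes of the entity.
        suffix = int(last_text)
        if suffix == 0:
            return None
        first = max(entity_length - suffix, 0)
        last = entity_length - 1
    else:
        first = int(first_text)
        # Open-ended form 'bytes=N-' runs to the end of the entity.
        last = int(last_text) if last_text else entity_length - 1
        last = min(last, entity_length - 1)

    if entity_length == 0 or first > last:
        return None
    return first, last


def content_range(first: int, last: int, entity_length: int) -> str:
    """Format the Content-Range field value for a partial response."""
    return f"bytes {first}-{last}/{entity_length}"
```

For a 1234-byte entity, the request bytes=0-499 resolves to positions 0 through 499, the response carries Content-Range: bytes 0-499/1234, and Content-Length is last - first + 1, that is, 500.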
Mode of Operation

On the WWW, "client" and "server" are relative concepts that exist only for the duration of a particular connection: a client in one connection may act as a server in another. Information exchange in the HTTP-based client/server model takes place in four stages: establishing the connection, sending the request, sending the response, and closing the connection.

HTTP is based on the request/response paradigm. After a client establishes a connection with the server, it sends a request consisting of the request method, a Uniform Resource Identifier, and the protocol version, followed by MIME-like information containing request modifiers, client information, and possibly content. After receiving the request, the server responds with a status line containing the protocol version of the message and a success or error code, followed by MIME-like information containing server information, entity information, and possibly content.

Put simply, every web server, in addition to storing HTML files, runs a resident HTTP program that responds to user requests. Your browser is an HTTP client: when you type a starting file into the browser or click a hyperlink, the browser sends an HTTP request to the IP address that the URL designates. The resident program receives the request, performs the necessary processing, and returns the requested file.
During this process, the data sent and received over the network is divided into one or more packets. Each packet contains the data to be transferred plus control information that tells the network how to handle the packet. TCP/IP determines the format of each packet. Unless you were told beforehand, you might never know that the information is split into many small pieces for transmission and then reassembled.

Most HTTP communication is initiated by a user agent and consists of a request for a resource on the origin server. The simplest case is a single connection between the user agent (UA) and the origin server (O).

The situation becomes more complex when one or more intermediaries are present in the request/response chain. There are three kinds of intermediary: proxy, gateway, and tunnel. A proxy accepts requests in the absolute form of the URI, rewrites all or part of the message, and forwards the reformatted request toward the server identified by the URI. A gateway is a receiving agent that acts as a layer above some other server and, if necessary, translates the request into the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages. Tunnels are used when communication must pass through an intermediary (such as a firewall) even though the intermediary cannot understand the contents of the messages.

Message Format

HTTP messages consist of requests from client to server and responses from server to client. A request message has the following format:

Request line, general headers, request headers, entity headers, message body

The request line begins with the method, followed by the Request-URI and the HTTP protocol version, and ends with CRLF; the elements are separated by SP. CR and LF are not allowed except in the final CRLF sequence.
The details of the general headers, request headers, and entity headers can be found in the relevant documents. A response message has the following format:

Status line, general headers, response headers, entity headers, message body

The status code is a three-digit integer indicating whether the request was understood and satisfied. The reason phrase is a short textual description of the status code. The status code is intended for automated processing and the reason phrase for human users; the client is not required to examine or display the reason phrase. The details of the general headers, response headers, and entity headers can be found in the relevant documents.

How It Works

A single HTTP operation is called a transaction, and it proceeds in four steps:

1. The client and the server establish a connection. HTTP begins its work as soon as a hyperlink is clicked.
2. Once the connection is established, the client sends the server a request consisting of the request method, a Uniform Resource Identifier (URL), and the protocol version, followed by MIME-like information containing request modifiers, client information, and possibly content.
3. After receiving the request, the server responds with a status line containing the protocol version of the message and a success or error code, followed by MIME-like information containing server information, entity information, and possibly content.
4. The client receives the information returned by the server and displays it on the user's screen through the browser, and then the client disconnects from the server.

If an error occurs in any of these steps, information about the error is returned to the client and shown on the display.
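The four steps of a transaction can be traced with Python's standard http.client module. This is a sketch under stated assumptions: PageHandler and run_transaction are hypothetical names, and a throwaway server on the loopback interface stands in for a real website so that the example is self-contained.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>index</body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Hypothetical server-side program that returns one fixed page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress the default request logging to stderr


def run_transaction():
    """Perform one HTTP transaction and return (status, reason, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port chosen by the OS
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.connect()                       # step 1: establish the connection
        conn.request("GET", "/index.html")   # step 2: send the request
        response = conn.getresponse()        # step 3: receive the response
        status, reason, body = response.status, response.reason, response.read()
        conn.close()                         # step 4: close the connection
        return status, reason, body
    finally:
        server.shutdown()
        server.server_close()
```

A browser performs the same sequence for every page it loads, with rendering of the returned body taking the place of the return statement.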
For the user, all of this is handled by HTTP itself; the user only has to click the mouse and wait for the information to appear.

On the Internet, HTTP communication usually takes place over TCP/IP connections. The default port is TCP 80, but other ports can be used. This does not preclude HTTP from being implemented on top of other protocols on the Internet or on other networks; HTTP only presumes a reliable transport.

The process is like ordering goods by telephone: we call the merchant and say what goods and specifications we need, and the merchant tells us which goods are in stock and which are not. Here we communicate over the telephone line (as HTTP communicates over TCP/IP), but we could equally use fax, as long as the merchant also has a fax machine.

Further reading: http://blog.csdn.net/lmh12506/article/details/7794512
