Transferred from: https://www.jianshu.com/p/0e5b946880b4#
HTTP
The format of an HTTP URL is as follows:
"http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
The scheme and the host are case-insensitive.
HTTP message
An HTTP message is either a request or a response. Both kinds consist of a start line (start-line), zero or more header fields, an empty line marking the end of the header fields (that is, a line with nothing before the CRLF), and a message body (message-body) that may be empty. A well-behaved HTTP client should not add extra CRLFs before or after a message; if it does, the server ignores them.
The value of a header field does not include any leading or trailing LWS (linear whitespace), that is, whitespace before the first non-whitespace character of the field value (field-value) or after the last one. Leading or trailing LWS may be removed without changing the semantics of the field value. Any LWS that appears inside the field content (field-content) may be replaced by a single SP (space). The order of header fields with different names is not significant, but the protocol recommends sending general headers first, then request or response headers, and entity headers last.
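The structure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a complete parser: the function name split_message and the sample message are made up for this example, and the whitespace handling deliberately ignores quoted strings.

```python
import re

def split_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, headers = lines[0], []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and headers:
            # A line starting with whitespace continues the previous header value.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name, value))
    # Remove leading/trailing LWS and replace inner runs of LWS with one SP.
    headers = [(n, re.sub(r"[ \t]+", " ", v.strip())) for n, v in headers]
    return start_line, headers, body

# Hypothetical sample message, used only to exercise the function.
sample = (b"GET /index.html HTTP/1.1\r\n"
          b"Host:   example.com  \r\n"
          b"Accept: text/html,\r\n"
          b"   text/plain\r\n"
          b"\r\n"
          b"hello")
print(split_message(sample))
```

Headers are kept as a list of pairs rather than a dictionary so that repeated field names and their order survive.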
Request message
RFC 2616 defines an HTTP request message as follows:
Request = Request-Line *(( general-header | request-header | entity-header ) CRLF) CRLF [ message-body ]
Here request-header stands for the header fields that are specific to the request.
In other words, an HTTP request message starts with the request line; the header fields begin on the second line and are followed by an empty line that marks the end of the headers, and the message body comes last.
The request line is defined as follows:
// Definition of the request line
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
// Definition of the method
Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "DELETE" | "TRACE" | "CONNECT" | extension-method
// Definition of the resource address (the authority form is used only by CONNECT)
Request-URI = "*" | absoluteURI | abs_path | authority
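To make the grammar concrete, here is a minimal sketch that parses a request line and classifies the Request-URI into one of its four forms. The function name parse_request_line and the form labels are assumptions made for this example, and the classification is a simple heuristic rather than a full URI parser.

```python
# The methods named in the grammar; anything else counts as an extension-method.
METHODS = {"OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"}

def parse_request_line(line: str):
    """Parse 'Method SP Request-URI SP HTTP-Version' (CRLF already removed).

    Returns (method, uri, version, uri_form, is_standard_method).
    """
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError("expected exactly three SP-separated fields")
    method, uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError("bad HTTP-Version")
    # Classify the Request-URI; abs_path is checked before absoluteURI so
    # that a query string containing '://' is not misclassified.
    if uri == "*":
        form = "asterisk"
    elif uri.startswith("/"):
        form = "abs_path"
    elif "://" in uri:
        form = "absoluteURI"
    else:
        form = "authority"
    return method, uri, version, form, method in METHODS

print(parse_request_line("GET /index.html HTTP/1.1"))
print(parse_request_line("CONNECT example.com:443 HTTP/1.1"))
```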
The header fields used in a request message can be general-header, request-header, or entity-header fields (explained later). One of the more special ones is Host: the recipient of a request uses the Request-URI together with the Host header to determine which host the requested resource belongs to, according to the following rules:
1. If the Request-URI is an absolute address (absoluteURI), the host is the one contained in the Request-URI. Any Host header field value in the request should be ignored.
2. If the Request-URI is not an absolute address (absoluteURI) and the request includes a Host header field, the host is determined by the value of that Host header field.
3. If the host determined by rule 1 or rule 2 is not a valid host on the server, the response should be a 400 (Bad Request) error message.
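The three rules above could be sketched roughly as follows. The function name effective_host, the lowercase-keyed headers dictionary, and the valid_hosts set are all assumptions made for this example; a real server would take them from its own configuration.

```python
from urllib.parse import urlsplit

def effective_host(request_uri, headers, valid_hosts):
    """Return the host a request is aimed at, or raise ValueError for a 400.

    headers is assumed to be a dict with lowercase field names;
    valid_hosts is the (hypothetical) set of hosts this server serves.
    """
    if "://" in request_uri and not request_uri.startswith("/"):
        # Rule 1: an absoluteURI carries the host; the Host header is ignored.
        host = urlsplit(request_uri).hostname
    else:
        # Rule 2: otherwise the host comes from the Host header field.
        # Prefixing '//' lets urlsplit separate the host from an optional port.
        host = urlsplit("//" + headers.get("host", "")).hostname
    host = (host or "").lower()
    if host not in valid_hosts:
        # Rule 3: not a valid host on this server -> 400 (Bad Request).
        raise ValueError("400 Bad Request")
    return host

print(effective_host("http://a.example/x", {"host": "b.example"}, {"a.example"}))
print(effective_host("/x", {"host": "B.Example:8080"}, {"b.example"}))
```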
Response message
The response message is almost identical to the request message and is defined as follows:
Response = Status-Line *(( general-header | response-header | entity-header ) CRLF) CRLF [ message-body ]
Apart from using response-header in place of request-header, only the first line differs: the first line of a response message is the status line, which carries the well-known return code.
The Status-Line starts with the protocol version, followed by the return code and then its textual explanation. The three parts are separated by single spaces, and the line ends with a carriage return and line feed. It is defined as follows:
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
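As a small illustration of that definition, the sketch below splits a status line into its three parts. The function name parse_status_line is made up for this example.

```python
def parse_status_line(line: str):
    """Parse 'HTTP-Version SP Status-Code SP Reason-Phrase' (CRLF removed)."""
    # The Reason-Phrase may itself contain spaces, so split at most twice.
    version, code, reason = line.split(" ", 2)
    if not (len(code) == 3 and code.isdigit()):
        raise ValueError("Status-Code must be exactly three digits")
    return version, int(code), reason

print(parse_status_line("HTTP/1.1 404 Not Found"))
```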
Return code
The return code is a three-digit number whose first digit defines the class of the response. There are five classes:
- 1xx: Informational - Request received, continuing process
- 2xx: Success - The action was successfully received, understood, and accepted
- 3xx: Redirection - Further action must be taken in order to complete the request
- 4xx: Client Error - The request contains bad syntax or cannot be fulfilled
- 5xx: Server Error - The server failed to fulfill an apparently valid request
RFC 2616 then lists a series of individual return codes, which are the ones we normally use. Return codes are extensible, however: HTTP/1.1 does not require the communicating parties to understand every individual code. An implementation only needs to understand the five classes defined above, meaning that the first digit of the return code must be used strictly as described in the document, while the remaining two digits are open to extension.
A recipient that does not recognize a return code xyz can treat it as equivalent to x00. A response with an unrecognized return code must not be cached.
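The fallback rule for unrecognized codes can be sketched like this. The KNOWN set is a hypothetical stand-in for whatever codes a particular application understands, and the function name interpret is likewise made up for this example.

```python
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

# Hypothetical: the individual codes this particular application understands.
KNOWN = {200, 204, 301, 304, 400, 404, 500}

def interpret(code: int):
    """Return (effective_code, class_name, recognized) for a return code."""
    first_digit = code // 100
    if not 100 <= code <= 599:
        raise ValueError("return code is outside the five classes")
    if code in KNOWN:
        return code, CLASSES[first_digit], True
    # Unrecognized code xyz: fall back to x00 of the same class.
    # The caller must not cache a response that lands here.
    return first_digit * 100, CLASSES[first_digit], False

print(interpret(404))
print(interpret(431))
```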
Header
RFC 2616 defines four kinds of header fields. Request header fields can be extended reliably only together with a change in the protocol version; new or experimental header fields may still be given request-header semantics if all parties in the communication recognize them. A header field that the recipient does not recognize is treated as an entity header. The four kinds are as follows:
General header fields: applicable to both request and response messages, but they do not apply to the entity being transferred, only to the message itself.
general-header = Cache-Control ; Section 14.9
               | Connection ; Section 14.10
               | Date ; Section 14.18
               | Pragma ; Section 14.32
               | Trailer ; Section 14.40
               | Transfer-Encoding ; Section 14.41
               | Upgrade ; Section 14.42
               | Via ; Section 14.45
               | Warning ; Section 14.46
Request header fields: used by the initiator of the request to pass additional information about the request and to modify how it is handled.
request-header = Accept ; Section 14.1
               | Accept-Charset ; Section 14.2
               | Accept-Encoding ; Section 14.3
               | Accept-Language ; Section 14.4
               | Authorization ; Section 14.8
               | Expect ; Section 14.20
               | From ; Section 14.22
               | Host ; Section 14.23
               | If-Match ; Section 14.24
               | If-Modified-Since ; Section 14.25
               | If-None-Match ; Section 14.26
               | If-Range ; Section 14.27
               | If-Unmodified-Since ; Section 14.28
               | Max-Forwards ; Section 14.31
               | Proxy-Authorization ; Section 14.34
               | Range ; Section 14.35
               | Referer ; Section 14.36
               | TE ; Section 14.39
               | User-Agent ; Section 14.43
Response header fields: used by the server to pass additional information about the response that cannot be placed in the Status-Line, such as further details about the server and the resource.
response-header = Accept-Ranges ; Section 14.5
                | Age ; Section 14.6
                | ETag ; Section 14.19
                | Location ; Section 14.30
                | Proxy-Authenticate ; Section 14.33
                | Retry-After ; Section 14.37
                | Server ; Section 14.38
                | Vary ; Section 14.44
                | WWW-Authenticate ; Section 14.47
Entity header fields: if the message has a message body, the entity headers are meta-information about that body; if there is no message body, they describe the resource identified by the request.
entity-header = Allow ; Section 14.7
              | Content-Encoding ; Section 14.11
              | Content-Language ; Section 14.12
              | Content-Length ; Section 14.13
              | Content-Location ; Section 14.14
              | Content-MD5 ; Section 14.15
              | Content-Range ; Section 14.16
              | Content-Type ; Section 14.17
              | Expires ; Section 14.21
              | Last-Modified ; Section 14.29
              | extension-header
Message body and entity body
If a Transfer-Encoding header is present, the entity body is obtained by decoding the message body; if there is no Transfer-Encoding header, the message body is the entity body.
message-body = entity-body | <entity-body encoded as per Transfer-Encoding>
In a request message, the presence of a Content-Length or Transfer-Encoding header signals that a message body follows. If the request method does not allow a message body (TRACE, for example), the request must not include one; even if the client sends one, the server should ignore it when handling the request.
In a response message, whether a message body is present depends on both the request method and the return code. Responses to HEAD requests, and all 1xx, 204, and 304 responses, carry no message body.
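The check for a response can be written as a short predicate. This is a sketch under the rules of RFC 2616, where the request-method case is HEAD; the function name response_has_body is made up for this example.

```python
def response_has_body(request_method: str, status_code: int) -> bool:
    """Tell whether a response may carry a message body."""
    # A response to HEAD never has a body, whatever the return code is.
    if request_method.upper() == "HEAD":
        return False
    # 1xx, 204 and 304 responses never have a body.
    if 100 <= status_code < 200 or status_code in (204, 304):
        return False
    return True

print(response_has_body("GET", 200))
print(response_has_body("HEAD", 200))
print(response_has_body("GET", 304))
```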
The length of the message body
The length of the message body is determined by the following rules, applied in order:
1. Any response message that must not include a message body (such as 1xx, 204, and 304 responses, and any response to a HEAD request) is always terminated by the first empty line after the header fields.
2. If the message header contains Transfer-Encoding and its value is anything other than identity, the length of the message body is defined by the chunked transfer coding, unless the message is terminated by closing the connection.
3. If the message header contains Content-Length, its value represents both the entity-length and the transfer-length. If Transfer-Encoding is also present, entity-length and transfer-length may differ, and Content-Length is ignored.
4. If the media type of the message is multipart/byteranges and the transfer-length is not otherwise specified, the transfer length is defined by this self-delimiting media type. The sender must not use this type unless it knows the recipient can parse it; a Range header field with multiple byte-range specifiers in an HTTP/1.1 client request implies that the client can parse multipart/byteranges responses.
5. For a response message, the server can also close the connection to mark the end of the message body.
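The rules above lend themselves to a simple decision function. The following is a rough sketch, assuming headers arrive as a dict with lowercase field names; the function name body_framing and the mode labels it returns are made up for this example.

```python
def body_framing(headers, is_response, request_method="GET", status=200):
    """Return a (mode, length) pair describing how the body is delimited."""
    # Rule 1: responses that must not carry a body end at the empty line.
    if is_response and (
        request_method.upper() == "HEAD"
        or 100 <= status < 200
        or status in (204, 304)
    ):
        return "none", 0
    # Rule 2: any Transfer-Encoding other than identity means chunked framing;
    # a Content-Length that is also present is ignored.
    coding = headers.get("transfer-encoding", "identity").strip().lower()
    if coding != "identity":
        return "chunked", None
    # Rule 3: Content-Length gives the length directly.
    if "content-length" in headers:
        return "content-length", int(headers["content-length"])
    # Rule 4: multipart/byteranges delimits itself.
    if headers.get("content-type", "").lower().startswith("multipart/byteranges"):
        return "multipart/byteranges", None
    # Rule 5: only a response can be ended by closing the connection.
    if is_response:
        return "until-close", None
    # A request with neither length header has no message body.
    return "none", 0

print(body_framing({"transfer-encoding": "chunked", "content-length": "5"}, True))
print(body_framing({"content-length": "42"}, False, "POST"))
```

The order of the checks mirrors the order of the rules, which is why Transfer-Encoding is tested before Content-Length.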
The entity body is obtained from the message body, and its type is defined by two headers, Content-Type and Content-Encoding (the latter is usually used for compression). A message with an entity body should include Content-Type; if it does not, the recipient has to guess the media type, and if it cannot guess, it treats the data as application/octet-stream.
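A minimal sketch of how a recipient might apply these two headers is shown below. It handles only gzip and identity codings, skips the guessing step entirely and falls straight back to application/octet-stream, and assumes a headers dict with lowercase field names; the function name decode_entity is made up for this example.

```python
import gzip

def decode_entity(headers, entity_body: bytes):
    """Return (media_type, data) for an entity body."""
    content_type = headers.get("content-type")
    if content_type:
        # Drop parameters such as '; charset=utf-8'.
        media_type = content_type.split(";", 1)[0].strip().lower()
    else:
        # No Content-Type and no guessing in this sketch: use the default.
        media_type = "application/octet-stream"
    coding = headers.get("content-encoding", "identity").strip().lower()
    if coding in ("gzip", "x-gzip"):
        data = gzip.decompress(entity_body)
    elif coding == "identity":
        data = entity_body
    else:
        raise ValueError("unsupported Content-Encoding: " + coding)
    return media_type, data

compressed = gzip.compress(b"<p>hi</p>")
print(decode_entity(
    {"content-type": "text/html; charset=utf-8", "content-encoding": "gzip"},
    compressed,
))
```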
HTTP connection
HTTP/1.1 connections are persistent by default. The motivation is that a client often needs to request a large number of related resources from the same server within a short time. Without persistent connections, every resource would require a new connection; since HTTP runs on top of TCP, each of them would need a three-way handshake to set up the TCP connection, which wastes a great deal of resources.
Persistent connections bring a number of benefits:
- Fewer TCP connections are opened and closed, so the load on every party in the communication is lower.
- Requests can be pipelined, so the requester can send the next message without waiting for the previous result, making fuller use of a single TCP connection.
- Less network traffic, because fewer packets are spent on opening TCP connections.
- Subsequent requests have lower latency, because no time is spent on the TCP handshake.
- Errors can be reported without the penalty of closing the TCP connection and establishing a new one.
An HTTP/1.1 server relies on TCP flow control to cope with temporary overload, and an HTTP/1.1 client that receives an error status from the server while it is still sending a request should stop transmitting immediately, closing the connection if the message cannot be ended cleanly. There are many more details about HTTP connections, which we will go over later.
WebSocket
Judging by RFC publication dates alone, WebSocket is much more recent: HTTP/1.1 dates from 1999, and WebSocket came 12 years later. The WebSocket specification begins by saying that its purpose is to solve the problem of browser-based programs having to issue multiple HTTP requests and resort to long polling when they need to pull resources ... and create
HTTP protocol and WebSocket Protocol (i)