Graphic HTTP reading notes 16-1-26

Source: Internet
Author: User
Tags ack rfc domain name lookup ftp protocol

Diagram http
1.4.2 The TCP protocol, which ensures reliability
In the layered model, TCP sits in the transport layer and provides a reliable byte stream service.
A byte stream service splits large chunks of data into packets called segments so that they are easier to transmit. A reliable transmission service means that the data is delivered to the other party accurately and dependably.
In other words, TCP splits large data into pieces to make it easier to transfer, and it can confirm that the data finally reached the other party.
To confirm that the data reaches its target, TCP uses a three-way handshake. The handshake uses the TCP flags SYN (synchronize) and ACK (acknowledgement).

The sender first sends a packet with the SYN flag to the other party. The receiver replies with a packet carrying the SYN/ACK flags to confirm receipt. Finally, the sender sends back a packet with the ACK flag, which completes the handshake.
If the handshake is interrupted at some stage, TCP sends the same packets again in the same order.

Three-way handshake
Sender: I am sending you a packet marked with SYN.
Receiver: Understood, I received the packet you sent (sends a packet marked with SYN/ACK).
Sender: OK (sends a packet marked with ACK).

Besides the three-way handshake, TCP has other mechanisms that ensure reliable communication.
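As a rough illustration of the exchange described above, the following Python sketch models the three packets of the handshake. It is only a simulation: the function name `three_way_handshake` and the initial sequence numbers 100 and 300 are assumptions made for the example, and real TCP is handled by the operating system rather than by application code.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three packets of a simulated TCP handshake, in order.

    client_isn and server_isn are hypothetical initial sequence numbers.
    """
    packets = []
    # 1. The client opens with SYN, carrying its initial sequence number.
    packets.append({"from": "client", "flags": "SYN",
                    "seq": client_isn, "ack": None})
    # 2. The server acknowledges it (ack = client seq + 1) and sends its own SYN.
    packets.append({"from": "server", "flags": "SYN/ACK",
                    "seq": server_isn, "ack": client_isn + 1})
    # 3. The client acknowledges the server's SYN, completing the handshake.
    packets.append({"from": "client", "flags": "ACK",
                    "seq": client_isn + 1, "ack": server_isn + 1})
    return packets


for packet in three_way_handshake(client_isn=100, server_isn=300):
    print(packet["from"], packet["flags"], packet["seq"], packet["ack"])
```

Each acknowledgement number is the other side's sequence number plus one, which is how each end confirms that the previous packet arrived.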

1.5 The DNS service, responsible for domain name resolution
The DNS (Domain Name System) service is an application-layer protocol, like HTTP. It resolves between domain names and IP addresses.
A computer can be identified by its IP address, or by a host name and domain name such as www.jrange.com.

The DNS protocol looks up the IP address for a domain name, or performs the reverse lookup from an IP address to a domain name.
Client to DNS: I want to visit www.jrange.com. Tell me its IP address.
DNS to client: The IP address for www.jrange.com is 20x.189.103.xxx.
Client to server: Send an access request to 20x.189.103.xxx.
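A minimal sketch of the forward lookup, assuming Python's standard `socket` module and the system resolver. The helper name `resolve` is made up for this example. The demonstration call passes an IP literal so that it runs without network access; passing a real host name would query DNS, and the result would depend on the network.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


# An IP literal is returned unchanged, so this call needs no DNS query.
print(resolve("127.0.0.1"))
```

The reverse direction (IP address to name) is available in the same module as `socket.gethostbyaddr`.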

1.6 The relationship between the various protocols and the HTTP protocol
A diagram still needs to be added here; see "Graphic HTTP", p. 31.
1.7 URI & URL
URI stands for Uniform Resource Identifier. The three words mean the following:
Uniform: a uniform format makes it easy to handle many different types of resources without having to work out the access method from context. It also makes adding new protocol schemes easier.
Resource: a resource is defined as "anything that can be identified". Document files, images, services, and anything else that can be distinguished from other things can be a resource. A resource can be a single item or a collection of items.
Identifier: denotes an object that can be identified.
In summary, a URI is an identifier that locates a resource and is denoted by a protocol scheme.
The protocol scheme is the name of the protocol type used to access the resource.
When using the HTTP protocol, the protocol scheme is http. Other schemes include ftp, mailto, telnet, and file.
A URI identifies an Internet resource with a string.
A URL (Uniform Resource Locator) represents the location of a resource (where it sits on the Internet).

Http://user:[email Protected]:80/dir/index.htm?uid=1#ch1
Protocol scheme name: use http: or https: to specify the protocol used to access the resource. The name is case insensitive and is followed by a colon (:).

Login information (authentication): the user name and password needed to obtain the resource from the server. This part is optional.
Server address: an absolute URI must specify the server to access. The address can be a DNS-resolvable name, an IPv4 address such as 192.168.1.1, or an IPv6 address in square brackets such as [0:0:0:0:0:0:0:1].
Server port number: the network port on the server to connect to. This part is optional; the default port is used when it is omitted.
Hierarchical file path: the path on the server that locates the resource, similar to a file path on a UNIX system.
Query string: passes arbitrary parameters to the resource identified by the file path. This part is optional.
Fragment identifier: usually marks a sub-resource within the retrieved resource. This part is optional.
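The components listed above can be pulled apart with Python's standard `urllib.parse` module. This is a small sketch: the URL has the same shape as the example in these notes, but its password and host name (`pass`, `www.example.com`) are placeholders chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL: the credentials and host name are placeholders.
url = "http://user:pass@www.example.com:80/dir/index.htm?uid=1#ch1"
parts = urlsplit(url)

print(parts.scheme)                     # protocol scheme name
print(parts.username, parts.password)   # login information
print(parts.hostname)                   # server address
print(parts.port)                       # server port number
print(parts.path)                       # hierarchical file path
print(parse_qs(parts.query))            # query string as a dict of lists
print(parts.fragment)                   # fragment identifier
```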

Note: not all applications are RFC-compliant.
The documents that define the technical standards for HTTP are called RFCs (Request for Comments).


Chapter II Simple HTTP protocol

2.1 HTTP is used for communication between client and server
Like many other protocols in the TCP/IP family, HTTP is used for communication between a client and a server.
The side that requests access to a resource such as text or an image is called the client, and the side that provides the resource in response is called the server.

HTTP specifies that the client sends a request and the server responds to it. In other words, communication always starts from the client, and the server does not send a response until it receives a request.
(1) Send request
GET / HTTP/1.1
Host: jrange.com
(2) Send response
HTTP/1.1 200 OK
Date: Tue, 06:50:15 GMT
Content-Length: 362
Content-Type: text/html
...

GET /index.htm HTTP/1.1
Host: jrange.com

GET at the beginning of the start line indicates the type of access requested from the server and is called the method. The string /index.htm that follows identifies the resource being requested and is called the request URI (Request-URI).
Taken together, this request means: request access to the /index.htm page resource on an HTTP server.
A request message consists of the request method, the request URI, the protocol version, optional request header fields, and the content entity.

POST /form/entry HTTP/1.1
Method, URI, protocol version

Host: jrange.com
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 16
Request header fields

name=ueno&age=37
Content entity
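To make the layout of a request message concrete, here is a small Python sketch that splits a raw request into the parts named above. The function `parse_request` is written for this note and handles only the simple case shown here (no repeated or folded header lines).

```python
def parse_request(raw):
    """Split a raw HTTP request into method, URI, version, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")        # a blank line ends the headers
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ", 2)    # the request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # field names are case insensitive
    return method, uri, version, headers, body


raw = (
    "POST /form/entry HTTP/1.1\r\n"
    "Host: jrange.com\r\n"
    "Connection: keep-alive\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 16\r\n"
    "\r\n"
    "name=ueno&age=37"
)
method, uri, version, headers, body = parse_request(raw)
print(method, uri, version)
print(headers["content-length"], len(body))
```

The Content-Length value matches the number of bytes in the content entity, which is how the receiver knows where the body ends.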

The server that receives the request returns the result of processing it as a response.
HTTP/1.1 200 OK
Protocol version, status code, reason phrase

Date: Tue, 06:50:15 GMT
Content-Length: 362
Content-Type: text/html
Response header fields

...
Body

HTTP/1.1 at the beginning of the start line indicates the HTTP version of the server.
The 200 OK that follows is the status code and reason phrase, which indicate the result of processing the request. The next line shows the date and time the response was created and is an attribute within the header fields.
A blank line then separates the headers from the content that follows, which is called the entity body.

A response message basically consists of the protocol version, a status code (a numeric code indicating whether the request succeeded or failed), a reason phrase that explains the status code, optional response header fields, and the entity body.
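The start line of a response can be taken apart in the same spirit. The sketch below uses a helper named `parse_status_line`, invented for this example, together with `http.HTTPStatus` from the Python standard library, which lists the default reason phrase for each registered status code.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split a status line into protocol version, numeric status code and reason phrase."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


version, code, reason = parse_status_line("HTTP/1.1 200 OK")
print(version, code, reason)
# The standard library knows the default reason phrase for the code.
print(HTTPStatus(code).phrase)
# Codes in the 2xx range report success.
print(200 <= code < 300)
```

Splitting at most twice keeps multi-word reason phrases such as "Not Found" in one piece.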

2.3 HTTP is a protocol that does not save state
HTTP is a stateless protocol: it does not persist state. The protocol itself does not keep the communication state between requests and responses. That is, at the HTTP level, the protocol does not persist requests that were sent or responses that were received.
With HTTP, each new request produces a corresponding new response, and the protocol itself keeps no information about any previous request or response.
HTTP was deliberately designed to be this simple so that it can process large numbers of transactions quickly and remain scalable.

Although HTTP/1.1 is a stateless protocol, cookie technology was introduced to provide the state keeping that applications want.
2.4 The request URI locates the resource
HTTP uses URIs to locate resources on the Internet. When a client sends a request for a resource, it must include the URI in the request message as the request URI.
Use the full URI as the request URI:
GET http://jrange.com/index.htm HTTP/1.1

Or specify the domain name or IP address in the Host header field:
GET /index.htm HTTP/1.1
Host: jrange.com

In addition, if the request is addressed to the server itself rather than to a specific resource, the request URI can be replaced with *.
OPTIONS * HTTP/1.1

2.5 HTTP methods that tell the server the intent
GET: get a resource
The GET method requests access to the resource identified by the URI. The server parses the specified resource and returns its content in the response. If the requested resource is text, it is returned as is; if it is a program such as a CGI (Common Gateway Interface) program, the output of running it is returned.
POST: transfer an entity body
Although GET can also carry an entity, it is generally not used that way; POST is used instead. POST is similar to GET in function, but its main purpose is not to obtain the entity body of the response.
PUT: transfer a file
Like file upload in FTP, PUT requires the file contents to be included in the body of the request message, which is then saved at the location specified by the request URI.
However, the PUT method has no authentication mechanism of its own, so anyone could upload files. Because of this security problem, ordinary websites do not use it. PUT may be opened up when combined with the authentication mechanism of a web application, or on sites designed in the REST architectural style.
HEAD: get the message headers
HEAD is the same as GET except that the response contains no body. It is used to check whether a URI is valid and when the resource was last updated.
It returns only the response headers for the resource.
DELETE: delete a file
DELETE deletes a file and is the opposite of PUT. Like PUT, it has no authentication mechanism of its own, so it is generally not opened up.
OPTIONS: ask which methods are supported
OPTIONS queries the methods supported for the resource specified by the request URI (the response lists the methods the server supports, such as GET, POST, HEAD, OPTIONS).
TRACE: trace the path
TRACE makes the server loop the received request back to the client.
When sending the request, the client puts a number in the Max-Forwards header field. The number is decremented by 1 at each server the request passes through. When it reaches 0, forwarding stops, and the server that received the request last returns a response with status code 200 OK.
TRACE is rarely used, and because it can enable XST (Cross-Site Tracing) attacks it is used even less. (Note: the original request may be altered while it is relayed through a proxy server.)
CONNECT: request a tunnel through a proxy
CONNECT asks a proxy server to set up a tunnel so that TCP communication can be carried over a tunneling protocol. It is mainly used with SSL (Secure Sockets Layer) and TLS (Transport Layer Security), which encrypt the communication before it is sent through the network tunnel.
CONNECT proxy-server-name:port-number HTTP-version
Example:
Request
CONNECT proxy.jrange.com HTTP/1.1
Host: proxy.jrange.com
Response
HTTP/1.1 200 OK (the tunnel is entered after this)
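The Max-Forwards countdown described for TRACE earlier in this section can be sketched as a short simulation. The function name `trace_responder` and the server names are hypothetical; the rule modelled here is that a proxy receiving a value of 0 answers the request itself, and otherwise decrements the value and forwards the request.

```python
def trace_responder(max_forwards, chain):
    """Return the name of the server that answers a TRACE request.

    chain lists the servers in the order the request reaches them; the last
    entry is the origin server. The names are hypothetical.
    """
    for server in chain[:-1]:
        if max_forwards == 0:
            return server        # this proxy answers instead of forwarding
        max_forwards -= 1        # each forwarding hop decrements the counter
    return chain[-1]             # the request reached the origin server


chain = ["proxy1", "proxy2", "origin"]
print(trace_responder(0, chain))
print(trace_responder(1, chain))
print(trace_responder(5, chain))
```

Sending the same request with increasing Max-Forwards values reveals each hop in turn, which is what makes TRACE useful for checking the path.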
2.6 Using methods to issue commands
When a request message is sent to the resource specified by the request URI, it uses a command called a method. The method tells the resource what behavior is expected of it.
2.7 Persistent connections save traffic
In the initial version of HTTP, the TCP connection was closed after every HTTP exchange.
Establishing the TCP connection:
SYN
SYN/ACK
ACK
HTTP exchange:
HTTP request
HTTP response
Closing the TCP connection:
FIN
ACK
FIN
ACK
2.7.1 Persistent connections
To solve the TCP connection problem described above, HTTP/1.1 and some HTTP/1.0 implementations introduced persistent connections (HTTP Persistent Connections, also known as HTTP keep-alive or HTTP connection reuse).
With a persistent connection, the TCP connection stays open as long as neither end explicitly asks to close it.
Advantages: it avoids the overhead of repeatedly setting up and tearing down TCP connections and reduces the load on the server. The time saved also lets HTTP requests and responses finish sooner, so web pages display faster.
Of course, the client must support persistent connections as well as the server.
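The saving can be estimated by counting the messages in the sequence listed in section 2.7: three to open the connection, two for the HTTP exchange, and four to close it. The sketch below does only that arithmetic; it is a simplified model that ignores retransmissions, segment sizes, and timing.

```python
HANDSHAKE = 3   # SYN, SYN/ACK, ACK
EXCHANGE = 2    # HTTP request, HTTP response
TEARDOWN = 4    # FIN, ACK, FIN, ACK


def messages_without_keep_alive(requests):
    # Every request pays for its own connection setup and teardown.
    return requests * (HANDSHAKE + EXCHANGE + TEARDOWN)


def messages_with_keep_alive(requests):
    # One connection is opened and closed once and reused for every request.
    return HANDSHAKE + requests * EXCHANGE + TEARDOWN


for n in (1, 5, 10):
    print(n, messages_without_keep_alive(n), messages_with_keep_alive(n))
```

For a single request the two counts are equal; the gap grows with every additional request sent over the same connection.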
2.7.2 Pipelining
Persistent connections make it possible to send multiple requests in a pipelined manner. Previously, after sending a request the client had to wait for the response before sending the next request. With pipelining, the client can send the next request without waiting for the response.
This allows multiple requests to be sent in parallel instead of waiting for each response one by one.
2.8 State management using cookies
Cookie technology manages client state by writing cookie information into request and response messages.
The server sends a header field called Set-Cookie in its response, which tells the client to save the cookie. The next time the client sends a request to that server, it automatically adds the cookie value to the request.
When the server receives the cookie sent by the client, it checks which client the request came from, compares it with the records kept on the server, and recovers the earlier state information.
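The round trip described above can be sketched with `http.cookies.SimpleCookie` from the Python standard library. The cookie name `sid`, its value, and the in-memory `sessions` dictionary are all assumptions made for the example; a real server would choose its own identifier and storage.

```python
from http.cookies import SimpleCookie

# Value of a Set-Cookie header as a server might send it (hypothetical values).
set_cookie_value = "sid=1342077140226724; path=/"

jar = SimpleCookie()
jar.load(set_cookie_value)          # the client stores the cookie

# On the next request the client sends back only the name=value pairs.
cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
print("Cookie:", cookie_header)

# The server looks the identifier up in its own records to recover the state.
sessions = {"1342077140226724": {"user": "ueno"}}   # hypothetical server-side store
print(sessions.get(jar["sid"].value))
```

Attributes such as path stay on the client side; only the name and value travel back to the server in the Cookie header.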

Chapter III HTTP information in HTTP messages
3.1 HTTP messages
The information exchanged over HTTP is called an HTTP message. The message sent by the requesting side (the client) is called a request message, and the message sent by the responding side (the server) is called a response message.
An HTTP message is string text made up of multiple lines of data (with CR+LF as the line break).
3.2 Structure of request messages and response messages
Request line: contains the request method, the request URI, and the HTTP version.
Status line: contains the status code that indicates the result of the response, the reason phrase, and the HTTP version.
Header fields: contain the various conditions and attributes of the request or response. There are generally four kinds of header fields: general headers, request headers, response headers, and entity headers.
Other: header fields not defined in the HTTP RFCs (such as Cookie) may also be included.
3.3 Encoding increases the transfer rate
Encoding data for transmission lets a large number of access requests be handled efficiently. However, the encoding has to be done by the computer, so it consumes more CPU resources.
3.3.1 The difference between the message body and the entity body
Message
The basic unit of HTTP communication. It consists of an octet sequence (an octet is 8 bits) and is transmitted via HTTP communication.
Entity
The payload data (supplementary items) transmitted as a request or response. It consists of the entity headers and the entity body.
The message body is normally identical to the entity body; the two differ only when an encoding applied for transmission changes the content.
3.3.2 Content encoding for compressed transmission
Content encoding specifies the encoding applied to the entity content and compresses the entity while preserving its information. The client receives the content-encoded entity and is responsible for decoding it.
The common content encodings are:
gzip: GNU zip
compress: the standard compression of UNIX systems
deflate: zlib
identity: no encoding
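As a small illustration of the gzip content encoding, the sketch below compresses a repetitive HTML body and restores it with Python's standard `gzip` module. The body is invented for the example, and the compressed size will vary with the input and the library version.

```python
import gzip

# A hypothetical, highly repetitive entity body.
body = b"<html>" + b"<p>Hello, HTTP!</p>" * 200 + b"</html>"

encoded = gzip.compress(body)        # what a server would send with Content-Encoding: gzip
decoded = gzip.decompress(encoded)   # what the client does after receiving it

print(len(body), len(encoded))
print(decoded == body)
```

The client gets back exactly the original bytes, which is what "compresses the entity while preserving its information" means in practice.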
3.3.3 Chunked transfer coding for split transmission
In HTTP communication, the browser cannot display the requested page until the entire encoded entity has been transferred. When transferring large amounts of data, splitting the data into chunks lets the browser display the page progressively.
Splitting the entity body into chunks in this way is called chunked transfer coding (Chunked Transfer Coding).
The client that receives a chunked entity body is responsible for decoding it and restoring the entity body to its form before encoding.
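The following sketch shows one way to encode and decode a chunked body in Python. It assumes the basic wire format only: each chunk is its size in hexadecimal, CR+LF, the data, CR+LF, and a zero-size chunk ends the body. Trailers and chunk extensions are left out, and the function names are made up for this example.

```python
def encode_chunked(body, chunk_size):
    """Encode bytes with chunked transfer coding (no trailers, no extensions)."""
    out = b""
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Each chunk: size in hexadecimal, CRLF, the data, CRLF.
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    return out + b"0\r\n\r\n"        # a zero-size chunk marks the end


def decode_chunked(data):
    """Reassemble the original entity body from chunked data."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2       # skip the CRLF that follows the chunk data


encoded = encode_chunked(b"Hello, chunked world!", 8)
print(encoded)
print(decode_chunked(encoded))
```

Because every chunk announces its own size, the receiver can start processing data before the total length is known.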
3.4 Multipart object collections for sending multiple kinds of data
When sending an email, we can write text in the message and add multiple attachments. This works because email uses MIME (Multipurpose Internet Mail Extensions), which lets a message carry many different types of data such as text, images, and video.
Likewise, HTTP adopted multipart object collections: a single message body can contain multiple types of entities. They are typically used when uploading images or text files.
Multipart object collections include the following:
multipart/form-data
Used for file uploads from web forms.
multipart/byteranges
Used when a response message with status code 206 (Partial Content) contains multiple ranges of content.
When using a multipart object collection in an HTTP message, add a Content-Type header field.
A boundary string separates the entities in a multipart object collection. Each entity is preceded by a line made of "--" followed by the boundary string, and the collection ends with a line made of "--", the boundary string, and a closing "--".
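To show where the boundary markers go, here is a minimal Python sketch that builds a multipart/form-data body from text fields. The function name `build_form_data`, the boundary `AaB03x`, and the field values are assumptions for the example; file parts, which carry additional headers, are not covered.

```python
def build_form_data(fields, boundary):
    """Build a multipart/form-data body from a dict of text fields."""
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")        # "--" + boundary starts each part
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")                     # a blank line ends the part headers
        lines.append(value)
    lines.append(f"--{boundary}--")          # the trailing "--" closes the collection
    return "\r\n".join(lines) + "\r\n"


boundary = "AaB03x"   # hypothetical boundary string
body = build_form_data({"name": "ueno", "age": "37"}, boundary)
print(f"Content-Type: multipart/form-data; boundary={boundary}")
print(body)
```

The boundary must not occur inside any of the parts, which is why real clients usually generate a long random string for it.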
3.5 Range requests for getting partial content
To resume an interrupted download from where it stopped, the client needs to specify which part of the entity to download. A request that specifies a range like this is called a range request.
GET /tip.jpg HTTP/1.1
Host: www.jrange.com
Range: bytes=5001-10000

Range: bytes=5001-
Range: bytes=-3000, 5000-7000
A range request is answered with a response message whose status code is 206 Partial Content. For a request with multiple ranges, the response carries multipart/byteranges in its Content-Type header field.
If the server cannot handle the range request, it returns status code 200 OK together with the complete entity content.
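A rough Python sketch of how a server might select bytes for a Range header value. The function name `apply_range` is invented for the example. It follows the HTTP specification in treating both positions of `N-M` as inclusive and a form such as `-N` as the final N bytes, and it skips validation of unsatisfiable ranges.

```python
def apply_range(header_value, data):
    """Return the byte slices selected by a Range header value such as "bytes=0-4, -3"."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only the bytes unit is handled in this sketch")
    slices = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        if first == "":                       # "-N": the final N bytes
            slices.append(data[-int(last):])
        elif last == "":                      # "N-": from byte N to the end
            slices.append(data[int(first):])
        else:                                 # "N-M": both positions are inclusive
            slices.append(data[int(first):int(last) + 1])
    return slices


data = b"0123456789"
print(apply_range("bytes=2-5", data))
print(apply_range("bytes=7-", data))
print(apply_range("bytes=-3, 0-1", data))
```

A request that yields more than one slice corresponds to the multipart/byteranges response described above.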
3.6 Content negotiation returns the most suitable content
When the browser's default language is English, accessing a web page at a given URI displays the English version of that page. This mechanism is called content negotiation.
In content negotiation, the client and server negotiate over the content of the response so that the server can provide the most suitable resource to the client. The decision is based on criteria such as the language, character set, and encoding of the response resource.
The following header fields serve as the basis for the decision:
Accept
Accept-Charset
Accept-Encoding
Accept-Language
Content-Language
Server-driven negotiation
The server performs the negotiation, processing it automatically using the request header fields as a reference.
Agent-driven negotiation
The client performs the negotiation. The user selects manually from a list of options displayed by the browser, or JavaScript on the web page selects automatically.
Transparent negotiation
A combination of server-driven and agent-driven negotiation, in which the server and the client each take part in content negotiation.
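Server-driven negotiation on Accept-Language can be sketched as follows. The example assumes the quality-value syntax of HTTP (`;q=` with a value from 0 to 1, defaulting to 1), which these notes do not cover, and it matches language tags exactly rather than by prefix. The function name and the header values are hypothetical.

```python
def pick_language(accept_language, available):
    """Pick the available language that the client ranks highest, or None."""
    ranked = []
    for item in accept_language.split(","):
        tag, _, params = item.strip().partition(";")
        quality = 1.0                          # a missing q parameter means q=1
        params = params.strip()
        if params.startswith("q="):
            quality = float(params[2:])
        ranked.append((quality, tag.strip().lower()))
    # Highest quality first; the sort is stable, so ties keep the header's order.
    for quality, tag in sorted(ranked, key=lambda pair: pair[0], reverse=True):
        if quality > 0 and tag in available:
            return tag
    return None


print(pick_language("ja, en-us;q=0.7, en;q=0.3", ["en", "en-us"]))
print(pick_language("ja, en-us;q=0.7, en;q=0.3", ["en", "ja"]))
print(pick_language("fr", ["en", "ja"]))
```

When nothing matches, a server would typically fall back to a default language instead of failing the request; that policy is left to the caller here.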
