HTTP Diagram Reading notes

Source: Internet
Author: User
Tags error status code session id ssl connection

Simple understanding of HTTP Basics

Before looking at HTTP itself, it helps to understand the TCP/IP reference model, which is divided into four layers: the application layer, the transport layer, the network layer, and the link layer (data link layer).

Application layer: Provides the services that different applications need, such as HTTP, FTP, and DNS.

Transport layer: Provides end-to-end communication for application-layer entities. TCP, for example, ensures in-order delivery and data integrity.

Network layer: Handles the packets that flow over the network and contains the protocols that decide how each packet travels across the whole network to its destination.

Link layer: Handles data exchange over the physical link and the hardware side of the network connection.

[Figure: TCP/IP communication flow]

[Figure: HTTP encapsulation at each layer]

Protocols and services closely related to HTTP: IP, TCP, DNS

IP is responsible for delivering packets to the destination. Delivery relies on both the IP address and the MAC address: communication between hosts on the same link ultimately uses MAC addresses, and ARP is used to resolve an IP address to the corresponding MAC address.

TCP provides a reliable byte-stream service. It splits large data into smaller segments for transmission and confirms that they have reached the destination.

DNS is responsible for resolving domain names to IP addresses.

URI (Uniform Resource Identifier) and URL (Uniform Resource Locator)

URI: A string that identifies an Internet resource. General form: scheme + host name (with optional port number) + path + query + fragment identifier.

URL: A string that gives the location of a resource on the Internet and the method used to access it; it is the standard address of a resource on the Internet. Composition: protocol (scheme) + host name (with optional port number) + path.

Difference: A URI identifies a resource. A URL is the subset of URIs that also tells you where the resource is located and how to access it.
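
As a rough illustration of the components listed above, the following sketch splits a URL with Python's standard urllib.parse module. The URL itself is a made-up example, not one taken from these notes.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "http://www.example.com:8080/docs/index.html?lang=en#top"

parts = urlsplit(url)
print(parts.scheme)    # protocol (scheme)
print(parts.hostname)  # host name
print(parts.port)      # port number
print(parts.path)      # path
print(parts.query)     # query string
print(parts.fragment)  # fragment identifier
```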

Cookies

HTTP is used for communication between a client and a server through an exchange of requests and responses. It is a stateless protocol: it keeps no state between one request/response exchange and the next, so a request cannot be processed on the basis of an earlier one. Cookies were introduced to make it possible to keep state.
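
A minimal sketch of the idea, using Python's standard http.cookies module: the server hands out a value in a Set-Cookie header, and the client returns it in a Cookie header on later requests. The cookie names and values here (session_id, theme) are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header for the response.
# The cookie name "session_id" and its value are hypothetical.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
response_cookie["session_id"]["path"] = "/"
response_cookie["session_id"]["httponly"] = True
set_cookie_header = response_cookie.output(header="Set-Cookie:")
print(set_cookie_header)

# Later request: the client sends the value back in a Cookie header,
# and the server parses it to recognize the earlier exchange.
request_cookie = SimpleCookie()
request_cookie.load("session_id=abc123; theme=dark")
print(request_cookie["session_id"].value)
```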

Persistent connections

In the initial version of HTTP, the TCP connection was closed after every HTTP request. When the content being transferred was mostly small text this was not a problem, but as pages came to include more and larger resources, opening and closing a connection for every request added a great deal of communication overhead. HTTP/1.1, and some HTTP/1.0 implementations, therefore support persistent connections: as long as neither side explicitly closes the connection, the TCP connection is kept open, and multiple HTTP requests can be sent over it.

HTTP/1.1 keeps connections persistent by default. The header information often carries a Connection: keep-alive field, and you can inspect it, along with the rest of the HTTP request information, in the network panel of the browser's developer tools.

How to turn off a persistent connection: set the Connection header field to close.

Persistent connections also make pipelining possible: several requests can be sent one after another without waiting for each response before sending the next.
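
The rules above can be sketched as a small decision function. This is a simplified illustration, not a complete implementation; the function name is_persistent and the plain dict of headers keyed by "Connection" are assumptions.

```python
def is_persistent(http_version: str, headers: dict) -> bool:
    """Return True if the connection should stay open after this message."""
    # Connection may hold a comma-separated list of tokens; compare case-insensitively.
    tokens = {
        token.strip().lower()
        for token in headers.get("Connection", "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False                 # either side asked to close
    if http_version == "HTTP/1.1":
        return True                  # persistent by default
    return "keep-alive" in tokens    # HTTP/1.0 has to opt in


print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"Connection": "close"}))
print(is_persistent("HTTP/1.0", {"Connection": "Keep-Alive"}))
```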

Content structure of the HTTP request

The information exchanged over HTTP is called an HTTP message. An HTTP message is roughly divided into a message header and a message body, separated by an empty line (carriage return plus line feed). The header starts with the request line (method, request URI, HTTP version) in a request, or the status line (HTTP version, status code, reason phrase) in a response. It is followed by the header fields (conditions and attributes of the request or response) and any other fields (headers not defined by HTTP itself).
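
To make the structure concrete, here is a simplified sketch that splits a raw request into its request line, header fields, and body. The function name parse_request and the sample request are assumptions for illustration; a real parser has to handle many more cases.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into request line parts, header fields, and body."""
    # The empty line (CR LF CR LF) separates the header from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, uri, version = lines[0].split(" ", 2)   # request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


# Hypothetical request message.
raw_request = (
    b"POST /form/entry HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 16\r\n"
    b"\r\n"
    b"name=ueno&age=37"
)

method, uri, version, headers, body = parse_request(raw_request)
print(method, uri, version)
print(headers["host"], headers["content-length"])
print(body)
```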

Header field

Header fields tell the server how to handle the request and the client how to handle the response. They fall into four types: request header fields (used in request messages), response header fields (used in response messages), general header fields (used in both requests and responses), and entity header fields (used for the entity part of a message).

HTTP/1.1 header field list

General header fields

Header field name      Description
Cache-Control          Controls caching behavior
Connection             Hop-by-hop headers and connection management
Date                   Date and time the message was created
Pragma                 Message directives
Trailer                List of header fields at the end of the message
Transfer-Encoding      Transfer encoding of the message body
Upgrade                Upgrade to another protocol
Via                    Information about proxy servers
Warning                Error notification

Request header fields

Header field name      Description
Accept                 Media types the user agent can handle
Accept-Charset         Preferred character sets
Accept-Encoding        Preferred content encodings
Accept-Language        Preferred (natural) languages
Authorization          Web authentication information
Expect                 Specific behavior expected of the server
From                   E-mail address of the user
Host                   Server that holds the requested resource
If-Match               Compares entity tags (ETag)
If-Modified-Since      Compares the resource's update time
If-None-Match          Compares entity tags (opposite of If-Match)
If-Range               Sends a byte range request for the entity if the resource has not been updated
If-Unmodified-Since    Compares the resource's update time (opposite of If-Modified-Since)
Max-Forwards           Maximum number of forwarding hops
Proxy-Authorization    Authentication information the proxy server requires from the client
Range                  Byte range request for the entity
Referer                URI from which the request URI was obtained
TE                     Preferred transfer encodings
User-Agent             Information about the HTTP client program

Response header fields

Header field name      Description
Accept-Ranges          Whether byte range requests are accepted
Age                    Estimated time elapsed since the resource was created
ETag                   Information for matching the resource
Location               Redirects the client to the specified URI
Proxy-Authenticate     Authentication information the proxy server sends to the client
Retry-After            When the client should issue the request again
Server                 Information about the HTTP server installation
Vary                   Cache management information for proxy servers
WWW-Authenticate       Authentication information the server sends to the client

Entity header fields

Header field name      Description
Allow                  HTTP methods the resource supports
Content-Encoding       Encoding applied to the entity body
Content-Language       Natural language of the entity body
Content-Length         Size of the entity body (in bytes)
Content-Location       Alternate URI for the corresponding resource
Content-MD5            Message digest of the entity body
Content-Range          Position range of the entity body
Content-Type           Media type of the entity body
Expires                Date and time at which the entity body expires
Last-Modified          Date and time the resource was last modified

In addition, some header fields defined in other RFCs, such as Cookie, Set-Cookie, and Content-Disposition, are also used frequently.

Transfer encoding

When HTTP transfers data, it can send the data as-is or encode (compress) it first to improve the transfer rate. Encoding reduces the amount of data transferred, which helps a server handle a large number of requests. The common content encodings are:

· gzip (GNU zip)
· compress (the standard UNIX compression)
· deflate (zlib)
· identity (no encoding)
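
As a small illustration of what these encodings do to a body, the sketch below compresses a sample payload with Python's standard gzip and zlib modules and then reverses it, as a receiver would. The payload is made up for the example.

```python
import gzip
import zlib

# Hypothetical response body; repetitive text compresses well.
body = b"<p>Hello, HTTP!</p>" * 100

gzip_body = gzip.compress(body)       # what Content-Encoding: gzip carries
deflate_body = zlib.compress(body)    # what Content-Encoding: deflate (zlib format) carries

print(len(body), len(gzip_body), len(deflate_body))

# The receiver reverses the encoding to recover the original body.
restored_gzip = gzip.decompress(gzip_body)
restored_deflate = zlib.decompress(deflate_body)
print(restored_gzip == body, restored_deflate == body)
```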

Multi-Part object collection

HTTP adopts multipart object collections, which allow one message body to contain several entities of different types. They are mainly used when uploading files or images, and the type is specified in the Content-Type header field. Some common top-level media types:

Text: Textual information, which may use various character sets and formats.

Multipart: A message body made up of several parts, each of which may hold a different type of data.

Application: Application data or binary data.

Range request

A range request fetches only part of a resource, which requires specifying the byte range of the entity in the Range header field. For example, to fetch bytes 300 through 3000 of a large file, set Range: bytes=300-3000. To fetch bytes 300 through 3000 and everything from byte 5000 to the end, set Range: bytes=300-3000,5000-
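
The following sketch shows one way a server might interpret such a Range value. The function name parse_range and the 10000-byte resource size are assumptions for illustration, and error handling is deliberately minimal.

```python
def parse_range(header: str, size: int):
    """Turn a 'bytes=...' Range value into inclusive (start, end) byte offsets."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    ranges = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        if first == "":
            # Suffix form "-N": the final N bytes of the resource.
            start, end = max(size - int(last), 0), size - 1
        else:
            start = int(first)
            # Open-ended form "N-": from N to the end of the resource.
            end = int(last) if last else size - 1
        end = min(end, size - 1)   # clamp to the size of the resource
        if start <= end:
            ranges.append((start, end))
    return ranges


# Assumed resource size of 10000 bytes.
print(parse_range("bytes=300-3000,5000-", 10000))
```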

Content negotiation

Content negotiation means that the client and the server negotiate over the resource to be returned, so that the server can provide the representation best suited to the client. The decision is based on properties of the resource such as language, character set, and encoding. The following header fields are involved:

· Accept

· Accept-Charset

· Accept-Encoding

· Accept-Language

· Content-Language

There are three types of content negotiation:

Server-driven negotiation: The server uses the request header fields as a reference, makes the selection itself, and returns the corresponding resource.

Agent-driven (client-driven) negotiation: The client makes the choice. The user selects manually from a list of options offered by the browser, or a JavaScript script on the page selects automatically.

Transparent negotiation: A combination of server-driven and agent-driven negotiation. When a cache has been supplied with the list of available representations of a resource and understands the dimensions along which they vary, it can perform server-driven negotiation on behalf of the origin server for subsequent requests for that resource.
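
Server-driven negotiation can be sketched roughly as follows for the Accept-Language header. The helper names parse_accept and choose_language, the list of available languages, and the exact-match rule are simplifying assumptions; real servers apply more elaborate matching.

```python
def parse_accept(header: str):
    """Return (value, q) pairs from an Accept-* header, highest quality first."""
    items = []
    for part in header.split(","):
        value, *params = [piece.strip() for piece in part.split(";")]
        q = 1.0                               # default quality when no q is given
        for param in params:
            name, _, number = param.partition("=")
            if name.strip() == "q":
                q = float(number)
        if value:
            items.append((value, q))
    # sorted() is stable, so equal-quality values keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose_language(header: str, available: list, default: str) -> str:
    """Pick the first acceptable language the server can actually provide."""
    for lang, q in parse_accept(header):
        if q > 0 and lang in available:
            return lang
    return default


print(choose_language("zh-CN,zh;q=0.7,en-US;q=0.3", ["en-US", "zh"], "en-US"))
```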


HTTP methods and status codes

HTTP methods

HTTP defines methods that tell the server what action to perform on the requested resource. The most commonly used are GET and POST, which everyone is surely familiar with.

Methods supported by HTTP/1.1 and HTTP/1.0

Method     Description                                     Supported HTTP versions
GET        Get a resource                                  1.0, 1.1
POST       Transfer an entity body                         1.0, 1.1
PUT        Transfer a file                                 1.0, 1.1
HEAD       Get the message header                          1.0, 1.1
DELETE     Delete a file                                   1.0, 1.1
OPTIONS    Ask which methods are supported                 1.1
TRACE      Trace the path                                  1.1
CONNECT    Ask a proxy to establish a tunnel connection    1.1
LINK       Establish a link with a resource                1.0 (obsolete)
UNLINK     Remove a link                                   1.0 (obsolete)

HTTP status codes

The HTTP status code reports the result of the client's HTTP request. From the status code the user can tell whether the request had a problem and what kind of problem it was. The classes of status codes are:

Status code    Category
1XX            Informational
2XX            Success
3XX            Redirection
4XX            Client Error
5XX            Server Error

Some common status codes are:

200 OK                       The request was processed normally
204 No Content               The server received and successfully processed the request, but the response contains no entity body
206 Partial Content          The client made a range request and the server successfully carried out the GET for that range
301 Moved Permanently        Permanent redirect
302 Found                    Temporary redirect
303 See Other                The requested resource has another URI, and the GET method should be used to fetch it
304 Not Modified             The client sent a conditional request; the server allows access to the resource, but the condition was not met, so no body is returned
401 Unauthorized             The request requires HTTP authentication information
403 Forbidden                The server refused access to the requested resource
404 Not Found                The server could not find the requested resource
500 Internal Server Error    An error occurred on the server while it was executing the request
503 Service Unavailable      The server is currently unable to process the request
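
The first digit of a status code gives its class, which the following sketch makes explicit. It uses Python's standard http.HTTPStatus for the reason phrases; the describe helper and the CLASSES table are illustrative names, not part of any library.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return the code, its reason phrase, and its class as one string."""
    category = CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "(no standard reason phrase)"   # code not known to HTTPStatus
    return f"{code} {phrase}: {category}"


for code in (200, 204, 301, 404, 503):
    print(describe(code))
```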

HTTP proxies and caches

Proxy

A proxy is an application with a forwarding function: it forwards the client's request to the server and forwards the server's response back to the client. The proxy does not change the request URI; it passes the request straight on to the server that holds the resource.

Several proxy servers can be chained in an HTTP exchange. Each one adds a Via header field recording the host the message passed through.

Cache

A cache is a copy of a resource kept on a proxy server or on the client's local disk. Using a cache reduces accesses to the origin server, which saves traffic and communication time and gives a better interactive experience.

If the requested resource is already cached, the cache server returns it directly to the client, or the client reads it from its local disk. A cached copy can be given a validity period; once it expires, the client or cache server requests the resource from the origin server again.
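
The expiry behavior described above can be sketched with a tiny in-memory cache. Everything here (the SimpleCache class, the fake_origin function, the 60-second lifetime, the injectable clock) is an assumption made for illustration; real HTTP caches follow the Cache-Control and Expires rules in much more detail.

```python
import time


class SimpleCache:
    """Tiny in-memory cache keyed by URI, with a per-entry expiry time."""

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch      # callable that asks the origin server
        self._clock = clock      # injectable so the example can control time
        self._entries = {}       # uri -> (expires_at, body)

    def get(self, uri, max_age=60):
        entry = self._entries.get(uri)
        now = self._clock()
        if entry is not None and now < entry[0]:
            return entry[1]                      # still valid: serve the copy
        body = self._fetch(uri)                  # missing or expired: go to the origin
        self._entries[uri] = (now + max_age, body)
        return body


calls = []

def fake_origin(uri):
    """Stand-in for the origin server; records each access."""
    calls.append(uri)
    return f"body of {uri}"

now = [0.0]
cache = SimpleCache(fake_origin, clock=lambda: now[0])
cache.get("/index.html")    # first request reaches the origin
cache.get("/index.html")    # served from the cache
now[0] = 61.0               # the cached copy has expired
cache.get("/index.html")    # requested from the origin again
print(len(calls))
```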

HTTP security upgrade: HTTPS

Having covered what HTTP does well, let's look at its shortcomings:

· Communication is in plaintext (unencrypted), so the content can be eavesdropped on

· The identity of the other party is not verified, so requests and responses can be spoofed

· The integrity of a message cannot be proven, so it may have been tampered with

Communication content can be eavesdropped on at any point on the Internet.

Because of the way TCP/IP works, communication content can be observed anywhere along the communication path. Encryption does not prevent the traffic from being observed: the encrypted message itself can still be seen, but an observer is unlikely to be able to work out what it means.

Eavesdropping is typically done by collecting the packets that flow over the network, which packet capture and sniffing tools make easy. This is also how accounts used over public WiFi can be stolen.

Against plaintext transmission, the communication channel or the message body (the transmitted content) can be encrypted.

For authentication, certificates can be used, for example by installing a certificate locally or storing authentication information.

For integrity, hash checks (MD5, SHA-1) and digital signatures can be used.
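
A hash check of the kind mentioned above can be sketched with Python's standard hashlib. The example uses SHA-256 rather than the MD5 or SHA-1 named in the notes, since those two are now considered weak; the message contents and the verify helper are made up for illustration. Note that a bare hash only helps if the expected digest itself reaches the receiver through a trusted channel.

```python
import hashlib
import hmac

# Hypothetical message and the digest published alongside it.
message = b"amount=100&to=alice"
digest = hashlib.sha256(message).hexdigest()


def verify(received: bytes, expected_hex: str) -> bool:
    """Recompute the digest of the received data and compare it with the expected one."""
    actual = hashlib.sha256(received).hexdigest()
    return hmac.compare_digest(actual, expected_hex)   # constant-time comparison


print(verify(message, digest))                    # unmodified message
print(verify(b"amount=900&to=mallory", digest))   # tampered message
```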

From HTTP to HTTPS

HTTP has no encryption mechanism of its own, but it can be combined with SSL (Secure Sockets Layer) or TLS (Transport Layer Security). Once SSL has established a secure communication channel, ordinary HTTP communication can take place over it. HTTP combined with SSL is called HTTPS (or HTTP over SSL). Encryption alone, however, is not yet complete HTTPS:

Full HTTPS = HTTP + encryption + authentication + integrity protection

A full HTTPS request

1. The client sends a Client Hello message to start SSL communication. The message contains the SSL version the client supports, a list of cipher suites, and so on.

2. If the server can communicate over SSL, it answers with a Server Hello message.

3. The server sends a Certificate message containing its public key certificate.

4. The server sends a Server Hello Done message to tell the client that the initial phase of the SSL handshake negotiation is finished.

5. After the first phase of the SSL handshake, the client responds with a Client Key Exchange message, which contains a random secret string used for encrypting the communication.

6. The client sends a Change Cipher Spec message, indicating that communication after this message is encrypted with a key derived from the random secret of the previous step.

7. The client sends a Finished message, which contains a checksum over all the messages exchanged so far.

8. The server sends a Change Cipher Spec message.

9. The server sends a Finished message.

10. Once the Finished messages have been exchanged, the SSL connection is established.

11. Application-layer communication begins, that is, HTTP requests and responses.

12. The client closes the connection by sending a close_notify alert.
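
In application code the handshake steps above are normally carried out by a TLS library rather than by hand. The sketch below uses Python's standard ssl module; the fetch_tls_info helper is a hypothetical name, and calling it needs network access to a host of your choosing. It only reports what was negotiated and does not expose the individual handshake messages.

```python
import socket
import ssl

# Client-side context with the library's default settings.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)   # the server certificate must validate
print(context.check_hostname)                     # the host name must match the certificate


def fetch_tls_info(host: str, port: int = 443):
    """Connect, let the library perform the handshake, and report what was negotiated."""
    with socket.create_connection((host, port), timeout=5) as sock:
        # wrap_socket performs the handshake before returning.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Calling fetch_tls_info with a reachable HTTPS host returns the protocol version and the name of the cipher suite that the two sides agreed on.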

WebSocket and HTTP/2.0

WebSocket

WebSocket provides full-duplex communication between a web client and a server. Once the two have established a connection using the WebSocket protocol, all subsequent communication uses that dedicated protocol.

WebSocket supports push: the server can send data to the client directly without waiting for a request. Because the connection is kept open and the header overhead of each message is small, traffic is reduced accordingly.

To start WebSocket communication, the client uses the HTTP Upgrade header field mentioned above to tell the server that the protocol is changing. Once the handshake succeeds and the WebSocket connection is established, communication no longer uses HTTP messages but WebSocket's own data frames.
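
One detail of that handshake, which these notes do not cover but which RFC 6455 defines, is that the client sends a Sec-WebSocket-Key header and the server proves it understood the upgrade by returning a derived Sec-WebSocket-Accept value. A minimal sketch of that derivation follows; the function name accept_value is an assumption, and the sample key is the one used in the RFC's own example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from the example in RFC 6455.
print(accept_value("dGhlIHNhbXBsZSBub25jZQ=="))
```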

HTTP/2.0

Core strengths and features

Multiplexing: Multiple requests are handled concurrently over a single TCP connection. (With HTTP/1.1 pipelining, responses to multiple requests can block one another; HTTP/2.0 solves this and also supports prioritization and flow control.)

Header compression: Message headers are compressed to reduce the amount of data transferred.

Server push: The server can push resources to the client before the client asks for them.

Binary framing: Data is transferred in a binary format.


Web attack techniques

Active attacks against the server: SQL injection and OS command injection are representative. In SQL injection, the attacker accesses the web application directly and passes in attack SQL, which the server executes against the database to read data or tamper with it; the vulnerability arises from the way the application builds its SQL statements. In OS command injection, the attacker gets the server to execute illegitimate operating system commands.
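
The difference between a vulnerable query and a parameterized one can be seen in a small sketch with Python's standard sqlite3 module. The table, its rows, and the attacker input are all made up for illustration.

```python
import sqlite3

# Throwaway in-memory database with hypothetical data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

user_input = "nobody' OR '1'='1"   # attacker-controlled value

# Vulnerable: the input is pasted into the SQL text and changes its meaning.
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: a placeholder keeps the input as data, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The string-built query matches every row because the injected OR clause is always true, while the parameterized query matches none.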

Passive attacks against the server follow this pattern:

1. The attacker lures a user into triggering a prepared trap, which starts an HTTP request carrying the embedded attack code

2. The HTTP request containing the attack code is sent to the server and run there

3. Once the attack code has run, the vulnerable web application becomes the attacker's stepping stone, which can lead to the theft of personal information (I have handed everything from my network security class back to the teacher... the first time I read this I was completely lost...)

Attacks that target the client: cross-site scripting (XSS) is representative. XSS runs illegitimate HTML tags or JavaScript code in the user's browser through a website that has a security vulnerability, which lets the attacker obtain the user's personal information and so on.
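
A common defense is to escape user-supplied text before placing it in a page. The sketch below uses Python's standard html.escape; the comment text and the surrounding markup are hypothetical.

```python
import html

# Hypothetical user-supplied comment containing a script tag.
comment = "<script>alert(document.cookie)</script>"

unsafe_page = f"<p>{comment}</p>"              # the browser would run the script
safe_page = f"<p>{html.escape(comment)}</p>"   # the browser shows it as text

print(unsafe_page)
print(safe_page)
```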

Other attacks include HTTP header injection, mail header injection, directory traversal, and remote file inclusion.

Security vulnerabilities caused by configuration or design

Forced browsing: Files placed in a public directory on the web server that were never meant to be public can be browsed, leaking personal information or internal file information.

Improper error message handling: Error messages shown to the user expose details of the system, giving an attacker a point of entry.

Open redirect: A feature that redirects to any URL given to it lets an attacker lure users to a malicious website.

Security vulnerabilities caused by careless session management

Session hijacking: The attacker obtains a user's session ID by some means and uses it to impersonate the user.

Some ways an attacker could obtain a session ID:

· Guessing the session ID when it is generated in a predictable way

· Stealing the session ID through eavesdropping or an XSS attack

· Forcing a session ID on the user through a session fixation attack

Session fixation: The attacker visits the site to obtain an unauthenticated session ID, then sets a trap that forces the user to authenticate with that session ID. Once the user falls into the trap and completes authentication, the attacker can use the same session ID to access the site as that user.
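
Two common countermeasures are generating session IDs that cannot be guessed and issuing a fresh ID at login. The sketch below illustrates both with Python's standard secrets module; the in-memory sessions dictionary and the function names are assumptions for illustration, not a production design.

```python
import secrets

sessions = {}   # session_id -> session data (in memory, for illustration only)


def new_session_id() -> str:
    """Generate an unpredictable session ID from the OS random source."""
    return secrets.token_urlsafe(32)


def start_session() -> str:
    """Create an unauthenticated session, as happens on a first visit."""
    sid = new_session_id()
    sessions[sid] = {"user": None}
    return sid


def login(old_sid: str, user: str) -> str:
    """Authenticate and issue a fresh ID, so a pre-login ID is useless afterwards."""
    sessions.pop(old_sid, None)
    sid = new_session_id()
    sessions[sid] = {"user": user}
    return sid


attacker_known = start_session()             # ID the attacker planted on the victim
victim_sid = login(attacker_known, "alice")  # victim logs in and gets a new ID
print(attacker_known in sessions)
print(sessions[victim_sid]["user"])
```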

Cross-site request forgery (CSRF): The attacker sets a trap that forces an authenticated user to perform unintended updates to personal information, settings, or other state.

Other security vulnerabilities

Password cracking: Working out the password to get past authentication, either by trying passwords over the network or by decrypting encrypted passwords. Techniques include dictionary attacks, rainbow tables, obtaining the key, and exploiting weaknesses in the encryption algorithm.

Clickjacking: Also known as UI redressing. A transparent layer is typically placed over the page as a trap so that the user clicks on something they did not intend.

DoS attack: An attack that brings the service to a halt, either by overloading resources with access requests until they are exhausted or by attacking a security vulnerability to stop the service.

Backdoor programs: Debugging programs left in by developers, programs implanted by developers for their own benefit, and so on.
