1. Introduction to HTTP
HTTP (Hypertext Transfer Protocol) is the protocol used to transfer hypertext from a WWW server to the local browser. It is designed to make that transfer efficient and to reduce network traffic. It not only ensures that hypertext documents are transmitted correctly and quickly, but also determines which part of a document is transmitted and which content is displayed first (such as text before graphics).
Before looking at how HTTP works, let's look at how computers communicate with each other.
2. How computers communicate with each other
The key technology of the Internet is the TCP/IP protocol suite. Two computers communicate over the Internet via TCP/IP, which is in fact two protocols:
TCP (Transmission Control Protocol) and IP (Internet Protocol).
IP: Communication between Computers
IP is the mechanism computers use to identify and reach each other. Each computer has an IP address that identifies it on the Internet. IP is responsible for sending and receiving packets: messages (or other data) are split into small, independent packets that are transmitted between computers, and IP routes each packet to its destination.
IP only lets computers send messages to each other. It does not check whether messages arrive in the order they were sent or whether they were corrupted along the way (only critical header data is checked). To provide this checking, the Transmission Control Protocol (TCP) is built directly on top of IP.
TCP: Communication between Applications
TCP ensures that packets are delivered in the correct order and tries to confirm that their contents have not changed. On top of an IP address, TCP adds ports, which allow one computer to offer several services over the network. Some port numbers are reserved for particular services; these are the well-known ports.
Service or daemon: on the machine that provides a service, a program listens for traffic on a particular port. For example, most e-mail traffic is sent on port 25, and HTTP traffic for the WWW uses port 80.
When an application wants to communicate with another application over TCP, it sends a connection request to an exact address (an IP address and a port). After the two parties complete a "handshake", TCP establishes a full-duplex connection between the two applications, which stays open until one of the two applications closes it. TCP controls data transfer between the application and the network: it splits the data into IP packets before they are sent and reassembles them when they arrive.
TCP/IP is two protocols that work together in a layered relationship, with TCP above IP.
TCP is responsible for communication between the application software (such as your browser) and the network software; IP is responsible for communication between computers. TCP splits the data and loads it into IP packets, and reassembles the packets when they arrive. IP sends the packets to the recipient; along the way, IP routers forward each packet toward the correct address, taking into account traffic, network errors, and other parameters.
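The connection behaviour described above can be tried on a single machine. The following is a minimal sketch, assuming the loopback address 127.0.0.1 and an operating-system-chosen port; the function names are made up for illustration, and the example is only meant for short messages.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one connection and echo back everything it receives."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # the peer closed its sending side
                break
            conn.sendall(data)


def echo_roundtrip(message: bytes) -> bytes:
    """Send a short message over a loopback TCP connection and return the echo."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # Port 0 asks the operating system for any free port.
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        host, port = server_sock.getsockname()

        worker = threading.Thread(target=run_echo_server, args=(server_sock,))
        worker.start()

        # create_connection() performs the TCP handshake with the listener.
        with socket.create_connection((host, port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # nothing more will be sent
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:  # the server closed the connection
                    break
                chunks.append(chunk)
        worker.join()
    return b"".join(chunks)


if __name__ == "__main__":
    print(echo_roundtrip(b"hello over TCP"))
```

The application only sees a byte stream in each direction; splitting into packets, ordering, and reassembly happen below the socket interface.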
3. Protocol layer where HTTP resides
HTTP is built on TCP. In the TCP/IP reference model each layer has its own protocols, and HTTP is an application-layer protocol.
4. HTTP Request Response Model
HTTP follows the standard client/server (B/S, browser/server) model and consists of requests and responses. The client always initiates the request, and the server sends back the response.
HTTP is a stateless protocol. Stateless means that the server keeps no information about the client between requests: once a client has made a request and the server has returned a response, the server retains nothing about that exchange, and in the simplest case the connection is closed as well. HTTP follows the request/response model: the client (browser) sends a request to the server, which processes it and returns the appropriate response. Every HTTP exchange is made up of such request/response pairs.
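One way to picture the stateless request/response model is a handler whose output depends only on the request it is given. This is an illustrative sketch, not a real server: the dictionary layout and the name handle_request are assumptions made for the example.

```python
def handle_request(request: dict) -> dict:
    """Build the response from the request alone; no per-client state is kept."""
    # Anything the server should "remember" must travel inside the request,
    # here as a hypothetical "Cookie" header.
    visitor = request.get("headers", {}).get("Cookie", "anonymous")
    return {
        "status": 200,
        "body": f"{request['method']} {request['path']} served to {visitor}",
    }


if __name__ == "__main__":
    request = {"method": "GET", "path": "/index.htm", "headers": {}}
    # Two identical requests yield identical responses: the first left no trace.
    print(handle_request(request) == handle_request(request))
```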
5. HTTP working process
An HTTP operation is called a transaction, and the whole process is as follows:
1) Address resolution
Suppose the client browser requests this page: http://localhost.com:8080/index.htm
The address is broken down into the protocol name, host name, port, object path, and so on. For our address, the result is as follows:
Protocol name: HTTP
Host name: localhost.com
Port: 8080
Object path: /index.htm
In this step, the Domain Name System (DNS) is needed to resolve the domain name localhost.com to the IP address of the host.
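The decomposition above can be reproduced with Python's standard library. This is a small sketch: split_url and resolve_host are illustrative names, and resolve_host simply asks the system resolver, so its result depends on the machine and network it runs on.

```python
import socket
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Break a URL into the parts used in the address resolution step."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


def resolve_host(host: str, port: int) -> list:
    """Ask the system resolver (DNS, hosts file, ...) for the host's IP addresses."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print(split_url("http://localhost.com:8080/index.htm"))
```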
2) Encapsulating the HTTP request packet
The information from the previous step is combined with information about the client machine and encapsulated into an HTTP request packet.
3) Encapsulating into TCP packets and establishing the TCP connection (TCP three-way handshake)
Before HTTP work begins, the client (Web browser) first connects to the server over the network. This connection is made through TCP, which together with IP forms the TCP/IP protocol family on which the Internet is built; this is why the Internet is also called a TCP/IP network. HTTP is an application-layer protocol that sits above TCP, and by the layering rules a higher-layer connection can be made only after the lower-layer one is established, so a TCP connection must be established first. The default port for HTTP's TCP connection is 80; in this example it is 8080.
4) The client sends the request command
After the connection is established, the client sends a request to the server. The request consists of a request line containing the method, the Uniform Resource Identifier (URI), and the protocol version number, followed by MIME-style information including request modifiers, client information, and possibly content.
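To make the shape of such a request concrete, here is a minimal sketch that assembles the text a client would send for the example address. The helper name build_request is an assumption, and the sketch handles only requests without a body.

```python
from typing import Dict, Optional


def build_request(method: str, path: str, host: str, port: int,
                  extra_headers: Optional[Dict[str, str]] = None) -> bytes:
    """Assemble a body-less HTTP/1.1 request as raw bytes."""
    # Request line: method, URI, protocol version.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}:{port}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines end with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    raw = build_request("GET", "/index.htm", "localhost.com", 8080,
                        {"Connection": "keep-alive"})
    print(raw.decode("ascii"))
```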
5) The server responds
After receiving the request, the server replies with a status line, containing the protocol version number and a success or error code, followed by MIME-style information including server information, entity information, and possibly content.
After the server has sent the header information to the browser, it sends a blank line to mark the end of the headers, and then sends the actual data requested by the user (the entity) in the format described by the Content-Type response header.
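The response layout described above (status line, headers, blank line, then the entity) can be taken apart with a few lines of code. This is a simplified sketch that assumes the whole response is already in memory and that the status line includes a reason phrase; parse_response is an illustrative name.

```python
from typing import Dict, Tuple


def parse_response(raw: bytes) -> Tuple[str, int, str, Dict[str, str], bytes]:
    """Split a raw HTTP response into version, code, reason, headers and body."""
    # The first blank line (CRLF CRLF) separates the headers from the entity.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


if __name__ == "__main__":
    sample = (b"HTTP/1.1 200 OK\r\n"
              b"Content-Type: text/html\r\n"
              b"Content-Length: 13\r\n"
              b"\r\n"
              b"<p>hello</p>\n")
    print(parse_response(sample))
```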
6) The server closes the TCP connection
In general, once the Web server has sent the requested data to the browser, it closes the TCP connection. However, if the browser or server adds this line to its header information:
Connection: keep-alive
the TCP connection remains open after the response is sent, so the browser can continue to send requests over the same connection. Keeping the connection open saves the time needed to establish a new connection for each request and also saves network bandwidth.
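Connection reuse can be observed locally with Python's built-in HTTP server and client. In this sketch the server replies with the client's TCP source port, so identical ports across requests indicate that the same connection was used. The class and function names are made up for the example, and the server is a throwaway one bound to 127.0.0.1.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    """Reply with the client's TCP source port so connection reuse is visible."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def ports_seen_by_server(request_count: int = 2) -> list:
    """Send several requests over one connection; return the port seen each time."""
    server = HTTPServer(("127.0.0.1", 0), PortEchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports = []
    try:
        for _ in range(request_count):
            conn.request("GET", "/", headers={"Connection": "keep-alive"})
            response = conn.getresponse()
            ports.append(response.read().decode("ascii"))
    finally:
        conn.close()  # closing the client side lets the handler finish
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print(ports_seen_by_server(3))
```

The Content-Length header matters here: it tells the client where each response ends, which is what makes it possible to send the next request on the same connection.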
6. Data flow through each layer of the HTTP protocol stack
First, consider how the data is organized at each layer of the protocol stack when the client sends a request.
When the server parses the client request, it performs the same operations in reverse.
When a client initiates a request:
The client machine encapsulates the request into an HTTP packet -> encapsulates it into a TCP packet -> encapsulates it into an IP packet -> encapsulates it into a data frame -> the hardware converts the frame into a bit stream (binary data) -> finally, the physical hardware (the network card chip) sends it to the specified destination.
The server hardware first receives the bit stream, which is then converted back into an IP packet. The IP packet is parsed according to the IP protocol to find the TCP packet, the TCP packet is parsed according to the TCP protocol to find the HTTP packet, and the HTTP packet is parsed according to the HTTP protocol to obtain the data.
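The wrap-then-unwrap order can be shown with a toy model. In this sketch each layer adds only a text label such as [TCP] in front of the data; real headers are binary structures with many fields, so the labels and function names are purely illustrative.

```python
# Order in which the sender wraps the data, from the application layer down.
LAYERS = ["HTTP", "TCP", "IP", "FRAME"]


def encapsulate(data: bytes) -> bytes:
    """Sender side: each layer puts its own (toy) header in front of the data."""
    packet = data
    for layer in LAYERS:
        packet = f"[{layer}]".encode("ascii") + packet
    return packet


def decapsulate(packet: bytes) -> bytes:
    """Receiver side: strip the headers in the reverse order."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode("ascii")
        if not packet.startswith(header):
            raise ValueError(f"expected {layer} header")
        packet = packet[len(header):]
    return packet


if __name__ == "__main__":
    wire = encapsulate(b"GET /index.htm")
    print(wire)
    print(decapsulate(wire))
```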
6. Principle of HTTPS implementation
HTTPS (full name: Hypertext Transfer Protocol over Secure Socket Layer) is an HTTP channel designed for security; put simply, it is the secure version of HTTP. An SSL layer is added beneath HTTP, and SSL is the security foundation of HTTPS. The port number it uses is 443.
SSL (Secure Sockets Layer) is a secure transport protocol designed by Netscape, primarily for the Web, and it has been widely used there. Certificate-based authentication and encryption ensure that the data exchanged between the client and the Web server is protected.
There are two basic types of encryption and decryption algorithms:
1) Symmetric encryption: there is only one key, the same key is used for encryption and decryption, and encryption and decryption are fast. Typical symmetric encryption algorithms include DES, AES, RC5, and 3DES.
The main problem with symmetric encryption is sharing the secret key: unless your computer (the client) already knows the secret key of the other computer (the server), it cannot encrypt or decrypt the communication stream. The solution to this problem is asymmetric keys.
2) Asymmetric encryption: two keys are used, a public key and a private key. The private key is kept secret by one party (typically the server), and the public key is available to anyone.
The two keys come as a pair, and the private key cannot feasibly be derived from the public key. Encryption and decryption use different keys: data encrypted with the public key can only be decrypted with the private key, and data encrypted with the private key can only be decrypted with the public key. Asymmetric encryption is slower than symmetric encryption. Typical asymmetric algorithms include RSA and DSA.
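The difference between the two kinds of encryption can be seen in a toy example. The sketch below uses a repeating-key XOR as a stand-in for a symmetric cipher and textbook RSA with tiny primes as a stand-in for an asymmetric one. Both are deliberately insecure and serve only to show which key is used where; all names and numbers are chosen for illustration. It requires Python 3.8 or later for the modular inverse via pow.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: the same key and function encrypt and decrypt."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Textbook RSA with tiny primes (insecure, for illustration only).
P, Q = 61, 53
N = P * Q                          # modulus, part of both keys
E = 17                             # public exponent: (N, E) is the public key
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent: (N, D) is the private key


def rsa_encrypt(message: int) -> int:
    """Encrypt with the public key; the message must be smaller than N."""
    return pow(message, E, N)


def rsa_decrypt(ciphertext: int) -> int:
    """Decrypt with the private key."""
    return pow(ciphertext, D, N)


if __name__ == "__main__":
    secret = xor_cipher(b"hello", b"k3y")
    print(xor_cipher(secret, b"k3y"))   # same key recovers the plaintext
    print(rsa_decrypt(rsa_encrypt(65)))  # different keys for each direction
```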
Let's take a look at the HTTPS communication process:
The process is roughly as follows:
1) After the SSL client establishes a TCP connection to the server (port 443), it requests the server's certificate during the handshake that follows. That is, the client sends the server a message containing the list of algorithms it supports and other required information. The SSL server replies with a message that selects the algorithms to use for this session, and then returns its certificate to the client. (The certificate contains the server's information: the domain name, the company that applied for the certificate, and the public key.)
2) After receiving the certificate, the client identifies the certificate authority that issued it and uses that authority's public key to confirm that the signature is valid. It also checks that the domain name listed in the certificate is the domain name it is connecting to.
3) If the certificate is valid, the client generates a symmetric key, encrypts it with the server's public key, and sends it to the server. The server decrypts it with its private key, and from then on the two computers communicate using symmetric encryption.
Advantages of HTTPS communication:
1) The key generated by the client is known only to the client and the server;
2) Encrypted data can be turned back into plaintext only by the client and the server;
3) Communication between the client and the server is secure.
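In practice a client rarely implements these steps by hand. As a rough sketch, Python's standard ssl module performs the handshake, the certificate signature check, and the domain name check when a socket is wrapped. The function names below are illustrative, and fetch_server_certificate needs network access to a real HTTPS server, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client context that verifies the server's certificate."""
    # create_default_context() loads the system's trusted certificate
    # authorities and enables certificate and host name verification.
    return ssl.create_default_context()


def fetch_server_certificate(host: str, port: int = 443) -> dict:
    """Connect, complete the handshake, and return the verified certificate."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # wrap_socket() runs the handshake; it raises an ssl.SSLError subclass
        # if the certificate is not trusted or does not match `host`.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()


if __name__ == "__main__":
    context = make_client_context()
    print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```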
7. HTTP length restrictions
1. URL length limit
The HTTP/1.1 protocol does not limit the length of a URL. As the RFC states, HTTP places no limit on the length of a URI; a server must be able to handle the URI of any resource it serves, should be able to handle URIs of unbounded length, and should return a 414 status code if it cannot handle a URI that is too long. Despite what the protocol says, Web servers and browsers impose their own limits on URI length.
Server restrictions: the servers I work with most are Nginx and Tomcat. Both limit URL length through the limit on the size of the HTTP request header. The Nginx configuration parameter is large_client_header_buffers and the Tomcat parameter is maxHttpHeaderSize; both can be set as needed.
Browser restrictions: each browser also limits URL length. The limits for several common browsers (in characters) are:
IE: 2083
Firefox: 65536
Chrome: 8182
Safari: 80000
Opera: 190000
For GET requests, there is no limit on the number of parameters as long as the URL stays within the length limit.
2. Length limit of POST data
As with URLs, the HTTP protocol places no limit on the length of POST data. A limit can be configured on the server side.
3. Cookie length limits
Cookie limits come in several forms.
(1) The maximum number of cookies the browser allows per domain. I have not tested this myself; the figures found online are roughly:
IE: originally 20, later raised to 50
Firefox: 50
Opera: 30
Chrome: 180
Safari: unlimited
When the number of cookies exceeds the limit, IE and Opera use an LRU algorithm to evict old, infrequently used cookies, while Firefox evicts some cookies at random. Whatever the strategy, try not to let the number of cookies exceed what the browser allows.
(2) The maximum length of each cookie allowed by the browser:
Firefox and Safari: 4079 bytes
Opera: 4096 bytes
IE: 4095 bytes
(3) The server's limit on the length of the HTTP request header. Cookies are attached to the header of every HTTP request sent to the server, so they are also subject to the server's request header length limit.
4. HTML5 localStorage
HTML5 provides a local storage mechanism that lets Web applications store data on the client. It is not part of the HTTP protocol, but as HTML5 spreads we are likely to use localStorage more and more, perhaps eventually as routinely as we use cookies today. Like cookies, localStorage is limited by the browser per domain, except that the cookie limit is on the number of items while the localStorage limit is on total size: Firefox, Chrome, and Opera allow up to 5 MB per domain, while IE is more generous this time and allows up to 10 MB.
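On the server side, the URI length rule mentioned in this section amounts to a simple length check before the request is processed. The sketch below assumes a hypothetical limit of 8192 bytes; the limit and the function name are illustrative and not taken from any particular server.

```python
HTTP_OK = 200
HTTP_URI_TOO_LONG = 414


def status_for_uri(uri: str, max_uri_length: int = 8192) -> int:
    """Return 414 if the URI exceeds the (hypothetical) server limit, else 200."""
    # The limit is applied to the encoded bytes, as they arrive on the wire.
    if len(uri.encode("utf-8")) > max_uri_length:
        return HTTP_URI_TOO_LONG
    return HTTP_OK


if __name__ == "__main__":
    print(status_for_uri("/index.htm"))
    print(status_for_uri("/search?q=" + "a" * 10000))
```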