Chapter 1: Understanding the Web and Network Foundations
1.2 The birth of HTTP
HTTP was invented in 1990. At that time it had not yet been established as a formal standard; this early version is known as HTTP/0.9.
HTTP was officially published as a standard in May 1996. This version was named HTTP/1.0, and it is still widely used on the server side today.
HTTP/1.1, released in January 1997, is the current mainstream version of the HTTP protocol.
HTTP/2.0 was in the making at the time of writing (it has since been published as HTTP/2, in May 2015).
1.3 TCP/IP
TCP/IP is not a single protocol but the collective name for the family of Internet-related protocols. For details, see the detailed TCP/IP study notes.
The TCP/IP protocol family is divided into the following four layers:
Application layer: determines the communication activity when application services are provided to the user; it includes FTP, DNS, and HTTP.
Transport layer: serves the application layer above it by providing data transfer between two computers connected over the network; it includes TCP and UDP.
Network layer: handles the packets that flow over the network; it determines the path by which packets reach the other computer and delivers them to it.
Link layer: handles the hardware side of the network connection, such as the NIC and optical fiber.
TCP/IP communication flow (HTTP example)
First, the client on the sending side issues an HTTP request at the application layer (HTTP protocol) to view a web page. Then, for ease of transmission, the transport layer (TCP protocol) splits the data received from the application layer (the HTTP request message) into segments, marks each segment with a sequence number and port number, and forwards it to the network layer. The network layer (IP protocol) adds the MAC address of the communication destination and forwards the data to the link layer. At this point, the request is ready to be sent out over the network.
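The layering in this flow can be pictured as each layer wrapping the data from the layer above with its own header. The following is a minimal sketch only: the header fields, the addresses, and the `encapsulate`/`decapsulate` helpers are simplified assumptions for illustration, not the real TCP/IP wire formats.

```python
# Toy model of TCP/IP encapsulation. The header fields below are simplified
# placeholders (assumptions), not the real binary header layouts.

def encapsulate(http_message: str) -> dict:
    """Wrap an HTTP message in illustrative TCP, IP, and link-layer headers."""
    tcp_segment = {"src_port": 49152, "dst_port": 80, "seq": 1, "data": http_message}
    ip_packet = {"src_ip": "192.0.2.10", "dst_ip": "192.0.2.20", "data": tcp_segment}
    frame = {"dst_mac": "00:00:5e:00:53:01", "data": ip_packet}
    return frame


def decapsulate(frame: dict) -> str:
    """Receiver side: strip the headers from the link layer up to the application layer."""
    ip_packet = frame["data"]
    tcp_segment = ip_packet["data"]
    return tcp_segment["data"]


request = "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
frame = encapsulate(request)
assert decapsulate(frame) == request
```

On the receiving side the headers are removed in the opposite order, which is why `decapsulate` unwraps the frame from the outside in.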
1.4.1 The IP protocol, in charge of transmission
The role of the IP protocol is to deliver packets to the other party.
The IP address indicates the address assigned to a node, and the MAC address is the fixed address belonging to the network card.
1.4.2 The TCP protocol, ensuring reliability
To make large data easier to transmit, the TCP protocol splits it into segments, and it uses a three-way handshake to make sure the data can actually be delivered to the other party.
Three-way handshake:
The sending side first sends a packet with the SYN (synchronize) flag to the other party. The receiving side replies with a packet carrying the SYN/ACK flags to acknowledge it. Finally, the sender sends back a packet with the ACK (acknowledgement) flag, which completes the handshake.
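The three packets of the handshake can be sketched as a small simulation. This is only an illustration: real handshakes are performed by the operating system, and the initial sequence numbers and the `three_way_handshake` helper are assumptions made up for the example. The acknowledgement numbers (the peer's sequence number plus one) are a TCP detail that these notes do not spell out.

```python
# Toy simulation of the TCP three-way handshake. Real handshakes are done by
# the operating system; the initial sequence numbers here are arbitrary.

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three packets exchanged, as (sender, flags, seq, ack) tuples."""
    syn = ("client", "SYN", client_isn, None)
    # The server acknowledges the client's SYN by acking client_isn + 1.
    syn_ack = ("server", "SYN/ACK", server_isn, client_isn + 1)
    # The client acknowledges the server's SYN by acking server_isn + 1.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


for sender, flags, seq, ack in three_way_handshake(client_isn=1000, server_isn=5000):
    print(f"{sender:>6} -> {flags:<7} seq={seq} ack={ack}")
```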
1.5 The DNS (Domain Name System) service, responsible for name resolution
DNS provides resolution between domain names and IP addresses.
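As a rough illustration of name resolution, Python's standard `socket.getaddrinfo` asks the system resolver for the addresses of a host name. The `resolve_ipv4` helper is a hypothetical name for this sketch, and the system resolver may answer from a local hosts file rather than from a DNS server, which is what typically happens for `localhost`.

```python
import socket


def resolve_ipv4(hostname: str) -> list:
    """Return the IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None,
                               family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (ip, port) pair.
    return sorted({info[4][0] for info in infos})


print(resolve_ipv4("localhost"))
```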
1.6 Responsibilities of the various protocols in HTTP communication
In order of processing:
DNS service: resolves the domain name entered by the user to an IP address
HTTP protocol: generates the HTTP request message for the target web server
TCP protocol: splits the HTTP request message into segments for easier transmission and delivers each segment reliably to the other party
IP protocol: finds the other party's address and relays the packets toward it
TCP protocol: receives the segments from the other party and reassembles the request message in sequence-number order
HTTP protocol: the web server processes the content of the request
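The two TCP steps in this sequence, splitting on the sending side and reassembling by sequence number on the receiving side, can be sketched as follows. This is a simplification under stated assumptions: the `segment` and `reassemble` helpers are hypothetical, and the sequence numbers here count segments, whereas real TCP sequence numbers count bytes.

```python
import random


def segment(message: bytes, size: int) -> list:
    """Split message into (sequence_number, chunk) pairs of at most size bytes."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]


def reassemble(segments: list) -> bytes:
    """Rebuild the message by ordering the segments by sequence number."""
    return b"".join(chunk for _, chunk in sorted(segments))


request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
segments = segment(request, size=8)
random.shuffle(segments)              # simulate out-of-order arrival
assert reassemble(segments) == request
```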
1.7 URI/URL
URI (Uniform Resource Identifier): an identifier that locates a resource, expressed under a protocol scheme
URL (Uniform Resource Locator): the specific location of a resource on the Internet
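To make the parts of a URL concrete, the standard library's `urllib.parse.urlsplit` can break one into its components. The URL below is a made-up example (example.com is a reserved documentation domain).

```python
from urllib.parse import urlsplit

# Hypothetical example URL; example.com is a reserved documentation domain.
parts = urlsplit("http://user@www.example.com:8080/dir/index.html?uid=1#ch1")

print(parts.scheme)    # http
print(parts.username)  # user
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /dir/index.html
print(parts.query)     # uid=1
print(parts.fragment)  # ch1
```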
Chapter 2: The simple HTTP protocol
2.2 Communication through the exchange of requests and responses
A request message consists of the request method, the request URI, the protocol version, optional request header fields, and the entity body.
A response message basically consists of the protocol version, a status code, a reason phrase explaining the status code, optional response header fields, and the entity body.
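The composition of a request message can be sketched by assembling one by hand. The `build_request` helper, the URI, and the form data are assumptions chosen for illustration; real clients add more header fields.

```python
def build_request(method: str, uri: str, headers: dict, body: bytes = b"") -> bytes:
    """Assemble request line, header fields, a blank line, and the entity body."""
    lines = [f"{method} {uri} HTTP/1.1"]                      # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                    # blank line ends the header
    return head.encode("ascii") + body


body = b"name=ueno&age=37"
request = build_request(
    "POST",
    "/form/entry",
    {
        "Host": "example.com",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),
    },
    body,
)
print(request.decode("ascii"))
```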
2.3 HTTP is a protocol that does not save state
The HTTP protocol itself does not preserve any communication state between requests and responses.
2.5 HTTP methods that tell the server what is intended
GET: used to request access to a resource identified by a URI
POST: used to transfer an entity body
Differences between the GET method and the POST method:
Security:
- The data of a GET request is appended to the URL (that is, it travels in the header part of the HTTP message, not in the body), with ? separating the URL from the data and & joining the parameters. The parameters appear in clear text in the URL where others can easily see them, and the URL may also be recorded in the browser history.
- The data submitted by a POST request is placed in the body of the HTTP message.
Data length:
- The HTTP protocol does not restrict the length of the transmitted data or of URLs, but particular browsers and servers do limit URL length, so the data sent with a GET is limited by the URL length.
- Because POST data is not transmitted in the URL, its length is not limited in theory.
A GET request can be cached and its URL can be saved as a browser bookmark; a POST request generally cannot.
GET is used to obtain data from the server, and POST is used to pass data to the server.
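Where the data ends up in each case can be checked without sending anything, by building the two request objects with the standard `urllib`. The URL and parameter names are made up for this sketch.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form data; urlencode joins the parameters with "&".
params = urlencode({"q": "http", "page": 2})

# GET: the data travels in the URL, after "?".
get_req = Request("http://example.com/search?" + params, method="GET")

# POST: the data travels in the message body.
post_req = Request("http://example.com/search",
                   data=params.encode("ascii"), method="POST")

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

No request is sent here; the objects only show that the GET carries its parameters in `full_url` while the POST carries them in `data`.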
The following methods are less commonly used:
- PUT: transfer a file
- HEAD: get the message header
- DELETE: delete a file
- OPTIONS: ask which methods are supported
- TRACE: trace the path
- CONNECT: request a tunnel connection through a proxy
2.7 Persistent connections save traffic
The benefit of persistent connections is that they reduce the overhead caused by repeatedly establishing and tearing down TCP connections, which lightens the load on the server. In HTTP/1.1, all connections are persistent by default.
Persistent connections also make pipelining possible: several requests can be sent in succession without waiting for each response.
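Connection reuse can be observed with a throwaway local server. In this sketch (the `Handler` class, the paths, and the port choice are assumptions for the example), two requests are sent over one `http.client.HTTPConnection`, and the client's local port is recorded each time; if the TCP connection is reused, the port stays the same. Note that `http.client` does not pipeline: it waits for each response before sending the next request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # lets http.server keep the connection open

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # silence per-request logging
        pass


server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
local_ports = []
for path in ("/a", "/b"):
    conn.request("GET", path)
    response = conn.getresponse()
    response.read()                 # drain the body before reusing the connection
    local_ports.append(conn.sock.getsockname()[1])

conn.close()
server.shutdown()
server.server_close()
print(local_ports[0] == local_ports[1])
```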
2.8 State management using cookies
Cookie technology controls client state by writing cookie information into request and response messages.
The server tells the client to save a cookie through a header field called Set-Cookie in its response message. The next time the client sends a request to that server, it automatically adds the cookie value to the request message. When the server receives the cookie, it checks which client the request came from, compares it against its own records, and so recovers the earlier state information.
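The round trip described above can be sketched with the standard `http.cookies` module. The cookie name `sid` and its value are invented for this example, and a real browser applies more rules (domain, expiry, and so on) than this sketch does.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header for the response.
# The cookie name "sid" and its value are made up for this example.
server_cookie = SimpleCookie()
server_cookie["sid"] = "1342077140226724"
server_cookie["sid"]["path"] = "/"
set_cookie_header = server_cookie.output(header="Set-Cookie:")
print(set_cookie_header)

# Client side: store the cookie, then send it back in the next request.
client_jar = SimpleCookie()
client_jar.load(set_cookie_header.split(":", 1)[1].strip())
cookie_header = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in client_jar.items()
)
print(cookie_header)
```

The server would then look up the received `sid` value in its own records to recover the state for that client.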
Chapter 3: HTTP information in HTTP messages
3.1 HTTP messages
An HTTP message can be roughly divided into two parts, the message header and the message body, separated by the first blank line (CR+LF). The message body is not always present.
3.2 Structure of request messages and response messages
Request line: contains the request method, the request URI, and the HTTP version
Status line: contains the HTTP version, the status code indicating the result, and the reason phrase
Header fields: the various headers that describe the conditions and attributes of the request and response; there are generally four kinds: general headers, request headers, response headers, and entity headers
Other: may contain headers not defined in the HTTP RFCs (Cookie, etc.)
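The structure above can be made concrete by taking a raw response apart. The `parse_response` helper and the sample response are assumptions for illustration; a production parser would also handle repeated header fields and malformed input.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status line parts, header fields, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")       # the first blank line ends the header
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)   # status line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Date: Tue, 10 Jul 2012 06:50:15 GMT\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
print(parse_response(raw))
```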
3.3 Encoding improves the transfer rate
Several content encodings are in common use:
- gzip (GNU zip)
- compress (the standard UNIX compression)
- deflate (zlib)
- identity (no encoding)
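The effect of content encoding on message size can be tried with Python's `gzip` and `zlib` modules. The sample body is made up, and how much is saved depends entirely on the content.

```python
import gzip
import zlib

body = b"<p>hello</p>" * 200          # repetitive content compresses well

gzip_body = gzip.compress(body)       # would be sent with Content-Encoding: gzip
deflate_body = zlib.compress(body)    # zlib format, as used for Content-Encoding: deflate

print(len(body), len(gzip_body), len(deflate_body))

# The receiver decodes according to the Content-Encoding header.
assert gzip.decompress(gzip_body) == body
assert zlib.decompress(deflate_body) == body
```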
3.5 Range requests for partial content
When making a range request, the Range header field is used to specify the byte range of the resource.
For a range request, the server returns a response message with the status code 206 Partial Content.
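A server's handling of a simple range request might look roughly like the sketch below. The `serve_range` helper is hypothetical and only understands a single `bytes=first-last` range; for any other form it falls back to sending the whole resource with 200. The Content-Range header and the 416 status for an unsatisfiable range come from the HTTP specification rather than from these notes.

```python
def serve_range(resource: bytes, range_header: str):
    """Handle a single 'bytes=first-last' range; return (status, headers, body)."""
    unit, _, spec = range_header.partition("=")
    first, _, last = spec.partition("-")
    if unit != "bytes" or not first or "," in spec:
        return 200, {}, resource                    # unsupported form: send everything
    start = int(first)
    end = int(last) if last else len(resource) - 1  # "bytes=7-" means "to the end"
    end = min(end, len(resource) - 1)
    if start > end:
        return 416, {"Content-Range": f"bytes */{len(resource)}"}, b""
    headers = {"Content-Range": f"bytes {start}-{end}/{len(resource)}"}
    return 206, headers, resource[start:end + 1]    # byte ranges are inclusive


status, headers, body = serve_range(b"0123456789", "bytes=2-5")
print(status, headers, body)
```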
3.6 Content negotiation returns the most appropriate content
Content negotiation means that the client and the server negotiate over the content of the response, and the server then provides the resource best suited to the client. The negotiation is judged on criteria such as the language, character set, and encoding of the resource. The header fields involved include Accept, Accept-Charset, Accept-Encoding, Accept-Language, and Content-Language.
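As an illustration of server-driven negotiation, the sketch below picks a language from an Accept-Language value using its q (quality) weights. The `pick_language` helper is hypothetical and matches language tags exactly; real servers also handle wildcards and prefix matches such as `en` for `en-us`.

```python
def pick_language(accept_language: str, available: list, default: str) -> str:
    """Choose the available language with the highest q value in Accept-Language."""
    preferences = []
    for item in accept_language.split(","):
        tag, _, params = item.strip().partition(";")
        q = 1.0                                     # q defaults to 1 when omitted
        if params.strip().startswith("q="):
            q = float(params.strip()[2:])
        preferences.append((q, tag.strip().lower()))
    # Highest q first; sorted() is stable, so ties keep the header order.
    for q, tag in sorted(preferences, key=lambda p: -p[0]):
        if q > 0 and tag in available:
            return tag
    return default


print(pick_language("ja,en-us;q=0.7,en;q=0.3", ["en", "en-us"], "en"))  # en-us
```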
Chapter 4: HTTP status codes that return results
4.1 The status code reports the result of the request returned from the server
| Status code | Category | Reason phrase |
| 1XX | Informational (informational status code) | The received request is being processed |
| 2XX | Success (success status code) | The request was processed normally |
| 3XX | Redirection (redirection status code) | Additional action is required to complete the request |
| 4XX | Client Error (client error status code) | The server cannot process the request |
| 5XX | Server Error (server error status code) | The server failed while processing the request |
4.2 2XX Success
200 OK indicates that the request from the client was processed normally on the server side.
204 No Content indicates that the server processed the request successfully, but the response does not contain an entity body.
206 Partial Content indicates that the client made a range request and the server successfully carried out that partial GET request.
4.3 3XX Redirection
304 Not Modified indicates that, when the client sent a conditional GET request, the resource it asked for had not changed (since the last access, or according to the condition in the request).
4.4 4XX Client Error
400 Bad Request indicates that there is a syntax error in the request message.
403 Forbidden indicates that access to the requested resource was rejected by the server.
404 Not Found indicates that the requested resource could not be found on the server.
4.5 5XX Server Error
500 Internal Server Error indicates that an error occurred on the server side while executing the request.
503 Service Unavailable indicates that the server is temporarily overloaded or down for maintenance and cannot process requests right now.
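The mapping from a status code to its category and reason phrase can be looked up with the standard `http.HTTPStatus` enum. The `CATEGORIES` table and the `describe` helper are assumptions written for this sketch; the category is simply taken from the first digit of the code.

```python
from http import HTTPStatus

# Category names keyed by the first digit of the status code.
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return 'code reason-phrase (category)' for a standard status code."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({CATEGORIES[code // 100]})"


for code in (200, 204, 206, 304, 400, 403, 404, 500, 503):
    print(describe(code))
```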
Chapter 5: Web servers that work with HTTP
5.1 Hosting multiple domain names on a single server with virtual hosts
Because a virtual host setup can serve several websites with different host names and domain names from the same IP address, the host name or domain name must be fully specified in the Host header field when sending an HTTP request.
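How a server might use the Host header to choose between sites on one IP address is sketched below. The `SITES` table, the directory paths, and the `document_root` helper are all hypothetical, and the port stripping does not handle IPv6 literals.

```python
# Hypothetical site table: both names resolve to the same IP address.
SITES = {
    "www.example.com": "/var/www/example",
    "blog.example.com": "/var/www/blog",
}


def document_root(host_header: str):
    """Pick the site for a request from its Host header; None if unknown."""
    hostname = host_header.split(":")[0].strip().lower()   # drop an optional :port
    return SITES.get(hostname)


print(document_root("blog.example.com:8080"))   # /var/www/blog
```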
5.2.1 Proxy
The basic behavior of a proxy server is to receive a request from the client and forward it to another server. The proxy does not change the request URI; when forwarding, it appends its own host information in the Via header field.
Reasons to use a proxy server:
- To reduce network bandwidth usage with caching (a caching proxy)
- To apply access control to specific websites within an organization, mainly in order to obtain access logs
5.2.2 Gateway
A gateway lets a server on the communication path provide non-HTTP services (for example, SQL database queries).
5.2.3 Tunnel
The purpose of a tunnel is to ensure that the client can communicate securely with the server.
Chapter 6: HTTP headers
6.2.4 List of header fields
General header fields
| Header field name | Description |
| Cache-Control | Controls caching behavior |
| Connection | Hop-by-hop headers, connection management |
| Date | Date and time the message was created |
| Pragma | Message directives |
| Trailer | List of headers at the end of the message |
| Transfer-Encoding | Transfer encoding applied to the message body |
| Upgrade | Upgrade to another protocol |
| Via | Information about proxy servers |
| Warning | Error notification |
Request header fields
| Header field name | Description |
| Accept | Media types the user agent can handle |
| Accept-Charset | Preferred character sets |
| Accept-Encoding | Preferred content encodings |
| Accept-Language | Preferred languages |
| Authorization | Web authentication information |
| Expect | Expects specific behavior from the server |
| From | User's e-mail address |
| Host | Server on which the requested resource resides |
| If-Match | Compares entity tags (ETag) |
| If-None-Match | Compares entity tags |
| If-Modified-Since | Compares the resource's update time |
| If-Unmodified-Since | Compares the resource's update time |
| If-Range | Sends a byte range request for the entity if the resource has not been updated |
| Max-Forwards | Maximum number of hops |
| Proxy-Authorization | Client authentication information required by the proxy server |
| Referer | URI from which the request URI was obtained |
| Range | Byte range request for the entity |
| TE | Preferred transfer encodings |
| User-Agent | Information about the HTTP client program |
Response header fields
| Header field name | Description |
| Accept-Ranges | Whether byte range requests are accepted |
| Age | Estimated time elapsed since the resource was created |
| ETag | Matching information for the resource |
| Location | Redirects the client to the specified URI |
| Proxy-Authenticate | Proxy server's authentication information for the client |
| Retry-After | When the request should be made again |
| Server | Information about the HTTP server installation |
| Vary | Information for managing proxy server caching |
| WWW-Authenticate | Server's authentication information for the client |
Entity header fields
Chapter 7: HTTPS for ensuring web security
7.1 Disadvantages of HTTP
- Communication is in plaintext (unencrypted), so the content may be eavesdropped on
- The identity of the other party is not verified, so impersonation is possible
- The integrity of messages cannot be proven, so they may have been tampered with
7.2 HTTP + encryption + authentication + integrity protection = HTTPS
HTTPS is not a new application-layer protocol. It is HTTP with the communication interface part replaced by the SSL and TLS protocols; in effect, HTTPS is HTTP wrapped in an SSL shell.
SSL uses an encryption method called public key encryption.
Public key encryption uses an asymmetric pair of keys: a private key and a public key. The party sending the ciphertext encrypts with the other party's public key, and the other party, on receiving the encrypted information, decrypts it with its own private key.
Public key encryption is more complex to process than shared key encryption, so using it for all communication would be inefficient. HTTPS therefore uses a hybrid mechanism that combines shared key encryption and public key encryption.
How the hybrid encryption mechanism works:
- Use public key encryption to securely exchange the key that will later be used for shared key encryption
- Once the exchanged key is known to be secure, communicate using shared key encryption
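The two steps of the hybrid mechanism can be imitated with a deliberately insecure toy. Everything here is an assumption made for illustration: the RSA key uses tiny textbook primes, the shared-key cipher is a home-made XOR keystream (`keystream_xor`), and none of it resembles what SSL/TLS actually does on the wire. It only shows the division of labor: public key encryption protects the shared key, and the shared key protects the messages.

```python
import hashlib
import secrets

# --- Toy RSA key pair (tiny primes: insecure, for illustration only) ---
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy shared-key cipher: XOR data with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Step 1: the client picks a shared key and sends it encrypted with the
# server's public key (one byte at a time, because n is so small).
shared_key = secrets.token_bytes(16)
encrypted_key = [pow(byte, e, n) for byte in shared_key]

# The server recovers the shared key with its private key.
recovered_key = bytes(pow(c, d, n) for c in encrypted_key)
assert recovered_key == shared_key

# Step 2: both sides now use the faster shared-key cipher for the messages.
ciphertext = keystream_xor(shared_key, b"GET /secret HTTP/1.1")
print(keystream_xor(recovered_key, ciphertext))
```

Because XOR is its own inverse, applying `keystream_xor` with the same key a second time restores the original message.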
Chapter 8: Authentication to confirm the identity of the accessing user
Chapter 9: Protocols that add features on top of HTTP
9.2 SPDY: eliminating HTTP bottlenecks
HTTP bottlenecks:
- Only one request can be sent on a connection at a time
- Requests can only be initiated by the client; the client cannot receive any instruction other than a response
- Request/response headers are sent uncompressed; the more header information, the greater the delay
- Headers are verbose, and sending the same headers every time wastes more bandwidth
- Data compression is optional; compressed transmission is not mandatory
Approaches that try to eliminate the HTTP bottlenecks:
- Ajax: can fetch content from the server in real time, but may generate a large number of requests
- Comet: content can be updated in real time, but because the response is held open, each connection lasts longer and consumes more resources
- SPDY: can eliminate the bottlenecks effectively, but the benefit is limited when a website uses resources from multiple domains
9.3 WebSocket: full-duplex communication with the browser
Main features of the WebSocket protocol:
- Supports pushing data from the server to the client
- Reduces traffic: not only is the per-connection overhead lower, but the header information is also small
To establish WebSocket communication, the client sets the HTTP Upgrade header field to websocket, and the server answers this handshake request with status code 101 Switching Protocols. Once the handshake succeeds and the WebSocket connection is established, communication no longer uses HTTP data frames but WebSocket's own data frames.
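One detail of the handshake that these notes do not cover, taken from RFC 6455, is that the client also sends a Sec-WebSocket-Key header and the server must answer with a matching Sec-WebSocket-Accept value. A sketch of how the server side could compute it (the `websocket_accept` helper name is an assumption; the GUID and the sample key are the ones given in the RFC):

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"   # fixed value defined in RFC 6455


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute Sec-WebSocket-Accept: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


key = "dGhlIHNhbXBsZSBub25jZQ=="      # sample key from RFC 6455
print("HTTP/1.1 101 Switching Protocols")
print("Upgrade: websocket")
print("Connection: Upgrade")
print("Sec-WebSocket-Accept:", websocket_accept(key))   # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```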
WebSocket API
JavaScript can call the programming interface provided by the "WebSocket API" to carry out full-duplex communication over the WebSocket protocol.
Chapter 10: Technologies for building web content
Chapter 11: Web attack techniques
"Graphic http"