Notes on HTTP: The Definitive Guide, Chapters 1~6


A few days ago I read this classic book and have basically finished it. It contains a lot of useful information. Below are my notes on the first six chapters.

Chapter I

HTTP is an application-layer protocol, TCP works at the transport layer, and IP works at the network layer
TCP provides error-free data transfer, in-order delivery, and an unsegmented data stream (data of any size can be sent at any time)
An HTTP client needs to establish a TCP/IP connection to the server's IP address and port number before it can send a message to the server
The host name is part of the URL and can be converted to an IP address via DNS (Domain Name Service); the default port number for HTTP is 80
Steps for the browser to get HTML resources:

Parse the host name from the URL
Convert the host name to an IP address via DNS
Parse the port number from the URL (if any)
Establish a TCP connection
Send the HTTP request message
Receive the HTTP response message
Close the connection and display the document
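
The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, not how a real browser works: the function names parse_target, build_request and fetch are made up for this example, only plain http URLs are handled, and fetch needs network access to run.

```python
import socket
from urllib.parse import urlsplit

def parse_target(url):
    """Steps 1 and 3: pull host name, port and path out of an http URL."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80            # default HTTP port when none is given
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return host, port, path

def build_request(host, path):
    """Step 5: a minimal HTTP/1.1 GET request message."""
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n").encode("ascii")

def fetch(url):
    """Steps 1 to 7 end to end (hypothetical helper, requires network access)."""
    host, port, path = parse_target(url)
    ip = socket.gethostbyname(host)                    # step 2: DNS lookup
    with socket.create_connection((ip, port), timeout=10) as conn:  # step 4
        conn.sendall(build_request(host, path))        # step 5
        chunks = []
        while True:                                    # step 6: read until the server closes
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)                            # step 7: connection is closed here
```

The raw bytes returned by fetch still contain the status line and headers; displaying the document would mean splitting those off first.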

The commonly used version of HTTP is 1.1. HTTP-NG, also known as HTTP/2.0, is a proposed successor that focuses on significant performance optimizations (Chapter 10)
Other important components of the Web:

Proxy: an HTTP intermediary that sits between clients and servers (Chapter 6)
Cache: keeps copies of frequently used pages close to the client (Chapter 7)
Gateway: a special web server that connects to other applications (Chapter 8)
Tunnel: a special proxy that blindly forwards HTTP communications
Agent: a semi-intelligent web client that makes automated HTTP requests

Chapter II URLs and resources

A URL (uniform resource locator) is the resource location the browser needs in order to find information; it identifies a resource by where it is located
A URI is a uniform resource identifier. URIs come in two forms, URLs and URNs; a URN names a resource regardless of where it currently resides. HTTP deals only with the URL subset, so the two terms are sometimes not distinguished and "URI" then means a URL.
A URL is divided into the following three parts

Scheme (http:): tells web clients how to access the resource, here stating that the HTTP protocol is to be used
Host name: refers to the location of the server
Path: refers to the resource path, that is, which specific resource on the server is being requested.

URL syntax: scheme://user:password@host:port/path;params?query#fragment

params: input parameters that some schemes require; they are separated from the path by a semicolon (;)
query: follows the question mark (?) and is sent to the server. The fragment, after the hash (#), is handled by the client on its own
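
To see how the pieces of the general URL syntax map onto a concrete URL, here is a small sketch using Python's standard urllib.parse module. The URL itself (user joe, host www.example.com and so on) is invented for illustration.

```python
from urllib.parse import urlparse

# A made-up URL that exercises every component of the general syntax.
url = "http://joe:secret@www.example.com:8080/tools/hammer;type=d?item=12731#drills"
parts = urlparse(url)

components = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,          # parsed into an int
    "path": parts.path,
    "params": parts.params,      # the part after ";" in the last path segment
    "query": parts.query,        # sent to the server
    "fragment": parts.fragment,  # kept by the client, never sent
}

for name, value in components.items():
    print(f"{name:9}{value}")
```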

Relative URL: a URL such as ./ is interpreted relative to a base URL, from which the scheme://hostname part and the current path are taken
Automatic URL expansion:

Hostname expansion: for example, typing baidu is expanded to www.baidu.com
History expansion: the browser stores the URLs the user has visited and offers completions by matching what is typed against the prefixes of URLs in the history

URL character set and encoding mechanism:

Character set: URLs use the US-ASCII character set, which uses 7 bits to represent most keys found on an English typewriter plus a few non-printing control characters for text formatting and hardware signaling
Encoding mechanism: to get around the limits of the safe character set, unsafe characters are written in an escape notation: a percent sign (%) followed by two hexadecimal digits that give the ASCII code of the character
Character restrictions: some characters have a special meaning inside a URL and should not be used unencoded for anything else, such as % / . # ? ; : $ + @ & = etc.
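
The escape notation can be tried out with urllib.parse from the Python standard library. This is just a small sketch; percent_encode is a name chosen for this example, and passing safe="" is my choice so that reserved characters such as / are escaped too.

```python
from urllib.parse import quote, unquote

def percent_encode(text):
    """Escape every character outside the unreserved set as %XX."""
    # safe="" means that even "/" is escaped; letters, digits and "_.-" are kept as is.
    return quote(text, safe="")

encoded = percent_encode("a b&c=d/e")
print(encoded)                 # space becomes %20, & becomes %26, and so on
print(unquote("%7Ejoe"))       # 0x7E is the ASCII code of "~"
```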

Some common schemes: http, https, mailto, ftp, rtsp, rtspu, file, news, telnet
A URL tells you where a resource is located, so it stops working if the resource is moved. URNs are meant to solve this problem.

Chapter III HTTP Messages

An HTTP message is a block of data that is sent between HTTP applications.

Terminology: inbound and outbound describe the direction of a transaction. Messages travel inbound to the origin server and, when the work is done, travel outbound back to the user agent
All messages flow downstream: the sender of a message is always upstream of the receiver.

Components of the message:

HTTP messages are simple, formatted chunks of data, each containing a request from the client or a response from the server
They consist of three parts: a start line that describes the message, a header block containing attributes, and an optional body containing data

The syntax of the message:

All messages can be divided into two categories: Request message, Response message
Request message format:

Start line: method request-URL version
Header block: ...
Entity body: ...

Response message format:

Start line: version status-code reason-phrase
Header block: ...
Entity body: ...
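
As a rough illustration of the two formats, the sketch below splits a raw message into its three parts. parse_message and the sample messages are invented for this example; a real parser also has to deal with repeated headers, chunked bodies and malformed input.

```python
def parse_message(raw):
    """Split a raw HTTP message (bytes) into start line, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")       # a blank line ends the header block
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0].split(" ", 2)              # three fields in both message types
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

request = (b"GET /index.html HTTP/1.1\r\n"
           b"Host: www.example.com\r\n"
           b"Accept: text/html\r\n"
           b"\r\n")

response = (b"HTTP/1.1 404 Not Found\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: 9\r\n"
            b"\r\n"
            b"not found")

method, url, version = parse_message(request)[0]
version2, status, reason = parse_message(response)[0]
```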

Start line: all HTTP messages begin with a start line. The start line of a request message says what to do; the start line of a response message says what happened

The start line of a request message is called the request line. It contains a method that tells the server what to do; common methods are GET, POST, DELETE, HEAD, PUT, TRACE, and OPTIONS
Status code: tells the client what happened

100~199: defined 100~101, informational
200~299: defined 200~206, success
300~399: defined 300~305, redirection
400~499: defined 400~415, client error
500~599: defined 500~505, server error
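
Because the class of a status code is given by its first digit, a client can classify codes it has never seen before. A small sketch (status_class is a name made up for this example; http.HTTPStatus is from the Python standard library):

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def status_class(code):
    """Return the general class of a status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]

print(status_class(404), "/", HTTPStatus(404).phrase)
```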

Reason phrase: paired with the status code; it is a human-readable version of the status code
Version number: has the form HTTP/x.y and states the highest version the sender supports

Header: List of key-value pairs

General Header
Request Header: Provide request information
Response Header: Provide response information
Entity Header: Describes the length and content of the body, or the resource itself
Extension header: New header not defined in specification

Entity body: the entity body is the payload of the HTTP message, the content that is to be transported. It can carry many types of data: images, video, HTML documents, software applications, credit card transactions, email, etc.
Methods:

Safe methods: GET and HEAD are called safe methods; HTTP requests that use them should not cause any action on the server
GET: typically used to ask a server to send a resource
HEAD: similar to GET, but the server returns only the headers in the response and never the entity body. It can be used to:

Find out about a resource without actually fetching it
See whether an object exists by looking at the status code of the response
Test whether the resource has been modified by looking at the headers

PUT: writes a document to the server. Some publishing systems let users create web pages and install them on the web server using PUT
POST: designed to send input data to the server; it is usually used to support HTML forms, sending the data from a filled-in form to the server.
TRACE: when a client request passes through intermediaries, the original HTTP request may be modified. With TRACE, the server at the end of the chain sends back a TRACE response whose body contains the request message exactly as it arrived, so the client can see how the intermediate applications along the way affected the request
OPTIONS: asks the web server to report the capabilities it supports
DELETE: asks the server to delete the resource specified by the request URL; the client, however, has no guarantee that the deletion is actually carried out
Extension methods

Status code:

1XX: Informational status codes, introduced in HTTP/1.1

100 Continue: the initial part of the request was received and the client should continue
101 Switching Protocols: the server is switching, as requested by the client, to the protocol listed in the Upgrade header

2XX: Success status codes

200 OK: the request is fine
201 Created: used for requests that create objects on the server
202 Accepted: the request has been accepted, but the server has not yet performed any action on it
203 Non-Authoritative Information: the entity headers contain information that did not come from the origin server but from a copy of the resource
204 No Content: the response message contains a status line and headers, but no entity body
205 Reset Content: meant primarily for browsers; tells the browser to clear all form elements on the current page
206 Partial Content: a partial or range request was carried out successfully

3XX: Redirect Status code

300 Multiple Choices: the client requested a URL that actually refers to multiple resources
301 Moved Permanently: the requested URL has been moved; the Location header of the response contains the URL where the resource now resides
302 Found: similar to 301, but the client should use the URL given in the Location header only temporarily; future requests should still use the old URL
303 See Other: tells the client that the resource should be fetched using a different URL
304 Not Modified: the resource has not been modified; the client's local copy is up to date
305 Use Proxy: the resource must be accessed through a proxy
307 Temporary Redirect: similar to 302; the client should use the URL given in the Location header only temporarily

4XX: Client Error status code

400 Bad Request: tells the client that it sent a malformed request
401 Unauthorized: the client needs to authenticate itself before it can access the resource
402 Payment Required: reserved; not currently used
403 Forbidden: the server refused the request; the body of the response may describe the reason
404 Not Found: the server could not find the requested URL
405 Method Not Allowed: the requested URL does not support this method
406 Not Acceptable: the server has no resource matching the URL that the client is willing to accept
407 Proxy Authentication Required: similar to 401, but used by proxy servers that require authentication for a resource
408 Request Timeout: the client took too long to complete its request
409 Conflict: the request may cause a conflict on the resource
410 Gone: like 404, except that the server once held the resource
411 Length Required: the server requires a Content-Length header in the request message
412 Precondition Failed: the client made a conditional request and one of the conditions failed
413 Request Entity Too Large: the entity body sent by the client is too large
414 Request URI Too Long: the request URL sent is too long
415 Unsupported Media Type: the server cannot understand or support the content type of the entity the client sent
416 Requested Range Not Satisfiable: the request asked for a range of the specified resource, and the range is invalid or cannot be satisfied
417 Expectation Failed: the Expect request header contained an expectation that the server cannot meet

5XX: Server error status codes

500 Internal Server Error: the server encountered an error that prevented it from serving the request
501 Not Implemented: the client made a request that is beyond the capabilities of the server
502 Bad Gateway: a server acting as a proxy or gateway received a bogus response from the next link in the request chain, for example because it could not connect to its parent gateway
503 Service Unavailable: the server is temporarily unable to service the request
504 Gateway Timeout: similar to 408, except that the response comes from a gateway or proxy that timed out waiting for another server
505 HTTP Version Not Supported: the server received a request in a protocol version that it cannot or will not support

Headers: together with the method, they determine what the client and server can do

General headers: the most basic information about a message

Connection: lets clients and servers specify options about the request/response connection
Date: provides a date and time stamp telling when the message was created
MIME-Version: gives the version of MIME that the sender is using
General caching headers: Cache-Control is used to pass caching directives along with the message
...

Request Header

Client-IP
From
Host
Referer
...
Accept headers: tell the server which media types, character sets, encodings, languages, etc. the client is able to receive
Conditional request headers: let the client place restrictions on the request, e.g. Expect
Request security headers: challenge/response authentication for requests
Proxy request headers

Response Header: Provides some additional information to the client

Negotiation Header
Security Response Header

Entity Header: Provides a wealth of information about the entity and its content

Content Header: Provides specific information about the entity's content
Entity Cache Header: Describes how or when to cache

Extension header

Chapter IV Connection Management

TCP/IP is a layered set of packet-switched network protocols used by computers and network devices around the world.
TCP gives HTTP a reliable bit pipe: bytes stuffed into one end of a TCP connection come out the other end correctly and in the original order. TCP sends its data in small chunks called IP packets. When HTTP wants to transmit a message, it streams the contents of the message, in order, through an open TCP connection. TCP takes the data stream, chops it into chunks called segments, and transports the segments across the Internet inside IP packets.

TCP uses port numbers to keep all of these connections straight. A TCP connection is identified by four values: source IP address, source port number, destination IP address, and destination port number. No two different connections share all four values.
Sockets:

Server side:

Create a socket (create)
Bind a port (bind)
Listen for connections (listen)
Wait for a connection (accept)
Read and process the request (read)
Send back the response (write)
Close the connection (close)

Client side:

Create a socket (create)
Connect to the server's IP:port (connect); this establishes the TCP connection to the server
Once connected, send the request (write)
Receive and process the response (read)
Close the connection (close)
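
The two call sequences can be tried out on a single machine. The following is a minimal sketch, assuming the loopback address 127.0.0.1 is usable; serve_once and run_demo are names made up for this example, and the server handles exactly one connection in a helper thread.

```python
import socket
import threading

def serve_once(server_sock):
    """Server side: accept, read, write, close (one connection only)."""
    conn, _addr = server_sock.accept()              # wait for a connection
    with conn:                                      # close when done
        request = conn.recv(4096)                   # read the request
        body = b"you sent %d bytes" % len(request)
        head = (b"HTTP/1.1 200 OK\r\n"
                b"Content-Length: %d\r\n"
                b"Connection: close\r\n\r\n" % len(body))
        conn.sendall(head + body)                   # write the response

def run_demo():
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # create
    server_sock.bind(("127.0.0.1", 0))              # bind; port 0 lets the OS pick a free port
    server_sock.listen(1)                           # listen
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()

    # Client side: create_connection() does create + connect in one call.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    with client:
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")  # write
        reply = b""
        while True:                                 # read until the server closes
            chunk = client.recv(4096)
            if not chunk:
                break
            reply += chunk
    worker.join()
    server_sock.close()
    return reply
```

Calling run_demo() returns the raw response bytes that the client read from the connection.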

HTTP latency is mainly caused by TCP network delays. The main causes of delay in an HTTP transaction are:

The client needs to determine the IP address and port number of the web server from the URI. If the host name has not been looked up recently, DNS resolution can take tens of seconds.
The client sends a TCP connection request to the server and waits for the server to reply. Every new TCP connection incurs this setup delay, usually a second or two at most
After the connection is established, the client sends the request and the server processes it. It takes time for the request message to cross the Internet and for the server to process it.
The web server then sends back the HTTP response, which also takes time
In addition, the size of these TCP network delays depends on hardware speed, network and server load, the size of the request and response messages, and the distance between client and server. The technical intricacies of the TCP protocol add further delay

TCP connection handshake steps

To request a new TCP connection, the client sends a small TCP packet to the server with the SYN flag set, which marks it as a connection request
If the server accepts the connection, it sends back a TCP packet with both the SYN and ACK flags set, indicating that the request has been accepted
The client sends an acknowledgment back to the server to let it know that the connection was established successfully

7. Delayed acknowledgments: the receiver of each TCP segment returns a small acknowledgment packet to the sender when the segment arrives intact. If the sender does not receive the acknowledgment within a specified time, it concludes that the packet was lost or corrupted and resends the data.

8. HTTP connection handling (not read yet)

The Connection header field can carry three different types of tokens

HTTP header field names, listing headers that are relevant only to this connection
Arbitrary token values that describe non-standard options for this connection
The value close, which indicates that the persistent connection should be closed once the operation is complete

Serial transaction delays

Disadvantage: some browsers cannot determine the size of an object until it has been loaded, and therefore cannot decide where to place it. Until enough objects have been loaded, the screen stays blank, which is a poor user experience.
Techniques that address this: parallel connections (multiple TCP connections), persistent connections (reusing TCP connections), pipelined connections (concurrent requests over a shared TCP connection), and multiplexed connections (still experimental)

9. Parallel connections: HTTP allows clients to open multiple connections and perform multiple HTTP transactions in parallel

Parallel connections can make pages load faster: the request transactions overlap, and so do the connection delays.
Parallel connections are not always faster: when the client's network bandwidth is limited, objects loaded in parallel compete for it, and a large number of open connections consumes a lot of memory
Parallel connections can make pages "feel" faster

10. Persistent connections: HTTP/1.1 allows HTTP devices to keep TCP connections open after a transaction has ended and to reuse them for future HTTP requests. Non-persistent connections are closed after each transaction; persistent connections stay open across transactions until the client or the server decides to close them. ...

11. Pipelined connections: multiple requests can be queued before the responses arrive. While the first request is on its way across the network to the server, the second and third requests can already be sent, which reduces network round trips and improves performance. Pipelined connections have the following restrictions:

An HTTP client should not pipeline unless it is sure that the connection is persistent
HTTP responses must be returned in the same order as the requests
The HTTP client must be prepared for the connection to close at any time and be ready to resend all outstanding pipelined requests
HTTP clients should not pipeline requests that have side effects, such as POST
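
The last restriction can be sketched as a small scheduling rule: idempotent requests may share a pipeline, while a request with side effects is sent on its own. The helper names and the batching strategy below are assumptions made for this example; the set of idempotent methods follows the book (GET, HEAD, PUT, DELETE, TRACE, OPTIONS).

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "TRACE", "OPTIONS"}

def can_resend_safely(method):
    """True if repeating the request gives the same result as sending it once."""
    return method.upper() in IDEMPOTENT_METHODS

def pipeline_batches(methods):
    """Group consecutive idempotent requests; every other request gets its own batch."""
    batches, current = [], []
    for method in methods:
        if can_resend_safely(method):
            current.append(method)
        else:
            if current:                 # flush the pipeline before the risky request
                batches.append(current)
                current = []
            batches.append([method])    # sent alone; wait for its response
    if current:
        batches.append(current)
    return batches

print(pipeline_batches(["GET", "GET", "POST", "GET", "HEAD"]))
```

Each inner list stands for requests that may be written back to back; the client waits for all responses of one batch before starting the next.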

12. Close the connection:

Content-Length and truncation: if the actual length of the entity does not match Content-Length, the receiver should question the correctness of the length; a caching proxy should not cache such a response
If a transaction yields the same result whether it is executed once or many times, it is idempotent; GET, HEAD, PUT, DELETE, TRACE, and OPTIONS are idempotent. Clients should not pipeline non-idempotent requests such as POST, because the consequences of a premature close are unpredictable. To send a non-idempotent request, the client should wait for the response status of the previous request.
Graceful close: TCP connections are bidirectional. Calling the socket's close() closes both the input and output channels of the connection, which is a full close. Calling shutdown() closes the input or the output channel individually, which is a half close. Simple HTTP applications can get by with full closes only. When a client or server needs to close a connection unexpectedly, it should "gracefully close the transport connection."
Which channel is closed matters. Closing the output channel of your connection is the safer choice: once all the data has been read from its buffer, the peer at the other end is notified that the stream has ended. Closing the input channel is dangerous: if the peer then sends data to it, the operating system answers with a connection reset, which most operating systems treat as a serious error.
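
A half close is easy to observe with a pair of connected stream sockets. The sketch below uses socket.socketpair() instead of a real TCP connection so that it runs without a network; that substitution, and the name half_close_demo, are my own choices for illustration. shutdown(SHUT_WR) behaves the same way on a TCP socket.

```python
import socket

def read_until_eof(sock):
    """Read until the peer has closed its output channel."""
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:              # empty read: the peer's stream has ended
            return data
        data += chunk

def half_close_demo():
    a, b = socket.socketpair()     # two connected stream sockets
    with a, b:
        a.sendall(b"last request")
        a.shutdown(socket.SHUT_WR)         # half close: only a's output channel
        received = read_until_eof(b)       # b sees all data, then end of stream
        b.sendall(b"reply after EOF")      # b's output and a's input still work
        b.shutdown(socket.SHUT_WR)
        reply = read_until_eof(a)
    return received, reply

print(half_close_demo())
```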

Chapter V Web Servers

All web servers, regardless of type or size, accept HTTP requests for resources and send content back to the client

The actual web server will do the following things

Set up connection: accept a client connection, or close it if the client is unwanted
Receive request: read an HTTP request message from the network
Process request: interpret the request message and take action
Access resource: access the resource specified in the message
Construct response: create an HTTP response message with the right headers
Send response: send the response back to the client
Log transaction: record the completed transaction in a log file

Accepting client connections

Handling new connections: when a client requests a TCP connection to the web server, the server establishes the connection and determines which client is on the other side by extracting the IP address from the TCP connection. Once the connection is established, the server is ready to watch for data on the new connection. The web server is free to reject or immediately close any connection.
Client hostname identification: a web server can be configured to use reverse DNS to convert the client IP address into a client hostname, but many web servers disable or limit this feature.
Determining the client user through ident: the server uses the ident protocol to find out which user name initiated the new connection, by querying the client and parsing the reply that contains the user name. This can work well inside an organization, but for many reasons it does not work well across the public Internet

Receiving request messages

Parsing the request message: parse the request line to find the request method, the specified URI, and the version number; read the headers; then read the request body, if any, whose length is given by the Content-Length header
Internal representation of messages: the server stores the request message in internal data structures, for example placing the headers in a fast lookup table so that the value of a particular header can be accessed quickly.
Connection input/output processing architectures: because requests can arrive at any time, web servers constantly watch for new web requests

Single-threaded web servers: process one request at a time until it is complete, then move on to the next connection. All other connections are ignored during processing, which causes serious performance problems.
Multi-process and multi-threaded web servers: threads/processes can be created on demand or allocated in advance
Multiplexed I/O servers: all connections are watched for activity at the same time. When the state of a connection changes, a small amount of processing is done on it; when that is complete, the connection is returned to the open connection list to wait for the next change of state.
Multiplexed multi-threaded web servers: combine multithreading with multiplexing to take advantage of multiple CPUs in the computer platform

Processing requests: Once the Web server receives the request, it can process the request based on the method, resource, header, and optional body part
Mapping and access to resources: Web servers are resource servers responsible for sending pre-created content, such as HTML pages, JPEG images, and dynamic content generated by resource generators running on the server

Docroot: the web server's file system has a dedicated folder for web content, called the document root or docroot. The web server takes the URI from the request message and appends it to the document root. In Apache, the document root can be set in the httpd.conf file with the line DocumentRoot /usr/local/httpd/files. The server must not let relative URLs back out of the docroot; for example, http://www.yf403.cn/../ is not allowed

Virtually hosted docroots: configure a VirtualHost block for each virtual web site; each virtual server contains its own DocumentRoot
User home directory docroots
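
The docroot mapping, including the rule that a URL must not back out of the docroot, could be sketched like this. map_to_file is a hypothetical helper and the docroot path is the one from the Apache example; the check is purely textual and does not look at symbolic links, which a real server would also have to consider.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCROOT = "/usr/local/httpd/files"

def map_to_file(request_uri, docroot=DOCROOT):
    """Append the URI path to the docroot; return None if it escapes the docroot."""
    path = unquote(urlsplit(request_uri).path)       # decode %XX escapes first
    # normpath collapses "." and ".." segments in the joined path
    candidate = posixpath.normpath(posixpath.join(docroot, path.lstrip("/")))
    if candidate != docroot and not candidate.startswith(docroot + "/"):
        return None                                  # tried to back out of the docroot
    return candidate

print(map_to_file("/specials/saw-blade.gif"))
print(map_to_file("/../../etc/passwd"))
```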

Directory listings: a web server can receive requests for directory URLs, where the path resolves to a directory. Most servers look for a file named index.html to represent the directory. In Apache, the DirectoryIndex directive configures the set of file names that are used as default directory files, and the directive Options -Indexes disables the automatic generation of directory index files
Mapping of dynamic content resources: the web server can also map URIs to dynamic resources, that is, to programs that generate content on demand.

Apache lets the user map URI path components to directories of executable programs. For example, the following directive says that for every URI whose path starts with /cgi-bin/, the corresponding program should be found in the directory /usr/local/etc/httpd/cgi-programs/ and executed: ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-programs/
Apache also lets users mark executable files with a special file extension, so that executable scripts can be placed in any directory. The following Apache configuration directive specifies that all web resources ending in .cgi should be executed: AddHandler cgi-script .cgi
Modern application servers have more powerful and efficient server-side dynamic content support, including Microsoft's Active Server Pages (ASP) and Java servlets

Server-side includes (SSI): if a resource is flagged as containing server-side includes, the server processes the contents of the resource before sending it to the client, scanning the content for special patterns, which may be variable names or embedded scripts. This is one way to create dynamic content
Access control

Build response: once the server has identified the resource, it performs the action and returns the response message

Response entity: Content-Type describes the MIME type of the body; Content-Length describes the length of the response body; these are followed by the actual body content
MIME typing: there are several ways to determine the MIME type:

File extension
Magic typing: scan the contents of each resource and match them against a table of known patterns to determine the MIME type
Explicit typing: configure the server to force particular files or directories to have a given type, regardless of file extension and content
Type negotiation: configure the web server to negotiate with the user to decide which format to use
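
The first three typing methods can be combined into a simple fallback chain. This is only a sketch: the function names, the tiny signature table and the order of the fallbacks are my own assumptions, while mimetypes.guess_type is the Python standard library's extension-based lookup.

```python
import mimetypes

# A deliberately tiny table of well-known file signatures ("magic numbers").
MAGIC_TABLE = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]

def type_by_extension(filename):
    mime, _encoding = mimetypes.guess_type(filename)
    return mime                               # None if the extension is unknown

def type_by_magic(content):
    for signature, mime in MAGIC_TABLE:
        if content.startswith(signature):
            return mime
    return None

def content_type(filename, content, forced=None):
    """Explicit typing wins, then the extension, then the magic table."""
    return (forced
            or type_by_extension(filename)
            or type_by_magic(content)
            or "application/octet-stream")

print(content_type("index.html", b"..."))
```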

Redirection: the status code is 3XX, and the Location response header contains the URI of the new or preferred location of the content. Typical uses:

Permanently moved resources: 301, giving the new URL so that clients can update bookmarks and the like
Temporarily moved resources: 303 and 307. The resource has been temporarily moved or renamed and the client is redirected to the new URL; because the move is temporary, bookmarks should not be updated
URL augmentation: the server uses a redirect to rewrite the URL so that the new URL carries state information; 303/307
Load balancing: an overloaded server redirects requests to a less heavily loaded server; 303/307
Server affinity: the server redirects the client to the server that holds information about that client; 303/307
Canonical directory names: when the requested URI for a directory lacks a trailing slash, most servers redirect the client to the URI with the slash added

Send response: for a non-persistent connection, the server closes its side of the connection after sending the complete message. For a persistent connection, the Content-Length header must be computed correctly, or the client will not know when the response ends.
Logging: record the completed transaction in a log file
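
The remark about Content-Length in the send-response step is worth a small sketch: the header has to count the bytes of the encoded body, not its characters. build_response is a name invented for this example, and it only produces 200 responses.

```python
def build_response(body, content_type="text/html; charset=utf-8", keep_alive=True):
    """Build a minimal HTTP/1.1 200 response with a correct Content-Length."""
    if isinstance(body, str):
        body = body.encode("utf-8")
    header_lines = [
        "HTTP/1.1 200 OK",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",     # length in bytes, after encoding
        "Connection: " + ("keep-alive" if keep_alive else "close"),
    ]
    return "\r\n".join(header_lines).encode("ascii") + b"\r\n\r\n" + body

message = build_response("héllo")           # 5 characters, but 6 bytes in UTF-8
print(message)
```

On a persistent connection the client reads exactly Content-Length bytes of body and then treats whatever follows as the start of the next response.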

Chapter VI Proxies

A web proxy server is a network intermediary that sits between clients and servers and passes HTTP messages back and forth between them. To the client, the proxy acts as a server: it must accept request messages and return response messages. To the server, the proxy acts as a client: it must send request messages and receive response messages.

Private and shared proxies: a proxy dedicated to a single client is called a private proxy; a proxy shared by multiple clients is called a public proxy

Public proxies: most proxies are of this kind; they are easier to manage
Private proxies: not very common

Proxies compared with gateways:

A proxy connects two or more applications that speak the same protocol
A gateway connects two or more endpoints that speak different protocols, acting as a "protocol converter."
In practice, proxies often have to do some protocol conversion as well.

What proxies are for: improving security, improving performance, and saving money. A proxy server can see and touch all the HTTP traffic that flows through it, so it can monitor and modify the traffic. Some example uses:

Child filter: blocks content that is inappropriate for children while allowing access to other sites
Centralized document access control: implements a uniform access control policy across web servers and web resources and creates an audit trail
Security firewall: restricts, at a single secure point, which application-level protocols may flow into or out of an organization, and can inspect the traffic to eliminate viruses
Web cache: maintains local copies of popular documents and serves them on demand to reduce slow and expensive Internet traffic
Reverse proxy: a proxy can masquerade as a web server, receiving real requests addressed to the web server and then communicating with other servers to locate the requested content on demand. It can be used to improve the performance of slow web servers for common content, in which case it is known as a server accelerator
Content router: directs requests to particular web servers based on Internet traffic conditions and content type
Transcoder: modifies the body format of content before delivering it to the client; this transparent conversion between data representations is called transcoding.
Anonymizer: actively removes identifying characteristics from HTTP messages, providing a high degree of privacy and anonymity.

Proxy server deployment

Egress proxy: placed at the exit point of a local network to control the traffic flowing between the local network and the Internet
Access (ingress) proxy: placed at ISP access points to process the aggregate requests from customers
Reverse proxy (surrogate): deployed at the edge of the network in front of web servers. It handles the requests directed at the web server and asks the server for resources only when necessary, which improves performance.
Network exchange proxy: placed at Internet peering exchange points between networks to relieve congestion at those junctions

Proxy hierarchies

Static proxy hierarchy
Dynamic parent selection: the proxy chooses its parent proxy based on:

Load balancing
Geographic proximity routing
Protocol/type routing
Subscription-based routing

How proxies get traffic:

Modify the client: configure the browser's proxy settings manually or automatically
Modify the network: network infrastructure intercepts web traffic and steers it into a proxy. The interception typically relies on switching and routing devices that watch for HTTP traffic. Such a proxy is called an "intercepting proxy"
Modify the DNS namespace: surrogates, which are proxy servers placed in front of web servers, take over the name and IP address of the web server. This can be arranged by manually editing the DNS tables or by using dynamic DNS servers that determine the appropriate proxy or server
Modify the web server: configure the web server to send an HTTP 305 redirection command to the client, redirecting client requests to a proxy

Client proxy settings (note: network, DNS, and server configuration are covered in Chapter 20)

Manual configuration: set the proxy by hand, e.g. in Internet Options in IE; only one proxy server can be set
Pre-configuration: uses a PAC file, a small JavaScript program that computes the proxy settings; it is set up in a similar way to manual configuration
Proxy auto-configuration: provide a URI to a proxy auto-configuration (PAC) file written in JavaScript; the client fetches and runs it to decide whether to use a proxy
WPAD (Web Proxy Autodiscovery Protocol): automatically detects which configuration server the auto-configuration file should be downloaded from. The protocol's algorithm uses an escalating strategy of discovery mechanisms to find the appropriate PAC file for the browser. WPAD tries several discovery techniques in order: DHCP, SLP, DNS well-known hostnames, DNS SRV records, and DNS service URIs in TXT records. A client that implements WPAD needs to:

Use WPAD to find the URI of the PAC file
Fetch the PAC file from that URI
Execute the PAC file to determine the proxy server
Use the proxy server for requests
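
Real PAC files are JavaScript and define a function named FindProxyForURL(url, host) that returns strings such as "DIRECT" or "PROXY host:port". As a rough illustration of the kind of decision such a file makes, here is a minimal Python sketch. The function name, the rules, and every hostname in it are made up for the example; it is not a PAC implementation.

```python
from urllib.parse import urlsplit


def find_proxy_for_url(url):
    """Mimic the decision a PAC file's FindProxyForURL(url, host) makes.

    Returns "DIRECT" or "PROXY host:port", the same string forms a PAC
    file returns. All rules and hostnames below are hypothetical.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Plain hostnames (no dots) are assumed to be on the local network.
    if "." not in host:
        return "DIRECT"
    # Hypothetical internal domain that should bypass the proxy.
    if host.endswith(".intranet.example"):
        return "DIRECT"
    # Protocol-based routing: a separate (hypothetical) proxy for FTP.
    if parts.scheme == "ftp":
        return "PROXY ftp-proxy.example:8080"
    return "PROXY web-proxy.example:8080"
```

For instance, find_proxy_for_url("http://printer/status") takes the first branch and returns "DIRECT", while an ordinary external HTTP URL falls through to the default proxy.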

Some tricky issues with proxy requests

Proxy URIs differ from server URIs: when a client sends a request to a Web server, the request line contains only a partial URI (no scheme, host, or port), but when it sends a request to a proxy, the request line contains the full URI

This is a legacy of the original HTTP design: clients talked directly to a single server, there were no virtual hosts and no provisions for proxies, and a single server knew its own hostname and port
So a partial URI is sent to servers and a full URI to proxies: when the client has no proxy configured it sends a partial URI, and when a proxy is configured it sends the full URI
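
A small Python sketch of the difference, assuming a hypothetical helper request_line that builds only the first line of the request (www.example.com is a placeholder host):

```python
from urllib.parse import urlsplit


def request_line(method, url, via_proxy):
    """Build the HTTP request line for a direct or a proxied request."""
    if via_proxy:
        # A client configured to use a proxy sends the full URI.
        target = url
    else:
        # A client talking to the server directly sends only the
        # path (and query); scheme, host, and port are left out.
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"{method} {target} HTTP/1.1"
```

For the URL http://www.example.com/index.html, the direct form is "GET /index.html HTTP/1.1" and the proxied form is "GET http://www.example.com/index.html HTTP/1.1".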

The same problem as virtual hosting: the scheme/host/port information is missing

Explicit proxies solve the problem by requiring the full URI in the request message
Virtually hosted Web servers require a Host header that carries the host and port information

Interception proxies receive partial URIs: the client does not always know that it is talking to a proxy, because some proxies are invisible to it. Client traffic may pass through a surrogate or an interception proxy, and in both cases the client sends a partial URI rather than the full one

Reverse proxy: it takes the origin server's place, usually by assuming the server's hostname or IP address. The client cannot tell the reverse proxy from the Web server, so it sends a partial URI
Interception proxy: intercepts requests on their way from the client to the server and forwards them. It receives the partial URIs that the client meant for the Web server

A proxy can handle both proxy requests and server requests

If a full URI is provided, the proxy should use it
If a partial URI is provided along with a Host header, the Host header should be used to determine the origin server's name and port number
If a partial URI is provided and there is no Host header, the origin server has to be determined in some other way
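
The three rules above can be sketched in Python as follows. The function name origin_server, the fallback parameter (standing in for "some other way", such as a configured server address), and the header dictionary with exact "Host" key are assumptions made for this example. IPv6 literals in the Host header are not handled.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_server(request_target, headers, fallback=None):
    """Return (host, port) of the origin server, or the fallback value."""
    # Rule 1: a full URI names the origin server itself.
    if "://" in request_target:
        parts = urlsplit(request_target)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme, 80)
        return parts.hostname, port
    # Rule 2: a partial URI plus a Host header.
    host_header = headers.get("Host")
    if host_header:
        host, sep, port = host_header.partition(":")
        return host, int(port) if sep else 80
    # Rule 3: no Host header, so rely on information obtained some
    # other way (here: whatever the caller passed as fallback).
    return fallback
```

A caller acting as a reverse proxy might pass its configured real server address as the fallback; if nothing is known, the function returns None and the proxy would have to report an error.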

Modification of URIs during forwarding: ...
Client auto-expansion and hostname resolution of URIs: users often leave out the "www" prefix or the ".com" suffix, and the browser expands the name automatically
URI resolution without a proxy: the user enters 'aaa' -> the browser looks up host 'aaa' in DNS and the lookup fails -> the browser auto-expands the name to www.aaa.com -> the browser looks up host 'www.aaa.com' in DNS and gets an IP address -> the browser tries to connect until a connection is established
URI resolution with an explicit proxy: the user enters 'aaa' -> the browser looks up the proxy server's address in DNS and gets the proxy's IP address -> the browser connects to the proxy and passes the name on unexpanded; the browser no longer expands the name itself, so any expansion to www.aaa.com has to be done by the proxy
URI resolution with an interception proxy: the client has no proxy configured, similar to H
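
The no-proxy expansion sequence can be sketched in Python like this. The function expand_hostname and the resolves callback (which stands in for a DNS lookup and returns True when a name resolves) are hypothetical; real browsers apply more elaborate rules.

```python
def expand_hostname(name, resolves):
    """Return the first candidate name that resolves, or None.

    `resolves` is a caller-supplied function standing in for a DNS
    lookup; it takes a hostname and returns True or False.
    """
    candidates = [name]
    # Only bare names (no dots) get the "www." + name + ".com" treatment.
    if "." not in name:
        candidates.append(f"www.{name}.com")
    for candidate in candidates:
        if resolves(candidate):
            return candidate
    return None
```

With a resolver that only knows 'www.aaa.com', the input 'aaa' fails on the first try and succeeds after expansion, which mirrors the sequence described for the no-proxy case.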

Tracing messages

Via header: lists information about each intermediate node (proxy or gateway) on the message's path. Each entry contains a protocol name (optional), a protocol version, a node name, and a comment (optional)
Via request and response paths: the response path is usually the reverse of the request path
Via and gateways: the Via header records protocol conversions made inside gateways
Server and Via headers: the Server response header describes the origin server's software; a proxy should not modify it and should add a Via entry instead
Via privacy and security: a Via string should sometimes avoid exact hostnames and port numbers, which could be exploited maliciously; consecutive Via entries can also be combined into a single entry
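
As an illustration of the entry format, here is a minimal Python sketch of how a node could add itself to the Via header. The helper names via_entry and append_via, and the node names used with them, are invented for the example.

```python
def via_entry(version, node, protocol=None, comment=None):
    """Format one Via waypoint: [protocol/]version node [(comment)]."""
    received = f"{protocol}/{version}" if protocol else version
    entry = f"{received} {node}"
    if comment:
        entry += f" ({comment})"
    return entry


def append_via(headers, entry):
    """Return a copy of headers with entry added to the end of Via."""
    updated = dict(headers)
    existing = updated.get("Via")
    # Each node appends itself, so the list reads in forwarding order.
    updated["Via"] = f"{existing}, {entry}" if existing else entry
    return updated
```

A message that passed through two hypothetical nodes, proxy-a and proxy-b, would end up with a header such as "Via: 1.1 proxy-a, 1.0 proxy-b".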

TRACE method: lets the user trace a request message along the chain of proxies, see which proxies it passes through, and see how each proxy modifies it. The Max-Forwards header limits the number of hops: every node that forwards the message decrements the value by one, and a node that receives the message with Max-Forwards equal to 0 sends the TRACE message back to the client instead of forwarding it.
Proxy authentication: blocks requests for content until the user provides valid access credentials to the proxy

When a request for restricted content reaches the proxy server, the proxy returns a 407 (Proxy Authentication Required) status code, together with a Proxy-Authenticate header field that describes how to provide the credentials
When the client receives the 407, it collects the required credentials from a local database or by prompting the user
Once it has the credentials, the client sends a new request that carries them in the Proxy-Authorization header field
If the credentials are valid, the proxy passes the request along; otherwise it sends another 407 reply
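
The Max-Forwards rule can be sketched in Python as below. The function name handle_max_forwards and the "reflect"/"forward" labels are made up for the example, and the header is assumed to hold a valid non-negative integer.

```python
def handle_max_forwards(headers):
    """Decide what a node does with a TRACE request.

    Returns ("reflect", headers) when the message must be sent back to
    the client, or ("forward", updated_headers) otherwise.
    """
    value = headers.get("Max-Forwards")
    if value is None:
        # No limit was set, so the message is forwarded unchanged.
        return "forward", dict(headers)
    remaining = int(value)
    if remaining == 0:
        # Hop budget exhausted: answer the client from this node.
        return "reflect", dict(headers)
    updated = dict(headers)
    updated["Max-Forwards"] = str(remaining - 1)
    return "forward", updated
```

Sending TRACE requests with Max-Forwards set to 0, 1, 2, and so on lets the client hear from each node in the chain in turn.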
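
The exchange above can be sketched from the proxy's side in Python. This sketch assumes the Basic authentication scheme, which these notes do not specify; the realm, the user table, and the function name check_proxy_auth are all hypothetical, and a real proxy would not keep plain-text passwords in memory like this.

```python
import base64

REALM = "example-proxy"          # hypothetical realm name
USERS = {"alice": "s3cret"}      # hypothetical credential store


def check_proxy_auth(headers):
    """Return None if the request may be forwarded, else a 407 reply.

    The 407 reply is a (status, headers) pair whose Proxy-Authenticate
    header tells the client how to supply credentials.
    """
    challenge = (407, {"Proxy-Authenticate": f'Basic realm="{REALM}"'})
    value = headers.get("Proxy-Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic" or not token:
        return challenge
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes alike.
        return challenge
    user, sep, password = decoded.partition(":")
    if sep and USERS.get(user) == password:
        return None
    return challenge
```

A request without a Proxy-Authorization header gets the 407 challenge; a retry carrying valid credentials returns None, meaning the proxy can forward it.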

Proxy interoperation

Handling unsupported headers and methods: a proxy must forward header fields it does not recognize, and should try to forward messages whose method it does not support to the next hop
OPTIONS method: discovers support for optional features, so a client can determine a server's capabilities before interacting with it
Allow header: lists the methods supported by the resource identified by the request URI, for example Allow: GET, HEAD, PUT; a proxy cannot modify the Allow header field
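
A minimal Python sketch of both sides of this exchange: building an OPTIONS request and reading the Allow header from the reply. The helper names and the host www.example.com are placeholders chosen for the example.

```python
def options_request(host, target="*"):
    """Build an OPTIONS request; "*" asks about the server in general."""
    return f"OPTIONS {target} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def parse_allow(value):
    """Split an Allow header value into a set of method names."""
    # Method names are case-sensitive, so they are kept as received.
    return {m.strip() for m in value.split(",") if m.strip()}
```

A client could then check "PUT" in parse_allow("GET, HEAD, PUT") before attempting an upload.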


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
