Remarks: Proxy-Connection in the HTTP protocol
When capturing traffic with the Chrome developer tools, you will often see the Proxy-Connection request header. For a long time I did not know where it came from or what it meant. After reading chapter 4, "Connection Management", and chapter 6, "Proxies", of HTTP: The Definitive Guide, I finally understood: it appears because a proxy has been set for the browser. Fiddler captures traffic by routing browser requests through its own local proxy, so whenever Fiddler is running, this request header will inevitably appear.
What has the proxy changed?
To understand this thoroughly, first look at how the HTTP request message changes once a proxy is configured in the browser. Below are the requests for the same URL, first without a proxy and then through a proxy (irrelevant content omitted):
    GET / HTTP/1.1
    Host: www.example.com
    Connection: keep-alive

    GET http://www.example.com/ HTTP/1.1
    Host: www.example.com
    Proxy-Connection: keep-alive
Once a proxy is set, the browser connects to the proxy server instead of the target server; this change is not visible in the request message. Two changes are visible: the request URL in the first line becomes a complete URL, and the Connection request header is replaced by Proxy-Connection. Let's look at each in turn.
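The two visible changes can be reproduced with a small sketch. The function name build_request and its via_proxy flag are made up for illustration; it only mimics the behaviour described here and is not how any particular browser is implemented.

```python
from urllib.parse import urlsplit


def build_request(url: str, via_proxy: bool) -> str:
    """Build a minimal GET request as a browser might send it (illustrative only)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    if via_proxy:
        # Explicit proxy: complete URL in the request line, Proxy-Connection header.
        target = f"{parts.scheme}://{parts.netloc}{path}"
        connection_header = "Proxy-Connection"
    else:
        # Direct connection: only the path, plain Connection header.
        target = path
        connection_header = "Connection"

    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {parts.netloc}",
        f"{connection_header}: keep-alive",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_request("http://www.example.com/", via_proxy=False))
    print(build_request("http://www.example.com/", via_proxy=True))
```

Running the script prints both request messages, which should match the pair shown earlier.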
Why is the complete path required?
In the original HTTP design, the browser talked directly to a single server, and there were no virtual hosts. A single server always knows its own host name and port, so to avoid redundancy the browser only had to send the part of the URI after the host name. Once proxies appeared, partial URIs became a problem: the proxy server cannot tell which host the user wants to reach. For this reason, HTTP/1.0 requires the browser to send a complete URI in requests to a proxy, the form the specification calls an absolute URI.
Once a proxy is explicitly configured in the browser, the browser uses the complete URI in subsequent requests, which solves the problem of the proxy being unable to locate the resource. However, a proxy can sit anywhere along the connection, and many proxies are invisible to the browser, such as reverse proxies or proxies on a router. In practice, almost all browsers add a Host request header containing the host name to every request, which fully solves the virtual host problem. For HTTP/1.1 requests, the Host request header is mandatory; without it, the client receives a 400 error. For HTTP/1.0 requests sent to a proxy, an error occurs if a relative URI is used and there is no Host request header.
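The rules above for finding the target host can be sketched as a small function. The name resolve_host and the (status, host) return shape are hypothetical, and returning 400 for the HTTP/1.0 case is an assumption: the text only says that an error occurs.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit


def resolve_host(request_line: str, headers: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (status, host) for a request, following the rules described in the text."""
    _method, target, version = request_line.split()
    # Header names are case-insensitive, so compare in lower case.
    host_header = next((v for k, v in headers.items() if k.lower() == "host"), None)

    if version == "HTTP/1.1" and host_header is None:
        return 400, None  # Host is mandatory in HTTP/1.1
    if "://" in target:
        return 200, urlsplit(target).netloc  # complete URI names the host itself
    if host_header is not None:
        return 200, host_header  # relative URI: fall back to the Host header
    return 400, None  # relative URI and no Host: the host cannot be determined


if __name__ == "__main__":
    print(resolve_host("GET / HTTP/1.1", {"Host": "www.example.com"}))
    print(resolve_host("GET http://www.example.com/ HTTP/1.0", {}))
    print(resolve_host("GET / HTTP/1.0", {}))
```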
What is Proxy-Connection?
The Connection header in HTTP describes options for the current HTTP connection. Multiple directives are separated by commas, for example:
    GET / HTTP/1.1
    Host: www.example.com
    Connection: my-header, close, my-connection
    My-Header: xxx
"my-header" is the name of another header in this request (case-insensitive); listing it marks that header as relevant only to the current connection. The Connection header itself likewise applies only to the current connection. When one or more intermediaries (such as proxies) sit between the client and the server, each request message travels hop by hop from the client (usually a browser) to the server, and the server's response travels back to the client hop by hop as well. Normally, no matter how many proxies a message passes through, the request headers reach the server unchanged and the response headers reach the client unchanged. However, Connection and the headers it names describe only the connection between the previous node and the current node. They must be removed before the message is passed to the next node; otherwise they can cause the problems described later. Other headers that must not be forwarded include Proxy-Authenticate, Proxy-Connection, Transfer-Encoding, and Upgrade.
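As a rough sketch of what a well-behaved proxy does before forwarding a message, the function below drops Connection, every header that Connection names, and the other non-forwardable headers. The name strip_hop_by_hop is hypothetical, and the fixed set contains only the headers mentioned in the text, so it should not be read as a complete list.

```python
from typing import List, Tuple

# Headers the text says must not be forwarded, plus Connection itself.
HOP_BY_HOP = {
    "connection",
    "proxy-authenticate",
    "proxy-connection",
    "transfer-encoding",
    "upgrade",
}


def strip_hop_by_hop(headers: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return only the headers that may be passed on to the next node."""
    named = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Each comma-separated token may name another header to drop.
            named.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | named
    return [(name, value) for name, value in headers if name.lower() not in drop]


if __name__ == "__main__":
    request_headers = [
        ("Host", "www.example.com"),
        ("Connection", "my-header, close, my-connection"),
        ("My-Header", "xxx"),
    ]
    print(strip_hop_by_hop(request_headers))
```

With the example request above, only the Host header survives: Connection is dropped as such, and My-Header is dropped because Connection names it.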
"close" indicates that the current connection should be closed once the operation completes. Connection also accepts arbitrary strings as values, such as "my-connection", to carry custom connection options. HTTP/1.0 does not support persistent connections by default, so many HTTP/1.0 browsers and servers use the custom "Keep-Alive" option to negotiate them: the browser adds Connection: Keep-Alive to the request headers, the server returns the same header, and the connection is kept open for later use. In HTTP/1.1, Connection: Keep-Alive has lost its meaning, because HTTP/1.1 connections are persistent by default unless Connection is explicitly set to close.
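The persistence rules just described fit in a few lines. This is a minimal sketch; the function name wants_persistent is made up, and it only looks at the protocol version and the Connection value.

```python
def wants_persistent(version: str, connection: str = "") -> bool:
    """Decide whether a message asks for a persistent connection."""
    tokens = {t.strip().lower() for t in connection.split(",") if t.strip()}
    if "close" in tokens:
        return False  # an explicit close always wins
    if version == "HTTP/1.1":
        return True  # persistent by default
    return "keep-alive" in tokens  # HTTP/1.0 must opt in with Keep-Alive


if __name__ == "__main__":
    print(wants_persistent("HTTP/1.0"))
    print(wants_persistent("HTTP/1.0", "Keep-Alive"))
    print(wants_persistent("HTTP/1.1"))
    print(wants_persistent("HTTP/1.1", "close"))
```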
With this background, we can look at the actual problem. A large number of simple, outdated proxy servers are still running on the Internet, and they may not understand Connection at all, whether in a request or a response. When a proxy meets a header it does not recognize, it usually just forwards it. In most cases that is the right choice: many applications built on HTTP extend its headers, and they would stop working if proxies dropped the extension fields.
If the browser sends Connection: Keep-Alive to such a proxy, things get complicated. The proxy does not understand the header and passes it on to the server. If the server does not understand it either, nothing bad happens. If the server does understand it, the result is a mess. The server has no idea that Keep-Alive was forwarded by mistake, so it assumes the proxy wants a persistent connection, agrees, and returns Keep-Alive in its response. The proxy relays that header to the browser unchanged and then waits for the server to close the connection. But the server, following Keep-Alive, keeps the connection open even after all the data has been sent. Meanwhile, the browser, having received Keep-Alive, reuses the existing connection for its next requests, but the proxy does not expect any further requests on that connection and ignores them. The browser therefore hangs until the connection times out.
The root cause is that the proxy forwards headers that must not be forwarded. Upgrading every old proxy is not easy, however, so browser vendors and proxy implementers agreed on a workaround: once a proxy is explicitly configured, the browser sends Proxy-Connection in place of the Connection request header. An old proxy that does not recognize this header still forwards it to the server, but the server does not recognize it either. As a result, no persistent connection is established between the proxy and the server (an HTTP/1.0 proxy cannot handle Connection correctly anyway), the server does not return Keep-Alive, and no persistent connection is established between the proxy and the browser. A newer proxy understands Proxy-Connection: it replaces the otherwise meaningless Proxy-Connection with Connection before sending the request to the server, which produces the expected result.
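To make the workaround concrete, here is a toy model of the three parties. All three function names are hypothetical, headers are reduced to a plain dict, and real proxies and servers do far more than this; the sketch only shows which header reaches the server and what the server answers.

```python
from typing import Dict


def old_proxy_forward(headers: Dict[str, str]) -> Dict[str, str]:
    """Old proxy: forwards every header unchanged, including ones it should drop."""
    return dict(headers)


def new_proxy_forward(headers: Dict[str, str]) -> Dict[str, str]:
    """Newer proxy: understands Proxy-Connection and sends Connection to the server."""
    forwarded = {}
    for name, value in headers.items():
        if name.lower() == "proxy-connection":
            forwarded["Connection"] = value
        elif name.lower() != "connection":  # simplification: drop incoming Connection
            forwarded[name] = value
    return forwarded


def server_reply(headers: Dict[str, str]) -> Dict[str, str]:
    """Server that honours Keep-Alive and ignores headers it does not know."""
    for name, value in headers.items():
        if name.lower() == "connection" and value.lower() == "keep-alive":
            return {"Connection": "Keep-Alive"}
    return {}


if __name__ == "__main__":
    # Old proxy + Connection: the server agrees to keep-alive (the failure case).
    print(server_reply(old_proxy_forward({"Connection": "Keep-Alive"})))
    # Old proxy + Proxy-Connection: the server ignores it, no keep-alive.
    print(server_reply(old_proxy_forward({"Proxy-Connection": "Keep-Alive"})))
    # New proxy + Proxy-Connection: rewritten to Connection, keep-alive works.
    print(server_reply(new_proxy_forward({"Proxy-Connection": "Keep-Alive"})))
```

The first call reproduces the situation that leaves the browser hanging; the second shows why sending Proxy-Connection is harmless through an old proxy.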
Obviously, this workaround does not help if the browser is unaware of an old proxy on the path, or if a new proxy sits on either side of an old one. For that reason, some servers choose to ignore the HTTP/1.0 Keep-Alive feature entirely: for HTTP/1.0 requests they never use persistent connections and never return Keep-Alive.
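Such a conservative server policy might look like the sketch below. The function name conservative_policy and its return shape (whether to keep the connection open, and which Connection value to send back, if any) are assumptions made for illustration.

```python
from typing import Optional, Tuple


def conservative_policy(version: str, connection: str = "") -> Tuple[bool, Optional[str]]:
    """Return (keep_open, response_connection_value) for a request."""
    tokens = {t.strip().lower() for t in connection.split(",") if t.strip()}
    if version == "HTTP/1.0":
        # Ignore Keep-Alive entirely: close after the response, echo nothing.
        return False, None
    if "close" in tokens:
        return False, "close"
    return True, None  # HTTP/1.1 default: persistent, no header needed


if __name__ == "__main__":
    print(conservative_policy("HTTP/1.0", "Keep-Alive"))
    print(conservative_policy("HTTP/1.1"))
    print(conservative_policy("HTTP/1.1", "close"))
```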
Closing remarks
As the above shows, browsers modify the request headers sent to a proxy in order to stay compatible with as many non-compliant intermediaries on the network as possible, which makes the network more robust.
Finally, if you view the same request in both Fiddler and other tools, you will find that Fiddler shows Connection while the other tools show Proxy-Connection. This is because Fiddler, in most cases, changes Proxy-Connection back to Connection, which accounts for the difference in what is displayed.