Preface
A single URL often needs to represent several different resources. Consider a site that needs to provide its content in multiple languages. If a site has both French-speaking and English-speaking users, it may want to provide its information in both languages. Ideally, the server sends an English version to English-speaking users and a French version to French-speaking users, so that users get content in the right language just by visiting the site's home page.
HTTP provides content negotiation methods that allow the client and server to make such decisions. With these methods, a single URL can represent different resources (for example, the French and English versions of the same web page), which are called variants. This article describes content negotiation in detail.
Overview
For a particular URL, the server can decide which content to send to the client based on a few principles. In some cases, the server can even generate customized pages automatically. For example, the server can convert HTML pages to WML pages for handheld devices. This kind of dynamic content transformation is called transcoding. These transformations are the result of content negotiation between the HTTP client and the server.
There are three ways to determine which page on the server is best for the client: let the client choose, let the server decide automatically, or let an intermediary proxy choose. These three techniques are called client-driven negotiation, server-driven negotiation, and transparent negotiation, respectively.
Client-driven negotiation
The easiest thing for a server to do when it receives a client request is to send back a response that lists the available pages and let the client decide which one to view. This is clearly the easiest approach for the server to implement, and the client is likely to pick the best version (as long as the list contains enough information to choose from). The disadvantage is that each page requires two requests: one to get the list and a second to get the selected copy. This makes the technique slow and tedious, which annoys users.
In practice, the server has two ways to present the choices to the client: it can send back an HTML document with links to the different versions of the page and a description of each version, or it can send back an HTTP/1.1 response with the 300 Multiple Choices response code. When the client browser receives the response, in the former case it displays a page with the links; in the latter case it may pop up a dialog window asking the user to make a selection. Either way, the decision is made on the client side by the browser user.
Besides the added latency and the tedium of multiple requests per page, this approach has another drawback: it requires multiple URLs, one for the main page and one for each specific page. For example, if the original request is for www.joes-hardware.com, Joe's server may reply with a page that contains links to www.joes-hardware.com/english and www.joes-hardware.com/french. If the client wants to bookmark the site, should the bookmark point to the main page or to the selected page? If users want to recommend the site to friends, should they pass along www.joes-hardware.com, or tell only their English-speaking friends about www.joes-hardware.com/english?
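As a rough illustration of the second option, the sketch below builds a raw 300 Multiple Choices response whose body links to each variant. The function name, the variant list, and the HTML layout are assumptions made for this example; a real server would build the response with its own framework.

```python
# Hypothetical variant list for Joe's home page; the paths mirror the example above.
VARIANTS = [
    ("/english", "English version"),
    ("/french", "French version"),
]


def multiple_choices_response(variants):
    """Build a raw HTTP/1.1 300 response whose body links to each variant."""
    items = "".join(
        f'<li><a href="{path}">{label}</a></li>' for path, label in variants
    )
    body = f"<html><body><ul>{items}</ul></body></html>"
    headers = [
        "HTTP/1.1 300 Multiple Choices",
        "Content-Type: text/html; charset=utf-8",
        f"Content-Length: {len(body.encode('utf-8'))}",
    ]
    # A blank line separates the header block from the body.
    return "\r\n".join(headers) + "\r\n\r\n" + body


response = multiple_choices_response(VARIANTS)
print(response.splitlines()[0])
```

The client still has to issue a second request for whichever link the user picks, which is the extra round trip described above.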
Server-driven negotiation
One way to reduce the extra traffic is to have the server decide which page to send back. To do this, the client must send enough information about its preferences for the server to make an accurate decision. The server obtains this information from the client's request headers.
The HTTP server can use the following two mechanisms to evaluate which response is most appropriate to send to the client:
1. Examine the content negotiation headers. The server looks at the Accept headers sent by the client and tries to match them with the corresponding response headers.
2. Vary the response according to other (non-content-negotiation) headers. For example, the server can send a response based on the User-Agent header sent by the client.
"Content negotiation headers"
The client can send the user's preference information using the HTTP headers listed below:
Header            Description
Accept            Tells the server what media types to send
Accept-Language   Tells the server what languages to send
Accept-Charset    Tells the server what character sets to send
Accept-Encoding   Tells the server what encodings to use
[note] These headers look very similar to the entity headers, but the two sets serve very different purposes. Entity headers are like shipping labels: they describe the properties of the message body that are needed to transfer the message from the server to the client. Content negotiation headers are sent by the client to the server to exchange preference information, so that the server can choose, from the different versions of a document, the one that best matches the client's preferences.
The server matches the client's Accept headers against the entity headers listed below:
Accept header     Entity header
Accept            Content-Type
Accept-Language   Content-Language
Accept-Charset    Content-Type
Accept-Encoding   Content-Encoding
Because HTTP is a stateless protocol, the server does not track client preferences across requests, so the client must send its preference information with every request.
If two clients each send an Accept-Language header describing the language they are interested in, the server can decide which version of www.joes-hardware.com to send to each client. Letting the server select the document automatically avoids the extra round trip that is unavoidable in the client-driven model.
However, suppose a client prefers Spanish. Which version of the page should the server send back, English or French? The server has only two options: guess, or fall back to the client-driven model and ask the client to choose. If this Spaniard happens to know a little English, he may choose the English page, which is not ideal but solves the problem. In this case, the Spaniard needs a way to convey more about his preferences: that he does know some English and that, in the absence of Spanish, English is acceptable.
Fortunately, HTTP provides a mechanism that lets clients in a situation like the Spaniard's describe their preferences in more detail. This mechanism is the quality value (q value).
Quality values are defined in the HTTP protocol. They allow the client to list multiple options for each preference category and to associate a priority with each option. For example, the client can send an Accept-Language header of the following form:
Accept-Language: en;q=0.5, fr;q=0.0, nl;q=1.0, tr;q=0.0
Q values range from 0.0 to 1.0 (0.0 is the lowest priority and 1.0 the highest). The header above indicates that the client most prefers to receive a Dutch (nl) document, but that an English (en) document is acceptable. Under no circumstances does the client want a French (fr) or Turkish (tr) version.
[note] The order in which the preferences are listed is not important; only the q value associated with each preference matters.
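The q-value rules above can be sketched in a few lines of Python. This is a simplified illustration, not a complete implementation: the function names are hypothetical, and wildcard ranges (*) and prefix matching between tags such as en and en-US are deliberately ignored.

```python
def parse_accept_language(header):
    """Return (language, q) pairs sorted from most to least preferred."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        lang = parts[0].lower()
        if not lang:
            continue
        q = 1.0  # a missing q parameter means the highest priority
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q value as "not acceptable"
        prefs.append((lang, q))
    # sorted() is stable, so entries with equal q values keep the order they were sent in
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_language(header, available):
    """Pick the available language with the highest non-zero q value, or None."""
    for lang, q in parse_accept_language(header):
        if q > 0 and lang in available:
            return lang
    return None


header = "en; q=0.5, fr; q=0.0 , NL; q=1.0, tr; q=0.0"
print(choose_language(header, {"en", "fr"}))
```

With only English and French pages available, the sketch selects English for this header, because Dutch is not on offer and French has a q value of 0.0.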
Occasionally the server has no document that matches any of the client's preferences. In this case, the server can modify the document, that is, transcode it, to match the client's preferences.
"Other headers"
The server can also match responses against other client request headers, such as the User-Agent header. For example, if the server knows that older versions of a browser do not support JavaScript, it can send back a version of the page that does not contain JavaScript.
In this case there is no q-value mechanism for finding the closest match. The server either looks for an exact match or simply sends whatever it has, depending on the server implementation.
Because a cache must try to serve the correct best version of a cached document, the HTTP protocol defines a Vary header that the server sends in its responses. The Vary header tells the cache, the client, and all downstream proxies which headers the server used to determine the best version of the response to send.
"Apache"
Here is a summary of how Apache, the well-known web server, supports content negotiation. The site's content provider, Joe in this example, is responsible for providing the different versions of Joe's index page. Joe must also place these index page files in the appropriate directory of the Apache server that hosts the site. Content negotiation can be enabled in either of the following two ways:
1. In the site directory, create a type-map file for each URI on the site that has variants. The type-map file lists each variant and the content negotiation headers associated with it.
2. Enable MultiViews, which causes Apache to create type-map files for the directory automatically.
"Using Type-map Files"
The Apache server needs to know what type-map files look like. You can set a handler in the server configuration file that specifies the file suffix for type-map files. For example:
AddHandler type-map .var
This line indicates that files with the .var suffix are type-map files.
A type-map file for Joe's index page lists the variants joes-hardware.en.html and joes-hardware.fr.de.html together with their content negotiation headers. From this type-map file, the Apache server knows to send joes-hardware.en.html to clients requesting English and joes-hardware.fr.de.html to clients requesting French. The Apache server also supports quality values.
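To make the idea concrete, here is a small Python sketch that reads a type-map and picks a variant by language. The type-map contents are an assumption modelled on the file names mentioned above, and the matching logic is a simplification for illustration, not Apache's actual negotiation algorithm.

```python
# Assumed type-map contents: one block per variant, separated by a blank line.
TYPE_MAP = """\
URI: joes-hardware.en.html
Content-type: text/html
Content-language: en

URI: joes-hardware.fr.de.html
Content-type: text/html
Content-language: fr, de
"""


def parse_type_map(text):
    """Split a type-map into one dict of lowercased header names per variant."""
    variants = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            record[name.strip().lower()] = value.strip()
        variants.append(record)
    return variants


def variant_for_language(variants, language):
    """Return the URI of the first variant that lists the requested language."""
    for record in variants:
        listed = record.get("content-language", "")
        languages = [tag.strip() for tag in listed.split(",")]
        if language in languages:
            return record["uri"]
    return None


variants = parse_type_map(TYPE_MAP)
print(variant_for_language(variants, "fr"))
```

A request for Spanish finds no matching variant in this sketch, which is the situation where a server has to guess or fall back to another strategy.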
"Using MultiViews"
To use MultiViews, you must enable it for the directory containing the site, using an Options directive in the appropriate section (<Directory>, <Location>, or <Files>) of the access.conf file.
If MultiViews is enabled and a browser requests a resource named joes-hardware, the server looks for all files with joes-hardware in their names and creates a type-map for them. The server guesses the content negotiation headers for each file from its name. For example, the French version of joes-hardware should contain .fr in its name.
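The name-based guessing can be pictured with the following sketch. The suffix table and function names are assumptions for illustration; a real server uses its own configurable mapping of file extensions to languages.

```python
# Assumed mapping of file-name parts to language tags.
LANGUAGE_SUFFIXES = {"en": "en", "fr": "fr", "de": "de"}


def guess_languages(filename):
    """Guess Content-Language values from the dot-separated parts of a file name."""
    parts = filename.split(".")[1:]  # skip the base name
    return [LANGUAGE_SUFFIXES[p] for p in parts if p in LANGUAGE_SUFFIXES]


def build_variants(base_name, filenames):
    """Map every file named base_name.<something> to its guessed languages."""
    return {
        name: guess_languages(name)
        for name in filenames
        if name.startswith(base_name + ".")
    }


files = ["joes-hardware.en.html", "joes-hardware.fr.de.html", "other.en.html"]
print(build_variants("joes-hardware", files))
```

The result plays the role of an in-memory type-map: each matching file is listed with the languages inferred from its name.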
Another way to implement content negotiation on the server is with server-side extensions, such as Microsoft's Active Server Pages (ASP).
Transparent negotiation
Transparent negotiation tries to move the load of server-driven negotiation off the server and to minimize message exchanges with the client by having an intermediary proxy negotiate on the client's behalf. The proxy is assumed to know what the client expects, so it can negotiate with the server for the client: by the time it handles the client's request for content, the proxy has already received the client's expectations.
To support transparent content negotiation, the server must be able to tell proxies which request headers it examines to determine the best match for the client's request. The HTTP/1.1 specification does not define any mechanism for transparent negotiation, but it does define the Vary header. The server sends the Vary header in its response to tell intermediaries which request headers it uses for content negotiation.
A caching proxy can store different copies of a document that is accessed through a single URL. If servers communicate their decision-making process to caches, these proxies can negotiate with clients on behalf of the servers. A cache is also a good place to transcode content, because a general-purpose transcoder deployed in a cache can transcode content from any server, not just one.
"Caching and alternates"
Caching assumes that content can be reused later. However, to make sure that the correct cached response is sent back for a client request, the cache must apply much of the same decision logic that the server uses when it sends back a response.
The earlier section on content negotiation headers describes the Accept headers sent by the client and the corresponding entity headers that the server matches them against to select the best response for each request. The cache must use the same headers to decide which cached response to send back.
Consider the correct and incorrect sequences of operations involving a cache. The cache forwards the first request to the server and stores the response. For the second request, the cache looks up a matching document by URL. However, that document is the French version, and the requestor wants the Spanish version. If the cache simply sends the French version to the requestor, it is behaving incorrectly.
Therefore, the cache should forward the second request to the server as well, and store both the response and an alternate response for that URL. The cache now holds two different documents for the same URL, just as the server does. These different versions are called variants or alternates. Content negotiation can be seen as the process of selecting the most appropriate variant for a client request.
"Vary Header"
In a typical exchange, the browser sends Accept headers in its request and the server answers with the corresponding entity headers in its response.
But what happens if the server's decision is based on a header other than the Accept headers, such as User-Agent? This is not as far-fetched as it sounds. For example, the server might know that older browsers do not support JavaScript and therefore send back a version of the page that does not contain JavaScript. If the server decides which page to send based on other headers, the cache must know what those headers are so that it can apply the same logic when choosing a cached page to send back.
The HTTP Vary response header lists all of the client request headers that the server considers when selecting a document or generating custom content (in addition to the regular content negotiation headers). For example, if the document served depends on the User-Agent header, the Vary header must include User-Agent.
When a new request arrives, the cache looks for the best match based on the content negotiation headers. Before it can serve the document to the client, however, it must check whether the server sent a Vary header in the cached response. If a Vary header is present, the values of the headers it names in the new request must match the values of the same headers in the old, cached request. Because the server may vary its response based on the client's request headers, to implement transparent negotiation the cache must store the client request headers and the corresponding server response headers with each cached variant.
If a server's Vary header looks like the following, the large number of different User-Agent and Cookie values will produce a very large number of variants:
Vary: User-Agent, Cookie
The cache must store the appropriate document version for each variant. When the cache performs a lookup, it first matches content using the content negotiation headers and then compares the request's variant headers against those of the cached variants. If nothing matches, the cache fetches the document from the origin server.
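The lookup described above might be sketched as follows. This toy cache is an illustration under simplifying assumptions: the class and method names are hypothetical, header values are compared as exact strings, and expiry and validation are left out.

```python
def lower_keys(headers):
    """Normalize header names, which are case-insensitive in HTTP."""
    return {name.lower(): value for name, value in headers.items()}


class VaryAwareCache:
    """Toy cache that keeps several variants per URL and honors the Vary header."""

    def __init__(self):
        self._store = {}  # url -> list of (request headers, response headers, body)

    def put(self, url, request_headers, response_headers, body):
        entry = (lower_keys(request_headers), lower_keys(response_headers), body)
        self._store.setdefault(url, []).append(entry)

    def get(self, url, request_headers):
        new_request = lower_keys(request_headers)
        for old_request, response_headers, body in self._store.get(url, []):
            vary = response_headers.get("vary", "")
            names = [n.strip().lower() for n in vary.split(",") if n.strip()]
            if "*" in names:
                continue  # "Vary: *" cannot be matched from request headers alone
            # Every header named in Vary must have the same value in both requests.
            if all(new_request.get(n) == old_request.get(n) for n in names):
                return body
        return None  # miss: the caller should fetch from the origin server


cache = VaryAwareCache()
cache.put("/", {"Accept-Language": "fr"}, {"Vary": "Accept-Language"}, "Bonjour")
print(cache.get("/", {"Accept-Language": "es"}))
```

The Spanish request misses even though the URL is cached, so the cache would go back to the origin server and then store the Spanish response as a second variant.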
Transcoding
So far we have discussed mechanisms that let clients and servers pick the most appropriate document for a client from a set of documents behind one URL. These mechanisms presuppose that some document exists that meets the client's needs, whether fully or only to some extent.
But what happens if the server has no document that meets the client's needs? The server can respond with an error. In theory, however, the server can instead convert an existing document into something the client can use. This option is called transcoding.
The following table lists some hypothetical transcodings:
Before                               After
HTML document                        WML document
High-resolution image                Low-resolution image
Color image                          Black-and-white image
Complex page with multiple frames    Simple text page without many frames or images
HTML page with Java applets          HTML page without Java applets
Page with ads                        Page with ads removed
There are three types of transcoding: format conversion, information synthesis, and content injection.
"Format Conversion"
Format conversion transforms data from one format into another so that the client can view it. With HTML-to-WML conversion, a wireless device can access documents that are normally viewed by desktop clients. Clients that access web pages over a slow connection do not need high-resolution images; if format conversion shrinks the image files by reducing their resolution and color depth, such clients can view image-rich pages more easily.
Format conversion can be driven by the content negotiation headers, but it can also be driven by the User-Agent header. Note that content transformation, or transcoding, is different from content encoding and transfer encoding: the latter two are typically used to transfer content more efficiently or securely, whereas the former makes content viewable on the accessing device.
"Information Synthesis"
Extracting the key pieces of information from a document is called information synthesis, and it is a useful transcoding operation. Examples include generating an outline of a document from its section headings, or removing ads and logos from a page.
Classifying pages by the keywords in their content is a more sophisticated technique that helps summarize the essence of a document. It is commonly used in web page classification systems, such as the web page directories of portal sites.
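The outline example can be sketched with Python's standard html.parser module. The class and function names are hypothetical, and the sketch assumes reasonably well-formed HTML in which headings are not nested inside one another.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 heading in an HTML document."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # heading level we are currently inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline(html):
    """Return the document outline as a list of (heading level, heading text)."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


page = "<h1>Tools</h1><p>Intro text</p><h2>Power <b>drills</b></h2>"
print(outline(page))
```

Everything outside the headings is discarded, which is what makes this a content-reducing transcoding.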
"Content injection"
The two categories of transcoding described so far typically reduce the amount of content in a web document, but another category increases it: content injection transcoding. Examples include automatic ad generators and user tracking systems.
Imagine how tempting (and how annoying) an ad-insertion transcoder is that automatically adds ads to every HTML page passing through it. This kind of transcoding can only be done dynamically: it must add ads on the fly that are relevant to the current users or targeted at a particular user. A user tracking system can be built the same way, adding content to pages dynamically to collect statistics about how pages are viewed and how clients browse the web.
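A minimal sketch of content injection is shown below: it inserts a snippet just before the closing body tag of a page. The function name and the snippet are hypothetical, and a production transcoder would use a real HTML parser instead of a regular expression.

```python
import re

# Matches a closing body tag in any letter case, e.g. </body> or </BODY>.
_BODY_END = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_before_body_end(html, snippet):
    """Insert snippet just before the last closing body tag, or append it if none exists."""
    matches = list(_BODY_END.finditer(html))
    if not matches:
        return html + snippet
    index = matches[-1].start()
    return html[:index] + snippet + html[index:]


page = "<html><body><p>Hammers</p></BODY></html>"
print(inject_before_body_end(page, "<div>ad</div>"))
```

Because the snippet is passed in per call, an intermediary could choose a different one for each user, which is why this kind of transcoding has to happen dynamically.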
"Transcoding vs. static pre-generation"
The alternative to transcoding is to build different copies of each web page on the web server: for example, one in HTML and one in WML, one with high-resolution images and one with low-resolution images, one with multimedia content and one without. However, this approach is not very practical, for several reasons: any small change to a page affects many pages, a lot of space is needed to store the different versions of each page, and cataloging the pages and programming the web server to serve the right version become harder. Some transcoding operations, such as ad insertion (especially targeted ad insertion), cannot be done statically at all, because which ad is inserted depends on the user requesting the page.
Converting a single root page on the fly is an easier solution than static pre-generation. It does, however, add latency when the content is served. Some of this computation can sometimes be done by a third party, which reduces the load on the web server: for example, the conversion can be done by an external agent at a proxy or cache.
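One way to picture on-the-fly conversion at a proxy is the sketch below, which converts the root page on first request and keeps the converted copy so that the latency is paid only once. The class, the stand-in fetch function, and the trivial lowercase transcoder are all assumptions for illustration.

```python
class TranscodingProxy:
    """Toy proxy that converts documents on demand and caches each converted copy."""

    def __init__(self, fetch, transcoders):
        self._fetch = fetch              # callable: url -> original document text
        self._transcoders = transcoders  # target format name -> conversion callable
        self._cache = {}                 # (url, target format) -> converted document

    def get(self, url, target):
        key = (url, target)
        if key not in self._cache:
            original = self._fetch(url)  # only the single root page lives on the server
            self._cache[key] = self._transcoders[target](original)
        return self._cache[key]


# Hypothetical origin fetcher that records how often the origin server is contacted.
fetch_calls = []


def fake_fetch(url):
    fetch_calls.append(url)
    return "<p>Hammers on SALE</p>"


proxy = TranscodingProxy(fake_fetch, {"lowercase": str.lower})
first = proxy.get("http://www.joes-hardware.com/", "lowercase")
second = proxy.get("http://www.joes-hardware.com/", "lowercase")
print(first, len(fetch_calls))
```

The second request is answered from the proxy's cache, so neither the origin server nor the transcoder does any further work for that variant.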
[Figure: transcoding at a proxy cache]
Frontend Learning HTTP Content Negotiation