Learning HTTP for the Front End: Web Hosting

Last Update:2016-12-19 Source: Internet

Author: User

Foreword

The responsibility for storing, coordinating, and managing content resources is collectively called web hosting. Hosting is one of the main functions of a web server: without a server, you cannot store and serve content, log access to it, or manage it. If you do not want to manage the hardware and software a server requires, you need a hosting service, or hoster. This article describes web hosting in detail.

Hosting Services

In the early days of the World Wide Web, each organization bought its own computer hardware, built its own computer room, arranged its own network connection, and managed its own web server software. As the web quickly became mainstream, everyone wanted a website, but few people had the skills or time to build an air-conditioned server room, register domain names, or buy network bandwidth. To meet this demand, many new businesses emerged that offer professionally managed web hosting services. Service levels range from physical facilities management (providing space, air conditioning, and cabling) to full-service web hosting, where the customer only has to provide the content.

The rest of this article focuses on what a hosting web server provides. Much of what makes a website work (for example, its ability to support different languages or to carry out secure e-commerce transactions) depends on the capabilities of the hosting web server.

Suppose Joe's Hardware store and Mary's Antique Auction both need websites. Irene's ISP (Internet service provider) has racks and racks of identical high-performance web servers that it can rent to Joe and Mary, so that they do not have to buy their own servers and maintain the server software.

Joe and Mary both sign up for the dedicated web hosting service offered by Irene's ISP. Joe leases a dedicated web server that Irene's ISP purchases and maintains. Mary leases a separate dedicated server from the same provider. Irene's ISP buys server hardware in volume and chooses hardware that is reliable and relatively inexpensive. If either Joe's or Mary's website becomes more popular, Irene's ISP can immediately offer Joe or Mary more servers.

In this example, browsers send HTTP requests for www.joes-hardware.com to the IP address of Joe's server, and requests for www.marys-antiques.com to the (different) IP address of Mary's server.

Virtual Hosting

Many people want a presence on the web, but their websites do not get much traffic. For them, a dedicated web server can be wasteful, because the server they rent for hundreds of dollars a month sits mostly idle.

Many web hosters offer lower-cost hosting by having several customers share one computer. This is called shared hosting or virtual hosting. Each website appears to be hosted on its own server, but all of them are actually hosted on the same physical server. From the end user's perspective, a virtually hosted website should be indistinguishable from one hosted on a dedicated server.

For reasons of cost, space, and management, a virtual hosting company wants to host tens, hundreds, or even thousands of websites on the same server, but that does not necessarily mean that thousands of sites are served from a single PC. A hoster can build banks of identical servers, called server farms, and spread the load across the servers in the farm. Because every server in the farm is identical and hosts many virtual websites, administration is much easier.

When Joe and Mary start their businesses, they might choose virtual hosting to save money until the traffic on their sites grows to a level that justifies a dedicated server.

"Host Information"

Unfortunately, a design flaw in HTTP/1.0 can drive virtual hosters mad. The HTTP/1.0 specification gives a shared web server no means of identifying which of its hosted websites is being accessed.

Recall that an HTTP/1.0 request sends only the path component of the URL in the request message. If you try to access http://www.joes-hardware.com/index.html, the browser connects to the server www.joes-hardware.com, but the HTTP/1.0 request says only "GET /index.html", with no mention of the hostname. If the server virtually hosts multiple sites, that is not enough information to tell which virtual website is being accessed.

If client A tries to access http://www.joes-hardware.com/index.html, the request "GET /index.html" is sent to the shared web server.

If client B tries to access http://www.marys-antiques.com/index.html, the same request "GET /index.html" is also sent to the shared web server.

As far as the web server is concerned, there is not enough information to determine which website is being accessed. Although the two requests are for completely different documents (from different websites), they look identical, because the host information has been stripped from the request.
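The collision can be reproduced in a few lines of Python. This is a minimal sketch, not a real client: the helper name `http10_request` is hypothetical, and it assumes an HTTP/1.0 client that sends only the path and no Host header.

```python
from urllib.parse import urlsplit


def http10_request(url: str) -> bytes:
    """Build the bytes a bare HTTP/1.0 client would send for `url`.

    Only the path survives; the hostname is used to open the TCP
    connection and never appears in the message itself.
    """
    path = urlsplit(url).path or "/"
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


request_a = http10_request("http://www.joes-hardware.com/index.html")
request_b = http10_request("http://www.marys-antiques.com/index.html")

# Both clients put exactly the same bytes on the wire.
print(request_a == request_b)  # True
```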

Note: HTTP surrogates (reverse proxies) and intercepting proxies also need explicit site information.

The missing host information was an oversight in the original HTTP specification, which mistakenly assumed that each web server would host exactly one website. HTTP's designers made no provision for shared, virtually hosted servers. For this reason, the hostname in the URL was treated as redundant and stripped away; only the path component had to be sent.

Because the early specifications did not take virtual hosting into account, web hosters needed to develop workarounds and conventions to support shared virtual hosting. The problem could have been solved simply by requiring all HTTP request messages to send the full URL instead of just the path component. HTTP/1.1 does require servers to handle full URLs on the request line of HTTP messages, but it will take a long time for existing applications to be upgraded to this specification. In the meantime, the following four techniques have emerged.

1. Virtual hosting by URL path

You can separate the virtual sites on a shared server by giving each one a different URL path, although this is a clumsy approach. For example, you could give each logical website its own path prefix:

Joe's Hardware store could be http://www.joes-hardware.com/joe/index.html

Mary's Antique Auction could be http://www.marys-antiques.com/mary/index.html

When the requests arrive at the server, the hostname information is missing, but the server can tell the sites apart by path:

The request for Joe's Hardware store is "GET /joe/index.html"

The request for Mary's Antique Auction is "GET /mary/index.html"

This is not a good solution. The /joe and /mary prefixes are redundant (the hostname already mentions Joe).

Worse, the common conventions for home page links, such as http://www.joes-hardware.com or http://www.joes-hardware.com/index.html, will not work.

In summary, virtual hosting by URL path is a poor solution and is rarely used.
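A small sketch makes the weakness concrete. The table `SITE_PREFIXES` and the function `site_for_path` are hypothetical names chosen for illustration; the point is only that a conventional home page path carries no prefix and therefore matches no site.

```python
from typing import Optional

# Hypothetical mapping from path prefix to logical site.
SITE_PREFIXES = {
    "/joe/": "Joe's Hardware",
    "/mary/": "Mary's Antique Auction",
}


def site_for_path(path: str) -> Optional[str]:
    """Return the site that owns `path`, or None if no prefix matches."""
    for prefix, site in SITE_PREFIXES.items():
        if path.startswith(prefix):
            return site
    return None


print(site_for_path("/joe/index.html"))  # Joe's Hardware
print(site_for_path("/index.html"))      # None
```

The second call is what a user gets after typing the usual home page URL: the server cannot tell whose /index.html is wanted.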

2. Virtual hosting by port number

Instead of changing the path, you can assign Joe's and Mary's sites different port numbers on the web server. Rather than port 80, for example, Joe's site could use port 82 and Mary's site port 83. This solution has the same kind of problem: end users will not be happy about having to specify nonstandard port numbers in their URLs.

3. Virtual hosting by IP address

Assign a dedicated IP address to each virtual site and bind all of the addresses to a single machine. The web server can then identify the site by IP address.

A much better and more commonly used approach is virtual IP addressing. Each virtual website gets one or more unique IP addresses, and the IP addresses of all the virtual sites are bound to the same shared server. The server can look up the destination IP address of the HTTP connection and use it to determine which website the client wants.

Suppose the hoster assigns the IP address 209.172.34.3 to www.joes-hardware.com and 209.172.34.4 to www.marys-antiques.com, and binds both IP addresses to the same physical server. The web server can then use the destination IP address to identify which virtual site the user is requesting.

Client A fetches http://www.joes-hardware.com/index.html. Client A looks up the IP address of www.joes-hardware.com and gets 209.172.34.3; client A opens a TCP connection to the shared server at destination address 209.172.34.3; client A sends the request "GET /index.html HTTP/1.0". Before serving a response, the web server notes the actual destination IP address (209.172.34.3), determines that it is the virtual IP address of Joe's Hardware website, and fulfills the request from the /joe subdirectory. The file returned is /joe/index.html.

Similarly, suppose client B requests http://www.marys-antiques.com/index.html. Client B looks up the IP address of www.marys-antiques.com and gets 209.172.34.4; client B opens a TCP connection to the web server at destination address 209.172.34.4; client B sends the request "GET /index.html HTTP/1.0". The web server determines that 209.172.34.4 is Mary's website and fulfills the request from the /mary directory, returning the file /mary/index.html.
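The lookup the server performs in these two walkthroughs can be sketched as a table keyed by destination IP address. The names `VIRTUAL_IPS` and `resolve_by_ip` are hypothetical; in a real server the destination address would typically come from calling `getsockname()` on the accepted connection, which this sketch takes as a plain argument instead.

```python
import posixpath

# Destination IP address -> document subdirectory, using the article's addresses.
VIRTUAL_IPS = {
    "209.172.34.3": "/joe",   # www.joes-hardware.com
    "209.172.34.4": "/mary",  # www.marys-antiques.com
}


def resolve_by_ip(dest_ip: str, request_path: str) -> str:
    """Map (destination IP, request path) to the file that should be served."""
    docroot = VIRTUAL_IPS[dest_ip]
    return posixpath.join(docroot, request_path.lstrip("/"))


# The same "GET /index.html" yields different files on different addresses.
print(resolve_by_ip("209.172.34.3", "/index.html"))  # /joe/index.html
print(resolve_by_ip("209.172.34.4", "/index.html"))  # /mary/index.html
```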

Virtual IP hosting works, but it causes some difficulties, especially for large hosters:

a. Computer systems usually limit how many virtual IP addresses can be bound to one machine. Hosters that want hundreds or thousands of virtual sites on a shared server may be out of luck.

b. IP addresses are a scarce resource. A hoster with many virtual sites may not be able to obtain enough IP addresses for the hosted websites.

c. The IP address shortage gets worse when hosters replicate their servers for additional capacity. Depending on the load-balancing architecture, different virtual IP addresses may be needed on each replicated server, so the demand for IP addresses can multiply with the number of replicated servers.

Despite its address consumption problem, virtual IP hosting is still widely used.

4. Virtual hosting by Host header

To avoid excessive address consumption and virtual IP limits, we would like virtual sites to share the same IP address and still be distinguishable. But as we have seen, because most browsers send only the path component of the URL to the server, the critical virtual hostname information is lost.

To solve this problem, browser and server implementers extended HTTP to pass the original hostname to the server. Browsers could not simply send the full URL, however, because that would break the many servers that expect to receive only a path. Instead, the hostname (and port) is carried in a Host extension header on every request.

Client A and client B both send a Host header carrying the original hostname being accessed. When the server receives a request for /index.html, it can use the Host header to decide which resources to use.

The Host header was first introduced with HTTP/1.0+, a vendor-extended superset of HTTP/1.0. Support for the Host header is required for HTTP/1.1 compliance. Most modern browsers and servers support it, but a few clients and servers (and robots) still do not.
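How a shared server might use the Host header can be sketched as follows. This is an illustrative toy, not a production parser: `VIRTUAL_HOSTS`, `parse_request`, and `resolve_by_host` are hypothetical names, and the /joe and /mary directories are borrowed from the earlier examples.

```python
from typing import Dict, Optional, Tuple

# Hostname -> document subdirectory; both sites share one IP address.
VIRTUAL_HOSTS = {
    "www.joes-hardware.com": "/joe",
    "www.marys-antiques.com": "/mary",
}


def parse_request(raw: bytes) -> Tuple[str, str, Dict[str, str]]:
    """Split a raw request into method, target, and lowercase-keyed headers."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, headers


def resolve_by_host(raw: bytes) -> Optional[str]:
    """Return the file to serve, or None if the Host header names no known site."""
    _method, target, headers = parse_request(raw)
    hostname = headers.get("host", "").split(":")[0].lower()  # drop any port
    docroot = VIRTUAL_HOSTS.get(hostname)
    return None if docroot is None else docroot + target


request_a = b"GET /index.html HTTP/1.0\r\nHost: www.joes-hardware.com\r\n\r\n"
request_b = b"GET /index.html HTTP/1.0\r\nHost: www.marys-antiques.com\r\n\r\n"
print(resolve_by_host(request_a))  # /joe/index.html
print(resolve_by_host(request_b))  # /mary/index.html
```

The two requests carry the same path, but the Host header now tells them apart without spending an extra IP address.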

Host Header

The Host header is an HTTP/1.1 request header, defined in RFC 2068. Because virtual servers are so widespread, most HTTP clients, even those that are not HTTP/1.1 compliant, implement the Host header.

The Host header specifies the Internet host and port number of the requested resource, as obtained from the original URL:

Host = "Host" ":" host [ ":" port ]

Note the following rules:

If the Host header does not contain a port, the default port for the URL scheme is assumed.

If the URL contains an IP address, the Host header should contain the same address.

If the URL contains a hostname, the Host header must contain the same name.

If the URL contains a hostname, the Host header should not contain the IP address equivalent to that hostname, because this breaks virtual hosting servers, which layer many virtual sites over a single IP address.

If the URL contains a hostname, the Host header should not contain another alias for that hostname, because this also breaks virtual hosting servers.

If the client explicitly uses a proxy server, the client must put the name and port of the origin server, not the proxy server, in the Host header. In the past, several web clients mistakenly set the outgoing Host header to the proxy's hostname when proxy settings were enabled. This incorrect behavior causes proxies and origin servers to mishandle requests.

Web clients must include a Host header field in all request messages.

Web proxies must add a Host header to request messages before forwarding them.

HTTP/1.1 web servers must respond with a 400 status code to any HTTP/1.1 request message that lacks a Host header field.
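The default-port rule is easy to get wrong, so here is a small sketch of it. The function name `parse_host_header` and the `DEFAULT_PORTS` table are assumptions made for illustration, and the sketch deliberately ignores IPv6 literals such as `[::1]:8080`, which need bracket-aware parsing.

```python
from typing import Tuple

# Default ports per URL scheme (well-known values for http and https).
DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_host_header(value: str, scheme: str = "http") -> Tuple[str, int]:
    """Split a Host header value into (hostname, port).

    When the header carries no port, fall back to the scheme's default.
    """
    host, sep, port = value.strip().partition(":")
    if sep and port:
        return host.lower(), int(port)
    return host.lower(), DEFAULT_PORTS[scheme]


print(parse_host_header("www.joes-hardware.com"))       # ('www.joes-hardware.com', 80)
print(parse_host_header("www.joes-hardware.com:8080"))  # ('www.joes-hardware.com', 8080)
```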

A simple HTTP request message that fetches the home page of www.joes-hardware.com must include the required Host header field.
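As a minimal sketch of such a message, the following Python helper assembles the request bytes. The helper name `build_get_request` is hypothetical, and the HTTP/1.1 version string and the `Connection: close` header are assumptions added for illustration; only the request line and the Host header are essential to the point being made.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Assemble a GET request that carries the Host header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",        # names the virtual site being addressed
        "Connection: close",    # assumption: one request per connection
        "",                     # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("www.joes-hardware.com", "/index.html").decode("ascii"))
```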

A small number of old browsers still in use do not send Host headers. If a virtual hosting server relies on the Host header to determine which website to serve, and no Host header is present, it may either direct the user to a default web page (such as the ISP's own website) or return an error page suggesting that the user upgrade the browser.

Origin servers that are not virtually hosted, and do not allow resources to differ by requested host, may ignore the Host header field value. But any origin server that does differentiate resources by hostname must use the following rules to determine the requested resource on an HTTP/1.1 request:

1. If the URL in the HTTP request message is absolute (that is, it contains a scheme and a host component), the value of the Host header is ignored in favor of the URL.

2. If the URL in the HTTP request message has no host component, and the request contains a Host header, the host/port value is taken from the Host header.

3. If no valid host can be determined through step 1 or step 2, a 400 Bad Request response is returned to the client.
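The three rules translate almost directly into code. The sketch below is one possible reading, not a reference implementation: `effective_host` is a hypothetical name, headers are assumed to be stored in a dict with lowercase keys, and a return value of None stands for "answer with 400 Bad Request".

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def effective_host(request_target: str, headers: Dict[str, str]) -> Optional[str]:
    """Apply the three rules; None means the caller should send a 400."""
    parts = urlsplit(request_target)
    # Rule 1: an absolute URL wins, and the Host header is ignored.
    if parts.scheme and parts.netloc:
        return parts.netloc.lower()
    # Rule 2: otherwise take the host (and port) from the Host header.
    host = headers.get("host", "").strip()
    if host:
        return host.lower()
    # Rule 3: no valid host could be determined.
    return None


print(effective_host("http://www.joes-hardware.com/index.html",
                     {"host": "www.marys-antiques.com"}))  # www.joes-hardware.com
print(effective_host("/index.html", {"host": "www.marys-antiques.com"}))  # www.marys-antiques.com
print(effective_host("/index.html", {}))  # None
```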

Some browser versions send incorrect Host headers, especially when configured to use proxies. For example, when configured to use a proxy, some older versions of Apple and PointCast clients mistakenly sent the name of the proxy, rather than the name of the origin server, in the Host header.

Website Operation

Websites commonly become unavailable in the following situations: server downtime; traffic spikes, when suddenly everyone wants to watch a special news broadcast or rush to a big sale, which can overload a web server, slow it down, or bring it to a complete halt; and network outages or losses.

Here are some ways to anticipate and deal with these common problems.

"Mirrored Server Cluster"

A server cluster is a bank of identically configured web servers that can stand in for one another. The content on each server can be replicated by mirroring, so that if one server has a problem, another can take over.

Mirrored servers often follow a hierarchical relationship. One server may act as the "content authority": the server that holds the original content (perhaps the server to which content authors upload). This server is called the primary origin server. The mirrored servers that receive content from the primary origin server are called replica origin servers. A simple way to deploy a server cluster is to use a network switch to distribute requests to the servers. The IP address of each website hosted on the servers is set to the IP address of the switch.

In a mirrored server cluster, the primary origin server is responsible for sending content to the replica origin servers. To the outside world, the IP address of the content is the IP address of the switch, and the switch is responsible for sending requests to the servers.

Mirrored web servers can also hold copies of the same content in different locations. Consider four mirrored servers, with the primary server in Chicago and replica servers in New York, Miami, and Little Rock. The primary server serves clients in the Chicago area and is also responsible for propagating content to the replica servers.

In this scenario, there are two ways to direct client requests to a particular server:

1. HTTP redirection

The URL for the content resolves to the IP address of the primary server, which then sends a redirect to a replica server.

2. DNS redirection

The URL for the content resolves to four IP addresses, and the DNS server chooses which IP address to send to the client.
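Both redirection styles can be sketched in a few lines. Everything named here is an assumption for illustration: the addresses come from the 192.0.2.0/24 documentation range, `mirror1.example.com` is a placeholder hostname, and simple rotation is only one of many policies a real DNS server or primary server could apply (it might instead pick the replica nearest the client).

```python
import itertools
from typing import Iterator, List

# Placeholder addresses for the four mirrored servers.
MIRROR_IPS = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]


def round_robin(addresses: List[str]) -> Iterator[str]:
    """DNS redirection: hand out one of several addresses per lookup, in rotation."""
    return itertools.cycle(addresses)


def http_redirect(replica_host: str, path: str) -> bytes:
    """HTTP redirection: the primary server answers with a 302 pointing at a replica."""
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: http://{replica_host}{path}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    ).encode("ascii")
```

For example, `next(round_robin(MIRROR_IPS))` yields the first address, and `http_redirect("mirror1.example.com", "/index.html")` produces a response whose Location header sends the client to the replica. The DNS approach costs the client no extra HTTP round trip, whereas the HTTP approach lets the primary server decide per request.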

"Content distribution Network (CDN)"

Simply put, a content distribution network (CDN) is a network whose purpose is to distribute specific content. The nodes of the network can be web servers, reverse proxies, or caches.

1. Reverse Proxy Cache in CDN

Replica origin servers can be replaced by reverse proxy (also called surrogate) caches. Like mirrored servers, reverse proxy caches can accept server requests. They receive server requests on behalf of a specific set of origin servers. Depending on how the IP address of the content is advertised, there is usually a working relationship between the origin server and the reverse proxy cache, so that requests for a particular origin server are received by the reverse proxy cache.

The difference between a reverse proxy and a mirrored server is that reverse proxies are typically demand-driven. They do not store a complete copy of the origin server's content; they store only the content that clients request. How content is distributed across their caches depends on the requests they receive, and the origin server is not responsible for updating their content. To make "hot" content (content with a high request rate) easier to reach, some reverse proxies have a prefetching feature that lets them pull content from the server before users request it.
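The demand-driven behavior, and the optional prefetch, can be sketched with a small in-memory class. This is a toy under stated assumptions: `ReverseProxyCache` and its methods are hypothetical names, the origin is represented by any callable that returns bytes for a path, and there is no expiry or revalidation, which a real cache would need.

```python
from typing import Callable, Dict, Iterable


class ReverseProxyCache:
    """Stores only what clients ask for, plus anything explicitly prefetched."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_fetches = 0  # counts trips to the origin server

    def _load(self, path: str) -> None:
        self.origin_fetches += 1
        self._store[path] = self._fetch(path)

    def get(self, path: str) -> bytes:
        """Serve from the cache, going to the origin only on a miss."""
        if path not in self._store:
            self._load(path)
        return self._store[path]

    def prefetch(self, paths: Iterable[str]) -> None:
        """Pull hot content from the origin before any user requests it."""
        for path in paths:
            if path not in self._store:
                self._load(path)


# Usage: a dict stands in for the origin server.
origin = {"/index.html": b"home", "/sale.html": b"sale"}
cache = ReverseProxyCache(origin.__getitem__)
cache.get("/index.html")
cache.get("/index.html")        # second request is a cache hit
cache.prefetch(["/sale.html"])  # loaded before anyone asks
print(cache.origin_fetches)     # 2
```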

Using reverse proxies inside a CDN can add complexity, because the proxies may be arranged in hierarchies.

2. Proxy Cache in CDN

Unlike reverse proxies, traditional proxy caches can receive requests addressed to any web server. There need not be any working relationship or IP address agreement between the proxy cache and the origin server. As with reverse proxies, however, proxy cache content is typically demand-driven and cannot be expected to be an exact copy of the origin server's content. Some proxy caches can also be preloaded with hot content.

Demand-driven proxy caches can also be deployed in other environments, in particular interception environments, where a layer 2 or layer 3 device (a switch or router) intercepts web traffic and sends it to the proxy cache.

An interception environment depends on being able to set up the network between clients and servers so that all appropriate HTTP requests are physically channeled to the cache. Content is distributed in the cache according to the requests it receives.

"Accelerated Access"

Many of the technologies mentioned above also help websites load faster. Server clusters and distributed proxy caches or reverse proxy servers spread network traffic to avoid congestion. Distributing content brings it closer to end users, so the transfer time from server to client is shorter. The most important factor in resource access speed is how requests and responses travel across the Internet between client and server.

Another way to speed up site access is to encode content for faster transmission. For example, content can be compressed, provided that the receiving client can decompress it.
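The "only if the client can decompress it" condition is usually checked through the request's Accept-Encoding header. The sketch below assumes gzip as the encoding and uses a hypothetical function name, `encode_body`; it also ignores quality values, so a header such as `gzip;q=0` (an explicit refusal) would be mishandled by this simplified check.

```python
import gzip
from typing import Dict, Tuple


def encode_body(body: bytes, accept_encoding: str) -> Tuple[bytes, Dict[str, str]]:
    """Compress the body only when the client says it accepts gzip."""
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}  # client cannot decompress: send the content as-is


page = b"<p>Hammers on sale!</p>" * 200
compressed, extra_headers = encode_body(page, "gzip, deflate")
print(len(compressed) < len(page))  # True
print(extra_headers)                # {'Content-Encoding': 'gzip'}
```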

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
