Web server architecture

Source: Internet
Author: User

Basic concepts:

HTTP: Hypertext Transfer Protocol.

It not only lets computers transfer hypertext documents quickly, but also determines which part of a document is transmitted and which content is displayed first (for example, text before graphics).

HTTP is an application-level protocol built on requests and responses, following the standard client-server model. HTTP is a stateless protocol.

It is also the protocol that makes jumps (links) between documents possible.


Web: the system that made HTTP available to everyone. The first version in use was HTTP/0.9.

HTTP/0.9: supports only plain-text (hypertext) documents in ASCII.

HTML: HyperText Markup Language, the language for writing hypertext.

The client browser renders the markup in a fixed presentation format when the user views the document.

Browser: the client.


Resources:

For example:

1.1.1.1: this server provides a web page called a.html;

2.2.2.2: this server also provides a web page, also called a.html.

The document name alone therefore does not let a user tell different documents apart.


This led to the following:

URI: Uniform Resource Identifier, a way to reference a resource globally on the Internet.

Uniform: the path format is uniform, hence the name Uniform Resource Identifier. The URI has a subtype called the URL.

URL: Uniform Resource Locator.


protocol://host:port/path/to/file


http://www.zledu.com/download/linux.tar.gz breaks down into the protocol name, the host name, and the path.
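To make the protocol://host:port/path layout concrete, here is a minimal Python sketch that splits the example URL with the standard library. The helper name describe_url and the default-port table are assumptions added for illustration; they are not part of the original article.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the protocol, host, port and path parts."""
    parts = urlsplit(url)
    # Assumption for illustration: fall back to the conventional default
    # port when the URL does not name one explicitly.
    default_ports = {"http": 80, "https": 443}
    port = parts.port or default_ports.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
    }

info = describe_url("http://www.zledu.com/download/linux.tar.gz")
print(info)
```

Because the host is part of the identifier, http://1.1.1.1/a.html and http://2.2.2.2/a.html are different resources even though both documents are named a.html.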



Resources: Each picture is a resource;

Multiple resources are likely to be integrated into an HTML document;

Web objects are the same concept as Web resources.


Methods of resource access: HTTP methods. Resources from different hosts can be combined and displayed on the same page.

Since a page is composed of resources, how do we get those resources when we access the server? We can both send local data to the remote server and fetch remote data to display on the current page. Resources are therefore handled in different ways, which we call HTTP methods.


HTTP/0.9 has only one method, GET, which fetches a resource from the remote server to the local machine, where the browser displays it.

HTTP/1.0 added several more methods, including PUT, POST, and DELETE:

PUT: stores or replaces content on the server at the Request-URI;

POST: submits data to be processed to the specified resource, typically from a form;

GET: requests the resource identified by the Request-URI and fetches it from the server to the local machine;

DELETE: asks the server to delete the resource identified by the Request-URI.

HTTP/1.1 defines 8 methods; the ones above are the most commonly used.



MIME: Multipurpose Internet Mail Extensions. Non-text data is re-encoded into a text format before transmission; the receiver reverses the encoding to restore the original format and can then call the appropriate program to open the file.

SMTP: Simple Mail Transfer Protocol. Early mail transfer could carry only plain-text data, which is why MIME was needed.


HTTP adopted MIME, which gives the browser access to a wide variety of data types. The browser handles the different document types delivered this way with plug-ins, for example for PDF.
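The two MIME ideas above, labelling data with a type and re-encoding binary data as text, can be sketched with Python's standard library. This is only an illustration: the function names are made up here, and base64 is used as one common text encoding from the mail world. HTTP itself mainly borrows the type label and sends it in the Content-Type header.

```python
import base64
import mimetypes

def encode_for_transfer(filename, data):
    """Guess the MIME type from the file name and re-encode the bytes as ASCII text."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    text = base64.b64encode(data).decode("ascii")
    return mime_type, text

def decode_after_transfer(text):
    """Reverse the encoding on the receiving side to recover the original bytes."""
    return base64.b64decode(text)

original = b"%PDF-\x00\xff binary payload"   # made-up bytes, not a real PDF
mime_type, text = encode_for_transfer("report.pdf", original)
restored = decode_after_transfer(text)
print(mime_type, text)
```

The receiver uses the type label (here application/pdf) to decide which program or plug-in should open the restored bytes.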


Dynamic web pages: the advent of plug-ins was followed by the emergence of dynamic web pages.


With dynamic web pages, the documents stored on the server are not HTML but scripts written in a programming language. The script takes parameters and is run once on the server; when the run completes, it produces an HTML document, which is sent to the client. The server side therefore needs to run an interpreter, and the load on the server is relatively high.


Can the web server itself execute these scripts? No. The web server does not execute them; it needs additional tools to do so.

Common web servers include Apache, nginx, lighttpd, and Tomcat.


Take files ending in .asp or .php as an example. The web server invokes the PHP interpreter through some protocol and has it run the index.php script. The resulting document is returned through that protocol to the web server, which then returns it to the client. The web server itself is just an HTTP server.
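The hand-off between the web server and the interpreter can be imitated in a few lines of Python. This is a rough sketch under loud assumptions: the "script" is a one-line Python program standing in for index.php, and a plain child process stands in for whatever protocol a real server uses to talk to its interpreter.

```python
import subprocess
import sys

# Hypothetical stand-in for index.php: a tiny script that prints HTML
# built from the parameter it receives.
SCRIPT = 'import sys; print("<h1>Hello, " + sys.argv[1] + "</h1>")'

def run_dynamic_page(script, param):
    """Run the script in a separate interpreter process and return the HTML it prints."""
    result = subprocess.run(
        [sys.executable, "-c", script, param],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

html = run_dynamic_page(SCRIPT, "world")
print(html)
```

The point of the sketch is the division of labour: the server process never interprets the script itself, it only starts the interpreter and relays the generated document.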


How does a user's access to the web server work in detail?

The simple case: the client sends an HTTP request. When the request arrives at the server, it first enters the server's kernel space, which parses it and finds that it is a web request. The kernel then hands the request to the local user-space HTTP process. The HTTP process issues a request to kernel space, and the kernel reads the requested data from the hard disk and returns it to the user-space process. The user-space process then passes the content back to kernel space, which sends it to the client through layer-by-layer encapsulation.




The complex case: the client sends an HTTP request. When the request arrives at the server, it first enters the server's kernel space, which parses it and finds that it is a web request. The kernel then hands the request to the local user-space HTTP process. The HTTP process finds that the request is for a dynamic web page and, through another protocol, invokes the program that handles dynamic pages, the dynamic page interpreter. The interpreter issues a request to kernel space, and the kernel reads the requested data from the hard disk and returns it to the user-space interpreter process for processing. The interpreter process then returns the resulting document to the web server process through the protocol. The web server passes it to kernel space, which sends it to the client through layer-by-layer encapsulation.


Each user access requires a process, so 10,000 visiting users require that many more processes, and correspondingly more system resources.



Dynamic web pages contain both static and dynamic content. When the dynamic program finishes running, the result is returned to the client.


From HTTP/1.0 onward, the protocol can meet all of these needs. HTTP/1.0 also introduced a caching mechanism.

Starting from a purely static page:

For www.zledu.com, the client first uses DNS to resolve the FQDN to an IP address and then contacts the host. The server side then accepts the request.

There are two ways to wait for requests. Blocking: the server stays in the waiting state until a request arrives;

Non-blocking: the server comes back at intervals to check whether a request has arrived (polling mode).
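The difference between the two modes shows up directly in the socket API. The following is a small sketch that runs entirely on the loopback interface; the function name, the retry count, and the sleep interval are arbitrary choices made for the example. In blocking mode accept() would simply wait; in non-blocking mode it returns at once, and the caller has to come back later.

```python
import socket
import time

def poll_for_connection(listener, attempts, interval=0.01):
    """Non-blocking mode: check for a pending connection and return at once if none is there."""
    listener.setblocking(False)
    for attempt in range(1, attempts + 1):
        try:
            conn, _addr = listener.accept()
            return conn, attempt
        except BlockingIOError:
            time.sleep(interval)  # nothing yet, come back a little later
    return None, attempts

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen()

# No client has connected yet, so polling gives up instead of waiting forever.
nobody, tries = poll_for_connection(listener, attempts=3)

# Once a client connects, a later poll finds the connection.
client = socket.create_connection(listener.getsockname())
conn, found_on = poll_for_connection(listener, attempts=50)

for sock in (client, conn, listener):
    if sock is not None:
        sock.close()
```

Polling wastes effort when nothing is pending, which is one motivation for the notification-based designs discussed later in the article.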

IP header:

Source IP

Destination IP

TCP header:

Source port

Destination port

HTTP header: states exactly which resource on this host is being requested;

GET /2.html

Host: www.zledu.com (selects the virtual host)


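HTTP message formats: the request message and the response message.


Request message syntax:

<method> <request-URL> <version>   # request line: the method used to obtain the resource, the resource being requested, and the protocol version of the request

<headers>   # header fields

# a blank line is required

<entity-body>   # message body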


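Response message syntax:

<version> <status> <reason-phrase>   # status line: the protocol version, the status code indicating whether the request succeeded, and a phrase explaining the reason

<headers>   # header fields

# a blank line is required

<entity-body>   # message body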


Status code classes:

1xx: informational, seldom used;

2xx: success (200, 201, ...);

3xx: redirection (301, 302, 304, ...);

4xx: client error (404, the resource does not exist on the server, ...);

5xx: server error (500, 501, ...).
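Since the class is determined entirely by the first digit, the classification above fits in a few lines. A minimal sketch; the function name and the label strings are just for illustration:

```python
def status_class(code):
    """Map an HTTP status code to its class using the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")

for code in (200, 304, 404, 500):
    print(code, status_class(code))
```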



For example:

Request message:

GET / HTTP/1.1   # nothing follows the / after GET, so the site's default main page is requested

Host: www.zldu.com

Connection: keep-alive

Response message:

HTTP/1.1 200 OK   # protocol version, status code, then the reason phrase

X-Powered-By: PHP/5.2.17   # header field

Vary: Accept-Encoding,Cookie,User-Agent

Cache-Control: max-age=3, must-revalidate

Content-Encoding: gzip   # content encoding mechanism

Content-Length: 6931   # length of the content



The first line of each of the two messages above is called the start line of the message, and the lines that follow it are called header fields.

Each header field consists of a name and a value separated by a colon.

In addition, the response message usually carries a block of content called the body, which is the content returned to the client.
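The start line, header fields, and body can be pulled apart with plain string operations. This is a simplified sketch, not a complete HTTP parser: it assumes the message is already a text string with CRLF line endings, and the function name parse_message is made up for the example.

```python
def parse_message(raw):
    """Split an HTTP message into its start line, header fields and body."""
    head, _sep, body = raw.partition("\r\n\r\n")  # the blank line ends the headers
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")  # name and value are separated by a colon
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

raw_request = (
    "GET / HTTP/1.1\r\n"
    "Host: www.zledu.com\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
start_line, headers, body = parse_message(raw_request)
print(start_line, headers, repr(body))
```

Header names are lower-cased here because they are case-insensitive in HTTP; a GET request like this one normally has an empty body.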



The main operations of a web server:

1. Establish the connection: accept or reject the client's connection request;

2. Receive the request: read the HTTP request message from the network;

3. Process the request: parse the request message and take the corresponding action;

4. Access the resource: access the resource named in the request message;

5. Build the response: generate the HTTP response message with the correct headers;

6. Send the response: send the generated response message to the client;

7. Log: record the completed HTTP transaction in the log file.



To sum up, developing a web server is conceptually simple: it only needs to decode the protocol, respond to requests, access resources, build messages, and write logs.
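Steps 3, 4, 5 and 7 of the list can be sketched as a single function that turns a request string into a response string. This is a toy under stated assumptions: the in-memory DOCUMENTS table stands in for files on disk, ACCESS_LOG stands in for a log file, and the network steps (1, 2 and 6) are left out so the example runs without sockets.

```python
# Hypothetical in-memory "disk" and log, used only for this illustration.
DOCUMENTS = {"/": "<h1>home</h1>", "/2.html": "<p>page two</p>"}
ACCESS_LOG = []

def handle_request(raw_request):
    """Parse a request, look up the resource, build the response and log the transaction."""
    # Step 3: process the request by parsing the request line.
    request_line = raw_request.split("\r\n", 1)[0]
    method, path, _version = request_line.split(" ")

    # Step 4: access the resource.
    if method != "GET":
        status, body = "501 Not Implemented", ""
    elif path in DOCUMENTS:
        status, body = "200 OK", DOCUMENTS[path]
    else:
        status, body = "404 Not Found", ""

    # Step 5: build the response with matching headers.
    response = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "\r\n"
        f"{body}"
    )

    # Step 7: log the completed transaction.
    ACCESS_LOG.append(f"{method} {path} -> {status}")
    return response

found = handle_request("GET /2.html HTTP/1.1\r\nHost: www.zledu.com\r\n\r\n")
missing = handle_request("GET /nope.html HTTP/1.1\r\nHost: www.zledu.com\r\n\r\n")
print(found)
print(ACCESS_LOG)
```

A real server wraps this logic in a loop that accepts connections, reads the request bytes, and writes the response bytes back to the client.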


So what exactly does the cache do?

Imagine that we need to fetch a web page that includes 10 images, 3 CSS style sheets, and 5 HTML documents. The page then involves 18 requests.

Are these 18 resources fetched in a single request or one request each? Each resource is requested and transmitted separately, so opening one web page sends n requests, one per resource. When we open a slow site, the text appears first and the pictures afterwards, because the pictures are larger and take longer to send. To let clients open a site faster, most browsers are multi-threaded. For example, IE6 is said to use 4 threads and Google Chrome 2. Each thread sends its own request, so multiple resources are fetched at the same time and the page appears a little faster.
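The effect of fetching with several threads can be imitated without any network. In this sketch the fetch function only sleeps to simulate one request, and the resource names are invented to match the 10 + 3 + 5 = 18 resources of the scenario; with 4 worker threads the 18 simulated requests overlap instead of running strictly one after another.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(resource):
    """Stand-in for one HTTP request: sleeps instead of touching the network."""
    time.sleep(0.05)
    return f"contents of {resource}"

# Hypothetical resource names for the page described above.
resources = (
    [f"image{i}.png" for i in range(10)]
    + [f"style{i}.css" for i in range(3)]
    + [f"part{i}.html" for i in range(5)]
)

def fetch_all(names, threads):
    """Fetch every resource using a fixed number of worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch, names))  # results keep the input order

start = time.perf_counter()
pages = fetch_all(resources, threads=4)
elapsed = time.perf_counter() - start
print(len(pages), "resources in", round(elapsed, 2), "seconds")
```

Run sequentially, the same 18 simulated requests would take about 18 times the per-request delay; the thread pool cuts that roughly by the number of threads.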



HTTP is a TCP-based protocol: each connection is opened with a three-way handshake and closed with a four-way handshake.

With this many requests, the load on the server is very high, so a caching mechanism is needed. Clearing the cache on the client therefore increases traffic to the site.

HTTP/1.0 introduced a caching mechanism.
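What a cache saves can be shown with a small client-side sketch. Everything here is an assumption made for illustration: the class name, the fake fetch function, and the rule that an entry is reusable while it is younger than max_age seconds (loosely modelled on the Cache-Control: max-age header). The clock is passed in explicitly so the example is deterministic.

```python
class ResponseCache:
    """Keep fetched bodies and reuse them while they are younger than max_age seconds."""

    def __init__(self, fetch, max_age):
        self.fetch = fetch
        self.max_age = max_age
        self.entries = {}  # url -> (time stored, body)

    def get(self, url, now):
        entry = self.entries.get(url)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]  # still fresh: no request is sent
        body = self.fetch(url)  # missing or stale: ask the server again
        self.entries[url] = (now, body)
        return body

calls = []

def fake_fetch(url):
    """Stand-in for a real request; records every call that reaches the 'server'."""
    calls.append(url)
    return f"<body of {url}>"

cache = ResponseCache(fake_fetch, max_age=3)
first = cache.get("/logo.png", now=0)   # fetched from the server
second = cache.get("/logo.png", now=2)  # served from the cache
third = cache.get("/logo.png", now=5)   # stale, fetched again
print(len(calls), "requests reached the server")
```

Of the three lookups, only two reach the server; on a page with many repeated resources the saving in connections and transfers adds up quickly.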

HTTP/1.1:

Enhances the caching functions;

Introduces persistent (long-lived) connections: after one resource has been fetched, the connection between client and server stays open and is reused to fetch the second resource, and so on;

In most cases, persistent connections greatly improve server performance;

Allows a timeout to be set for persistent connections.



How the Web server handles concurrent connection requests:

1. Single-threaded web server (single process)


In this model the web server processes one request at a time and reads and processes the next request only after finishing the current one. While a request is being processed, all other requests are ignored.

As a result, serious problems arise when there are many concurrent requests.




2. Multi-process/multi-threaded web server


In this model there is usually one main server process that does not answer requests itself but spawns child processes to do so.

The web server creates multiple processes or threads that handle multiple user requests in parallel; the processes or threads can be created on demand or in advance.

Some web servers answer each user request with its own process or thread, which has the advantage of good stability. Many early servers used this architecture. However, once the number of concurrent requests reaches the tens of thousands, the many concurrently running processes or threads consume a lot of system resources.
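The one-thread-per-request idea can be sketched with connected socket pairs, which avoids needing a real network. This is an illustration only: the echo reply stands in for real request handling, and three connections stand in for the many a real server would see.

```python
import socket
import threading

def serve_one(conn):
    """Handle exactly one connection in its own thread, then close it."""
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"echo: " + request)

threads = []
client_ends = []
for _ in range(3):
    # socketpair() gives two already-connected sockets: one plays the
    # server side of a connection, the other the client side.
    server_end, client_end = socket.socketpair()
    worker = threading.Thread(target=serve_one, args=(server_end,))
    worker.start()  # created on demand, one thread per connection
    threads.append(worker)
    client_ends.append(client_end)

replies = []
for number, client_end in enumerate(client_ends):
    client_end.sendall(f"request {number}".encode("ascii"))
    replies.append(client_end.recv(1024).decode("ascii"))
    client_end.close()

for worker in threads:
    worker.join()
print(replies)
```

Each thread spends most of its life blocked in recv(), which is exactly the cost the article describes: with tens of thousands of connections, that many mostly idle threads or processes still consume memory and scheduler time.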




3. I/O multiplexing web server

(With this mechanism, one process is responsible for multiple user requests. How is it done? A single process handles the requests of multiple users; it accepts each incoming request and keeps track of it.

The first approach is a polling status check: at fixed intervals the process queries the state of each request, and when it finds one that is complete, it responds to the corresponding client. This does not work very well. An optimization is an event-handling mechanism in which each client request has a status flag and the process looks only at the flags, but it still has to scan all of them every time, which is a lot of overhead. The third approach is for each request to notify the process when its state changes, so that the process does not have to scan. This lets one process handle multiple requests, each with its own state, and each able to send a notification.

There are two ways to notify: level-triggered and edge-triggered. An edge-triggered notification is sent only once, whether or not you act on it. It is like ordering a bowl of rice at a food court: when the meal is ready, the big screen shows your number once. A level-triggered notification keeps being sent until the process has handled the event. Neither is ideal, so a third trigger mechanism was introduced that sends the notification directly to the process itself, similar to an SMS notification.)


To support more concurrent user requests, more and more web servers use a multiplexed architecture: the server monitors the activity state of all connections at once. When the state of a connection changes (for example, data is ready or an error occurs), it performs a specific sequence of operations for that connection; when the operations are complete, the connection returns to a temporarily stable state and goes back on the list of open connections until its state changes again. Because of this multiplexing, the process or thread is never tied up by an idle connection, which makes it an efficient mode of operation. But if one process serves too many users and many of them notify it at the same time that they are ready, the process is still stretched thin when responding.
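Python's selectors module exposes this kind of readiness notification. The sketch below is a minimal illustration rather than a server: socket pairs stand in for client connections, the client names are invented, and a single select() call reports which connections have data ready, so one process never waits on an idle connection.

```python
import selectors
import socket

selector = selectors.DefaultSelector()
clients = {}
for name in ("alice", "bob", "carol"):  # hypothetical clients
    server_end, client_end = socket.socketpair()
    server_end.setblocking(False)
    # Ask to be told when this connection becomes readable.
    selector.register(server_end, selectors.EVENT_READ, data=name)
    clients[name] = (server_end, client_end)

# Only two of the three clients actually send a request.
clients["alice"][1].sendall(b"GET /a.html")
clients["carol"][1].sendall(b"GET /c.html")

served = {}
# One process, one call: the OS reports which connections changed state.
for key, _events in selector.select(timeout=1):
    request = key.fileobj.recv(1024)
    served[key.data] = request.decode("ascii")

selector.close()
for server_end, client_end in clients.values():
    server_end.close()
    client_end.close()
print(served)
```

The idle connection (bob) costs nothing in this loop: it is simply not reported. As far as I know, selectors reports readiness in a level-triggered way by default, meaning a connection keeps being reported until its data has been read.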



4. Multiplexed multi-threaded web server

(There is a process called the master process that does not answer any requests itself. It is only responsible for accepting incoming connections and assigning them to the worker processes it manages, which are copies of itself. The master decides which worker a user request goes to, and each worker can accept multiple user requests, up to a fixed number. When there are many requests, the master creates more workers; when there are few, it destroys the extra workers and keeps only a few.)


This web server architecture combines multi-process and multiplexing techniques, which avoids having one process serve too many user requests and takes advantage of the computing power of multi-CPU hosts.




Summary:

A web server follows the C/S (client/server) or B/S (browser/server) architecture:

C (client): a client agent such as a browser or a spider

client --> request --> server

URL

S (server):

server --> response --> client


HTTP methods: GET, HEAD, POST, PUT, DELETE, TRACE, OPTIONS, CONNECT. The first three are the most commonly used.

HTTP headers (there are many categories of HTTP header):

Name: value

Host: www.zledu.com

Connection: keep-alive

httpd software: the models it provides are prefork, worker, and event. The three models correspond to the different architectures above.


HTTP clients: IE, Firefox, Chrome, Opera, Safari, ...

Web servers: Apache, IIS, nginx, lighttpd, thttpd, ...


Application servers: this kind of server can handle not only static content but also dynamic content in a particular format. Common ones are IIS; Tomcat (open source, runs Java); WebSphere (IBM, JSP, commercial); WebLogic (Oracle, JSP, commercial); and JBoss (Red Hat), whose core is a Tomcat that can serve as a web server.


At www.netcraft.com you can see the current market share of web servers on the global Internet.

You will find that nginx is the strongest of all servers at reverse proxying. Later articles will be built around nginx.



Proxy


A web proxy server sits between the web client and the web server. It receives HTTP requests from the client and forwards them to the appropriate server, then receives the server's response and relays the response message back to the client.






This article is from the "Sweat Achievement Dream" blog, please be sure to keep this source http://redhatdragon.blog.51cto.com/9183870/1441261
