Web servers and web application servers often appear together on the network. This article shows the reader how to distinguish these two similar concepts.
1.1. Web server concepts and basic principles
1.1.1. Web server history
In 1989, Tim Berners-Lee, the inventor of the World Wide Web, proposed a new project to his employer CERN to ease the exchange of information between scientists by using a hypertext system. This project led Berners-Lee to write two programs in 1990:
A browser called WorldWideWeb.
The world's first web server, later called CERN httpd, which ran on NeXTSTEP.
Between 1991 and 1994, the simplicity and effectiveness of the early technologies used to surf and exchange data through the World Wide Web helped them get ported to many different operating systems and adopted by scientific organizations and universities, and from there they spread to industry.
In 1994, Berners-Lee decided to form the World Wide Web Consortium (W3C) to manage the further development of many technologies involved (HTTP, HTML, etc.) through a standardized process.
The main function of a web server is to store, process and deliver web pages to clients. The communication between the client and the server is carried out using the Hypertext Transfer Protocol (HTTP). The pages most commonly delivered are HTML documents, which may contain images, style sheets and scripts in addition to text content.
A user agent, usually a web browser or web crawler, obtains server resources by initiating an HTTP request, and the server either returns the requested resource or responds with an error message if it cannot. The resource is usually a real file on the server's secondary storage, but this is not necessarily the case and depends on the implementation of the web server.
Although the main function is to provide content, the complete implementation of HTTP also includes a way to receive content from the client. This function is used to submit web forms, including uploading files. Many general-purpose web servers also support server-side scripts using Active Server Pages (ASP), PHP or other scripting languages. This means that the behavior of the web server can be scripted in a separate file, while the actual server software remains unchanged. Usually, this function is used to dynamically generate HTML documents ("on the fly") instead of returning static documents. The former is mainly used to retrieve or modify information from the database. The latter is usually much faster and easier to cache, but cannot provide dynamic content.
Web servers are not only used to serve the World Wide Web. They can also be embedded in devices such as printers, routers, network cameras, etc., and serve only the local network. The web server can then be used as part of the system for monitoring or managing the device in question. This usually means that no additional software is required on the client computer, as only a web browser is needed (and most operating systems now include one).
1.1.2. How a Web Server Works
The HTTP protocol is based on the TCP protocol and is an application layer protocol for communication between the user agent and the Web server.
Web servers usually work in a request-response manner:
The user initiates a resource request in the user agent. The request includes, but is not limited to, the URI that uniquely identifies the resource and the request method (GET/POST/DELETE/PUT...).
The user agent parses the user's input URI and obtains the target domain name from it, which is then resolved by the DNS server. If an IP address is specified in the URI, this step is not necessary.
If no connection to the server has been established yet, the user agent first establishes a TCP connection and completes the negotiation (agreeing on what both parties accept, including protocol version, encryption, content format, etc.).
The user agent encapsulates the request content into an HTTP data packet and sends it to the server.
The server receives the resource request and unpacks and processes it in the previously negotiated manner.
The server encapsulates the resource requested by the user agent into an HTTP data packet and returns it to the user agent.
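The client-side steps above can be sketched in Python. This is only a minimal illustration, not how any particular browser works: the function name build_request and the example URL are assumptions made for this sketch, and DNS resolution and the TCP connection are deliberately left out so the code runs without network access.

```python
from urllib.parse import urlsplit

def build_request(url: str, method: str = "GET") -> tuple[str, int, bytes]:
    """Return (host, port, raw HTTP/1.1 request bytes) for a URL."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    # Default ports: 80 for http, 443 for https.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # The request target is the path plus the optional query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"{method} {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line ends the header section
        "",
    ]
    return host, port, "\r\n".join(lines).encode("ascii")

host, port, raw = build_request("http://example.com/index.html?lang=en")
print(host, port)
print(raw.decode("ascii"))
```

A real user agent would next resolve the host through DNS, open a TCP connection to (host, port), and write these bytes to it.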
Next, I will focus on how the server side works.
TCP listener module
The server listens on a port (port 80 by default for HTTP and 443 for HTTPS; port 8080 is a common alternative, and users can configure other ports) to accept connections from user agents. Once a connection is established, subsequent HTTP requests from the user agent on that connection do not need to pass through the listener module again.
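As a rough sketch of the listener module, the following Python code opens a listening TCP socket with the standard socket module. The function name open_listener is an assumption for this example, and the sketch binds to the loopback address on port 0 (any free port) so it can run without special privileges; a production server would typically bind to port 80 or 443.

```python
import socket

def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a TCP socket listening on host:port (port 0 = any free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for old sockets to time out.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener

listener = open_listener()
print("listening on", listener.getsockname())
# A real server would now loop on listener.accept(). Each accepted
# connection gets its own socket, and later requests on that connection
# are read from it without going through the listener again.
listener.close()
```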
Preprocessing
Three main things happen here: 1. The HTTP request message is extracted from the TCP byte stream. 2. Decryption, decompression and similar steps are performed as negotiated with the user agent. 3. Security checks are applied, session state is established, and so on, according to the server's own configuration.
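The first preprocessing step, extracting the HTTP request message from the received bytes, might look like the following. This is a simplified sketch that assumes the whole request has already been read into memory; the function name parse_request and the sample request are made up for illustration, and real servers also handle partial reads, size limits and malformed input.

```python
def parse_request(raw: bytes) -> dict:
    """Split a raw HTTP/1.x request into request line, headers and body."""
    # The header section ends at the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise to lower case.
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}

request = parse_request(
    b"POST /login HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"user=anna"
)
print(request["method"], request["target"], request["headers"]["host"])
```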
URL routing
The URL string and request method are parsed to determine the resource requested by the user agent, and the request is routed to the static resource processing module or the dynamic resource processing module according to matching rules (usually based on regular expressions and file suffixes).
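Routing by regular expression and suffix can be illustrated with a small Python function. The patterns, the suffix list and the handler names ("static", "dynamic", "not_found") are all hypothetical choices for this sketch; real servers read such rules from their configuration.

```python
import re
from urllib.parse import urlsplit

# Assumed rules for this example only.
STATIC_SUFFIXES = (".html", ".css", ".js", ".png", ".jpg")
DYNAMIC_PATTERNS = [
    re.compile(r"^/api/"),          # everything under /api/ is dynamic
    re.compile(r"\.(php|asp)$"),    # server-side script suffixes
]

def route(method: str, target: str) -> str:
    """Return the name of the module that should handle the request."""
    path = urlsplit(target).path  # drop the query string before matching
    if any(pattern.search(path) for pattern in DYNAMIC_PATTERNS):
        return "dynamic"
    if method in ("GET", "HEAD") and path.endswith(STATIC_SUFFIXES):
        return "static"
    return "not_found"

for method, target in [("GET", "/index.html"), ("POST", "/api/users"),
                       ("GET", "/page.php?id=1"), ("GET", "/unknown")]:
    print(method, target, "->", route(method, target))
```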
Static resource processing module
Responsible for finding static resources, such as HTML, JavaScript and CSS files and images, determining whether the content is a character stream or a byte stream, and determining the corresponding MIME type. For example, an HTML file is served as a character stream with the MIME type text/html, and an MPEG video file is served as a byte stream with the MIME type video/mpeg.
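Python's standard mimetypes module performs this kind of suffix-to-MIME-type lookup, so the idea can be sketched as follows. The function name describe_static and the rule "text/* means character stream" are simplifying assumptions for this example, not a description of any specific server.

```python
import mimetypes

def describe_static(path: str) -> tuple[str, str]:
    """Return (MIME type, stream kind) for a static file path."""
    mime, _encoding = mimetypes.guess_type(path)
    if mime is None:
        # Unknown suffix: fall back to the generic binary type.
        mime = "application/octet-stream"
    kind = "character stream" if mime.startswith("text/") else "byte stream"
    return mime, kind

print(describe_static("index.html"))
print(describe_static("movie.mpeg"))
```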
Dynamic resource processing module
Runs the business logic and dynamically determines the content and type of the returned resource. The content and type are then handled in the same way as for static resources.
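A dynamic handler, in its simplest form, is a function that computes both the content and its type from the request. The sketch below generates an HTML page from a query string; the handler name greet_handler and the "name" parameter are invented for illustration.

```python
from html import escape
from urllib.parse import parse_qs

def greet_handler(query_string: str) -> tuple[str, bytes]:
    """Build an HTML greeting from the query string; return (type, body)."""
    params = parse_qs(query_string)
    name = params.get("name", ["world"])[0]
    # Escape user input before embedding it in HTML.
    body = f"<html><body><h1>Hello, {escape(name)}!</h1></body></html>"
    return "text/html; charset=utf-8", body.encode("utf-8")

content_type, body = greet_handler("name=Ada")
print(content_type)
print(body.decode("utf-8"))
```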
Post-processing
Performs encryption, compression, security processing and so on, as negotiated with the user agent.
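Compression is the easiest post-processing step to show in isolation. The following sketch compresses the body with gzip only when the user agent listed gzip in its Accept-Encoding header. The function name post_process is an assumption, and the header parsing is deliberately simplified: it ignores quality values such as "gzip;q=0", which a real server must respect.

```python
import gzip

def post_process(body: bytes, accept_encoding: str) -> tuple[bytes, dict]:
    """Compress body with gzip if the client accepts it.

    Returns the (possibly compressed) body and any extra response headers.
    Simplification: quality values (q=...) are ignored.
    """
    accepted = [token.split(";")[0].strip().lower()
                for token in accept_encoding.split(",")]
    if "gzip" in accepted:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}

page = b"<html>" + b"hello " * 200 + b"</html>"
compressed, headers = post_process(page, "gzip, deflate, br")
print(len(page), "->", len(compressed), headers)
```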
Resource output module
Encapsulates the processed content and type into an HTTP response message and sends it over the TCP connection to the user agent at the other end.
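Encapsulating content and type into an HTTP message amounts to writing a status line, headers, a blank line and the body. Here is a minimal sketch; the function name build_response and the chosen set of headers are assumptions for this example, and a real server would add more headers (Date, Server, caching headers and so on).

```python
def build_response(status: int, reason: str,
                   content_type: str, body: bytes) -> bytes:
    """Serialize an HTTP/1.1 response: status line, headers, blank line, body."""
    head = [
        f"HTTP/1.1 {status} {reason}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",
        "Connection: close",
        "",  # blank line separates headers from the body
        "",
    ]
    return "\r\n".join(head).encode("ascii") + body

response = build_response(200, "OK", "text/html", b"<h1>It works</h1>")
print(response.decode("ascii"))
```

The resulting bytes are what the server would write to the connected socket, for example with sendall.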
Mainstream web servers
These include Apache, IIS and Nginx; Tomcat, Jetty, WebSphere, WebLogic, Kestrel and others are also widely used.