This article was previously published on another platform (http://www.jointforce.com/jfperiodical/article/1035) and is now reposted on my own blog.
Developing a Web container touches many technologies at different levels, such as the communication layer and the programming language, and a usable Web container is a fairly large system that would take a lot of space to describe fully. This article introduces how to design a Web container. It explores the ideas behind an implementation without going into much concrete code. The container is broken down into modules and components, each responsible for a different function; the basic components are listed below and each is introduced in turn.
Connection Receiver
Its main job is to listen for client socket connections, accept each socket, and hand it to the task executor (a thread pool) for execution. The receiver keeps pulling sockets from the operating system, does as little processing as possible, and drops each one into the thread pool. Why stress minimal processing? It is a performance issue: there is typically only one receiver (a single thread does the accepting), so the time spent on each accepted socket directly affects overall throughput. The receiver's work is therefore small and simple: maintain a few state variables, increment the flow control gate's counter, call accept on the ServerSocket, set some properties on the accepted socket, put the socket into the thread pool, and handle a few exceptions. Logic that takes longer is left to the thread pool, such as reading the socket's underlying data, parsing the HTTP message, and responding to the client.
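The receiver loop described above might look roughly like the following. This is a minimal sketch, not a reference implementation: the class name `SimpleAcceptor`, the `handler` callback, and the chosen socket options and timeout are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Consumer;

/** Hypothetical single-threaded receiver: accept, tweak the socket, hand off. */
class SimpleAcceptor implements Runnable {
    private final ServerSocket serverSocket;
    private final ExecutorService workers;
    private final Consumer<Socket> handler; // slow work lives here, not in the receiver
    private volatile boolean running = true;

    SimpleAcceptor(ServerSocket serverSocket, ExecutorService workers, Consumer<Socket> handler) {
        this.serverSocket = serverSocket;
        this.workers = workers;
        this.handler = handler;
    }

    @Override
    public void run() {
        while (running) {
            try {
                Socket socket = serverSocket.accept(); // blocks until a client connects
                socket.setTcpNoDelay(true);            // only cheap property settings here
                socket.setSoTimeout(20_000);           // assumed read timeout in milliseconds
                try {
                    workers.execute(() -> handler.accept(socket));
                } catch (RejectedExecutionException e) {
                    socket.close();                    // pool saturated: drop the connection
                }
            } catch (IOException e) {
                if (running) {
                    System.err.println("accept failed: " + e.getMessage());
                }
            }
        }
    }

    /** Stops the loop; closing the ServerSocket unblocks a pending accept(). */
    void stop() throws IOException {
        running = false;
        serverSocket.close();
    }
}
```

Everything that reads from or writes to the socket belongs in `handler`, so the accepting thread returns to `accept()` as quickly as possible.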
Connection Count Controller
For any machine, request traffic has peaks and the server has physical limits, so we need some protective measures to keep the Web server from being overwhelmed. Note that "traffic" here mostly means the number of socket connections: we control traffic by controlling the connection count. One effective approach is flow control. It is like adding a gate at the entrance, where the size of the gate determines the flow; once the maximum is reached the gate closes and stops accepting until a slot becomes free. The counter can be implemented with the JDK's AQS framework (AbstractQueuedSynchronizer).
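One way to build such a gate on AQS is a shared-mode synchronizer wrapped around an atomic counter. The sketch below is only an illustration under that assumption; the class name `ConnectionGate` and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

/** Hypothetical connection gate: at most `limit` connections may be counted in at once. */
class ConnectionGate {
    private final long limit;
    private final AtomicLong count = new AtomicLong();
    private final Sync sync = new Sync();

    private final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected int tryAcquireShared(int ignored) {
            long newCount = count.incrementAndGet();
            if (newCount > limit) {
                count.decrementAndGet(); // over the limit: undo, the caller must wait
                return -1;
            }
            return 1;
        }

        @Override
        protected boolean tryReleaseShared(int ignored) {
            count.decrementAndGet();
            return true; // let AQS wake waiting threads so they can retry
        }
    }

    ConnectionGate(long limit) {
        this.limit = limit;
    }

    /** Called by the receiver before accept(); blocks while the gate is full. */
    void countUpOrAwait() throws InterruptedException {
        sync.acquireSharedInterruptibly(1);
    }

    /** Non-blocking variant: returns false when the gate is full. */
    boolean tryCountUp() {
        return sync.tryAcquireShared(1) >= 0;
    }

    /** Called when a connection is finished and its slot can be reused. */
    void countDown() {
        sync.releaseShared(1);
    }

    long getCount() {
        return count.get();
    }
}
```

The receiver would call `countUpOrAwait()` before each accept, and the task that finishes a connection would call `countDown()`.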
Socket Factory
Different scenarios call for different levels of security. Payment-related transactions, for example, must be encrypted before transmission, which also involves key negotiation, while many ordinary scenarios need no encryption at all. At the application layer this is the difference between HTTP and HTTPS.
Put simply, the TLS/SSL protocol provides three services for each communication: ① authentication, which verifies the identity of the parties in the session; ② encryption, a strong encryption mechanism that keeps messages from being deciphered in transit; ③ tamper protection, which uses a hash algorithm to sign messages and verifies the signature to ensure the content has not been altered.
HTTP corresponds to Socket, while HTTPS corresponds to SSLSocket. Creating Sockets and SSLSockets is the job of the socket factory.
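On the server side, the JDK already ships factory classes for both cases, so a socket factory component can be a thin selector over them. The following is a sketch under that assumption; `EndpointSocketFactory` is a hypothetical name, and a real HTTPS endpoint would additionally need a configured key store, which is not shown.

```java
import java.io.IOException;
import java.net.ServerSocket;
import javax.net.ServerSocketFactory;
import javax.net.ssl.SSLServerSocketFactory;

/** Hypothetical selector: plain sockets for HTTP, SSL sockets for HTTPS. */
class EndpointSocketFactory {

    /** Picks the JDK factory that matches the scheme. */
    static ServerSocketFactory forScheme(String scheme) {
        if ("https".equalsIgnoreCase(scheme)) {
            // Uses the JVM's default SSL context; certificates must be configured separately.
            return SSLServerSocketFactory.getDefault();
        }
        return ServerSocketFactory.getDefault();
    }

    /** Opens a listening socket of the right kind on the given port. */
    static ServerSocket open(String scheme, int port) throws IOException {
        return forScheme(scheme).createServerSocket(port);
    }
}
```

The rest of the container only sees `ServerSocket` and `Socket`, so it does not need to know which protocol is in use.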
Task Definition (Task)
This defines the work to be performed and tells the thread pool what to execute. A task has three main steps: process the socket and respond to the client, decrement the connection counter, and close the socket. Processing the socket is the most important and most complex step. It includes reading the underlying byte stream, parsing the HTTP request message (request line, request headers, request body, and so on), locating the corresponding host and Web application resource based on the parsed request line, and assembling the HTTP response message from the processing result and writing it to the client.
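The three steps map naturally onto a `Runnable` with a `finally` block, so the counter is decremented and the socket closed even when processing fails. This is a sketch only; `ConnectionTask`, `SocketProcessor`, and `onFinished` are hypothetical names.

```java
import java.io.IOException;
import java.net.Socket;

/** Hypothetical task: process the socket, release the connection slot, close the socket. */
class ConnectionTask implements Runnable {

    /** Stand-in for the real work: read, parse, locate the resource, write the response. */
    interface SocketProcessor {
        void process(Socket socket) throws IOException;
    }

    private final Socket socket;
    private final SocketProcessor processor;
    private final Runnable onFinished; // e.g. decrement the connection counter

    ConnectionTask(Socket socket, SocketProcessor processor, Runnable onFinished) {
        this.socket = socket;
        this.processor = processor;
        this.onFinished = onFinished;
    }

    @Override
    public void run() {
        try {
            processor.process(socket);             // step 1: handle the request
        } catch (IOException e) {
            System.err.println("request failed: " + e.getMessage());
        } finally {
            onFinished.run();                      // step 2: connection counter minus one
            try {
                socket.close();                    // step 3: close the socket
            } catch (IOException ignored) {
                // nothing useful can be done at this point
            }
        }
    }
}
```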
Task Executor
This is a thread pool with minimum and maximum thread counts. It is called a "task executor" because a thread pool can be seen as a set of threads that keep polling a task queue and execute any task they find there. Its settings include the minimum and maximum number of threads, how long idle excess threads live before being reclaimed, the rejection action taken when the pool is saturated, and so on.
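With the JDK, these settings correspond directly to the constructor arguments of `ThreadPoolExecutor`. The values below (a 60-second idle timeout, a bounded queue, `AbortPolicy`) are assumptions chosen for the sketch, and `WorkerPools` is a hypothetical helper name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that builds the container's task executor. */
class WorkerPools {
    static ThreadPoolExecutor create(int minThreads, int maxThreads, int queueSize) {
        return new ThreadPoolExecutor(
                minThreads,                             // threads kept alive even when idle
                maxThreads,                             // upper bound on threads
                60, TimeUnit.SECONDS,                   // idle time before excess threads are reclaimed
                new ArrayBlockingQueue<>(queueSize),    // bounded task queue
                new ThreadPoolExecutor.AbortPolicy());  // reject by throwing when saturated
    }
}
```

Note that the stock `ThreadPoolExecutor` only grows beyond `minThreads` once the queue is full, so the queue size influences how soon the maximum is reached.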
Message Reader
Reads messages from the client and provides a buffer between the container and the underlying operating system. The message is copied into the destination buffer (desBuf).
Message Output
Writes messages produced by the Web container to the underlying operating system, also through a buffer: the message is placed in the output buffer (outputBuf) and written from there to the operating system.
Input Filter
During reading you may want to do some extra processing, and that processing may differ depending on conditions. For decoupling and extensibility, filters are introduced. Data passes through a chain of filters on its way to desBuf. It is like adding checkpoints along a road: each checkpoint performs its own operation, and at the end the source data has been turned into the destination data.
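A filter chain of this kind can be sketched as a set of objects that share one read interface and wrap each other. The example below assumes a single filter that stops after a fixed number of body bytes (as a Content-Length header would require); `InputSource`, `StreamSource`, and `LengthLimitFilter` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical common interface for the raw source and every filter in the chain. */
interface InputSource {
    /** Reads up to len bytes into dest; returns the number read, or -1 at end of input. */
    int read(byte[] dest, int off, int len) throws IOException;
}

/** End of the chain: reads from the underlying stream (e.g. the socket). */
class StreamSource implements InputSource {
    private final InputStream in;

    StreamSource(InputStream in) {
        this.in = in;
    }

    @Override
    public int read(byte[] dest, int off, int len) throws IOException {
        return in.read(dest, off, len);
    }
}

/** A filter that lets at most `limit` bytes through, then reports end of input. */
class LengthLimitFilter implements InputSource {
    private final InputSource next;
    private long remaining;

    LengthLimitFilter(InputSource next, long limit) {
        this.next = next;
        this.remaining = limit;
    }

    @Override
    public int read(byte[] dest, int off, int len) throws IOException {
        if (remaining <= 0) {
            return -1; // the declared body length has been consumed
        }
        int n = next.read(dest, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }
}
```

More filters (for example one that decodes chunked bodies) could be stacked the same way, each wrapping the next.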
Output Filter
Similar to the input filter, but applied when a message is written out.
Message Parser
Provides the ability to parse each part of an HTTP message.
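As an illustration, the two simplest pieces are the request line and a single header line. The sketch below assumes the lines have already been split on CRLF and deliberately ignores many details of the HTTP specification; `HttpLineParser` is a hypothetical name.

```java
import java.util.AbstractMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical parser for individual lines of an HTTP/1.x request head. */
class HttpLineParser {

    /** The three parts of a request line such as "GET /index.html HTTP/1.1". */
    static final class RequestLine {
        final String method;
        final String uri;
        final String protocol;

        RequestLine(String method, String uri, String protocol) {
            this.method = method;
            this.uri = uri;
            this.protocol = protocol;
        }
    }

    static RequestLine parseRequestLine(String line) {
        String[] parts = line.trim().split(" +");
        if (parts.length != 3) {
            throw new IllegalArgumentException("malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    /** Splits "Name: value" at the first colon; header names are lower-cased. */
    static Map.Entry<String, String> parseHeader(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("malformed header: " + line);
        }
        String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
        String value = line.substring(colon + 1).trim();
        return new AbstractMap.SimpleImmutableEntry<>(name, value);
    }
}
```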
Request Builder
Following object-oriented thinking, the request-related attributes and protocol fields are abstracted into a Request object. It holds the request line, request headers, and request body; values needed during processing are read directly from the Request object. This also makes it easier to implement the Servlet specification.
Response Generator
A Response object builder is needed to match the request. The Response holds the status line, response headers, and response body; values from the processing result are set directly on the Response object. This also makes it easier to implement the Servlet specification.
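To make the idea concrete, here is a sketch of a response object that collects the three parts and serializes them into an HTTP/1.1 message. `SimpleResponse` and its methods are hypothetical and are not the Servlet API's `HttpServletResponse`.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical response object: status line, headers, body. */
class SimpleResponse {
    private int status = 200;
    private String reason = "OK";
    private final Map<String, String> headers = new LinkedHashMap<>();
    private byte[] body = new byte[0];

    void setStatus(int status, String reason) {
        this.status = status;
        this.reason = reason;
    }

    void setHeader(String name, String value) {
        headers.put(name, value);
    }

    /** Sets a UTF-8 text body and the matching Content-Length header. */
    void setBody(String text) {
        body = text.getBytes(StandardCharsets.UTF_8);
        headers.put("Content-Length", String.valueOf(body.length));
    }

    /** Assembles status line, headers, a blank line, and the body. */
    byte[] toBytes() {
        StringBuilder head = new StringBuilder();
        head.append("HTTP/1.1 ").append(status).append(' ').append(reason).append("\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            head.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        head.append("\r\n");
        byte[] headBytes = head.toString().getBytes(StandardCharsets.ISO_8859_1);
        byte[] out = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, out, 0, headBytes.length);
        System.arraycopy(body, 0, out, headBytes.length, body.length);
        return out;
    }
}
```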
Address Mapper
The address mapper is the router that directs requests to the right Web application and the right resource within it. Based on the request path it locates the resource, and that resource's response is returned to the requesting client.
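A common way to pick the Web application is longest-prefix matching on the context path. The sketch below assumes that rule and covers only the application level, not servlets or static files inside the application; `ContextMapper` is a hypothetical name.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical mapper from request URI to Web application by longest context path. */
class ContextMapper {
    private final Map<String, String> contexts = new TreeMap<>();

    /** Registers an application; "" is the root context that matches everything. */
    void addContext(String contextPath, String appName) {
        contexts.put(contextPath, appName);
    }

    /** Returns the application whose context path is the longest prefix of the URI, or null. */
    String map(String uri) {
        String best = null;
        for (String ctx : contexts.keySet()) {
            boolean matches = ctx.isEmpty() || uri.equals(ctx) || uri.startsWith(ctx + "/");
            if (matches && (best == null || ctx.length() > best.length())) {
                best = ctx;
            }
        }
        return best == null ? null : contexts.get(best);
    }
}
```

Matching on `ctx + "/"` rather than on the bare prefix keeps `/shopping` from being routed to an application mounted at `/shop`.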
Lifecycle
For further modularity, the container is made up of many components that may need to do different things at different times, so a unified lifecycle is needed to manage them all. For example, starting, stopping, and shutting down every component goes through the lifecycle, which makes the components easy to manage. Want to do something before or after a component enters a certain state? Add a lifecycle listener and it can be done gracefully.
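A unified lifecycle can be sketched as a base class that drives state transitions and notifies listeners at each one. The state names and the classes `LifecycleBase` and `LifecycleListener` below are assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical set of states a component moves through. */
enum LifecycleState { NEW, STARTING, STARTED, STOPPING, STOPPED }

/** Hypothetical listener notified on every state change. */
interface LifecycleListener {
    void onEvent(LifecycleState state, Object source);
}

/** Base class: subclasses implement the work, the base class handles states and events. */
abstract class LifecycleBase {
    private final List<LifecycleListener> listeners = new CopyOnWriteArrayList<>();
    private volatile LifecycleState state = LifecycleState.NEW;

    void addListener(LifecycleListener listener) {
        listeners.add(listener);
    }

    LifecycleState getState() {
        return state;
    }

    final void start() {
        setState(LifecycleState.STARTING); // listeners can act before the component starts
        startInternal();
        setState(LifecycleState.STARTED);
    }

    final void stop() {
        setState(LifecycleState.STOPPING);
        stopInternal();
        setState(LifecycleState.STOPPED);
    }

    protected abstract void startInternal();

    protected abstract void stopInternal();

    private void setState(LifecycleState newState) {
        state = newState;
        for (LifecycleListener listener : listeners) {
            listener.onEvent(newState, this);
        }
    }
}
```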
JMX Manager
Monitors and manages the system's running state: server performance, collection of server parameters, JVM load, number of Web connections, thread pool, database connection pool, cache management, configuration file reloading, and so on. It can offer remote, visual, near-real-time management, and it also provides a solution for managing distributed systems.
Web Loader
The WebLoader loads a Web application; a Web container may host several Web apps. To isolate libs and servlets, each Web application uses its own ClassLoader, and these ClassLoaders are not in a parent-child relationship with each other. This achieves class isolation: one Web application's lib cannot be used by another Web application.
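With the JDK, the simplest way to get sibling loaders is one `URLClassLoader` per application, all sharing a common parent. This is a sketch under that assumption: `WebAppLoaders` is a hypothetical name, and a real loader would also add every JAR under the application's lib directory.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

/** Hypothetical factory: one class loader per Web application, siblings under a shared parent. */
class WebAppLoaders {

    /**
     * @param classesDir existing directory holding the application's compiled classes
     * @param shared     parent loader for classes common to all applications
     */
    static URLClassLoader forWebApp(Path classesDir, ClassLoader shared) throws MalformedURLException {
        // For an existing directory, toUri() ends with "/", which URLClassLoader
        // needs in order to treat the URL as a directory rather than a JAR.
        URL[] urls = { classesDir.toUri().toURL() };
        return new URLClassLoader(urls, shared);
    }
}
```

Because neither application loader is an ancestor of the other, a class loaded by one application is not visible to the other.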
Session Manager
The session manager mainly manages sessions, including: ① Generating session IDs: when the request's cookie or URL carries no jsessionid value, or the value matches no existing session, a new session ID is generated to identify the session. ② Cleanup: many client sessions are kept on the server, so timed-out sessions are cleaned up periodically to avoid wasting server memory. ③ Persistence: important sessions can be persisted to disk and reloaded into memory when needed.
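The first two responsibilities can be sketched with a concurrent map and a random ID generator. Persistence is left out, the time is passed in explicitly to keep the example deterministic, and `SimpleSessionManager` with its 16-byte hexadecimal IDs is an assumption for illustration.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory session manager: ID generation and timeout cleanup. */
class SimpleSessionManager {

    static final class Session {
        final String id;
        final Map<String, Object> attributes = new ConcurrentHashMap<>();
        volatile long lastAccessedMillis;

        Session(String id, long now) {
            this.id = id;
            this.lastAccessedMillis = now;
        }
    }

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();
    private final SecureRandom random = new SecureRandom();
    private final Map<String, Session> sessions = new ConcurrentHashMap<>();
    private final long maxInactiveMillis;

    SimpleSessionManager(long maxInactiveMillis) {
        this.maxInactiveMillis = maxInactiveMillis;
    }

    /** Returns the session for the requested ID, or creates one if the ID is missing or unknown. */
    Session findOrCreate(String requestedId, long now) {
        Session session = requestedId == null ? null : sessions.get(requestedId);
        if (session == null) {
            session = new Session(newId(), now);
            sessions.put(session.id, session);
        }
        session.lastAccessedMillis = now;
        return session;
    }

    /** Removes sessions idle for longer than the timeout; meant to be called periodically. */
    int expire(long now) {
        int removed = 0;
        for (Session s : sessions.values()) {
            if (now - s.lastAccessedMillis > maxInactiveMillis && sessions.remove(s.id, s)) {
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return sessions.size();
    }

    private String newId() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }
}
```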
Run Log
Warnings, exceptions, and errors that occur at run time are recorded in the run log.
Access Log
Access logs typically record information about client access, including client IP, request time, request protocol, request method, request bytes, response code, session ID, processing time, and so on. Access logs can be used to count visitors, analyze the distribution of access times, infer user preferences, and so on, which can help a company decide on its operational strategy.
Security Manager
Web applications run on the Web container platform, much like an application embedded in a host platform. For the embedded program to run properly, the platform itself must first be safe and functional, and the platform should be shielded from the embedded applications as far as possible, so that the two are isolated to a certain extent. The policy file, which defines the various permissions, is specified at startup with -Djava.security.manager -Djava.security.policy==web.policy.
Operational Monitoring & Remote Management
Provides a platform for monitoring the Web container's running state in real time and managing it remotely.
Cluster
There are two types of clusters: ① Load-balancing clusters, which use a distribution algorithm to spread incoming traffic evenly across the machines in the cluster. ② High-availability clusters, in which cluster communication links several machines together; the emphasis is on automatic failover or traffic transfer when one machine fails, so that the cluster as a whole stays available.
Ordinary Web requests are stateless, so they can be clustered directly. Sessions, however, are stateful, and cluster communication is needed to replicate session data. Related technologies include multicast and unicast.
Servlet Engine
The servlet engine uses reflection to instantiate the Servlets and JSPs of a Web application, places the instances in a pool of servlet objects, and invokes the appropriate method for each request. The Web application puts its business logic in the doPost or doGet method; the Web container processes the request according to that logic and then responds to the client.
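The reflection and dispatch steps can be sketched without the real Servlet API. In the example below, `MiniServlet` is a hypothetical stand-in for the servlet interface (it is not `javax.servlet.Servlet`), and `ServletRegistry` and `HelloServlet` are likewise made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, heavily simplified stand-in for the servlet interface. */
interface MiniServlet {
    String doGet(String path);
    String doPost(String path, String body);
}

/** Hypothetical engine: creates servlets by reflection, caches them, dispatches by method. */
class ServletRegistry {
    private final Map<String, MiniServlet> instances = new ConcurrentHashMap<>();
    private final ClassLoader loader; // normally the Web application's own class loader

    ServletRegistry(ClassLoader loader) {
        this.loader = loader;
    }

    MiniServlet getOrCreate(String className) throws ReflectiveOperationException {
        MiniServlet existing = instances.get(className);
        if (existing != null) {
            return existing;
        }
        Class<?> type = Class.forName(className, true, loader);
        MiniServlet created = (MiniServlet) type.getDeclaredConstructor().newInstance();
        MiniServlet raced = instances.putIfAbsent(className, created);
        return raced != null ? raced : created; // keep a single instance per class
    }

    String dispatch(String className, String method, String path, String body)
            throws ReflectiveOperationException {
        MiniServlet servlet = getOrCreate(className);
        if ("POST".equals(method)) {
            return servlet.doPost(path, body);
        }
        return servlet.doGet(path);
    }
}

/** Example application class holding the "business logic". */
class HelloServlet implements MiniServlet {
    @Override
    public String doGet(String path) {
        return "GET " + path;
    }

    @Override
    public String doPost(String path, String body) {
        return "POST " + path + " " + body;
    }
}
```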
JSP Compiler
According to the specification, a JSP is ultimately compiled into a servlet for execution, so JSP files must be compiled following the specification. The JSP compiler is essentially a translator for JSP syntax: it processes the file according to the JSP grammar.
A Web container basically consists of the components described above; implementing each of these modules yields a Web container that can run your Web applications.
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
How to design a usable Web container