Over the last two months I have spent my spare time writing Zaver, a high-performance web server for Linux. The main framework and basic functionality are complete, and more advanced features will be added over time; the code is on GitHub. Zaver's architecture tries to stay as close to industry practice as possible, unlike the toy servers in textbooks that drop many things a real server needs in order to teach the principles. In this article I will walk through Zaver's design step by step, along with the difficulties I ran into during development and the corresponding solutions.
Why reinvent the wheel?
Almost everyone deals with web servers more or less every day; the best-known ones are Apache httpd, Nginx, and IIS. This software runs on countless machines and provides us with stable service: when you type a URL into your browser, a web server somewhere sends the data back to the browser, which then presents it to the user. Since there are so many ready-made, mature, and stable web servers, why reinvent the wheel? I think the reasons are as follows:
Consolidate the fundamentals. A good developer must have a solid foundation, and building things from scratch is a great way to get one. Learning compilers? Read the textbook and write one. Learning operating systems? Write a prototype. In this field, you only truly understand something once you have built it yourself. I am now learning network programming, so I am writing a server.
Implement new functionality. Mature software targets the needs of the general public and may not cover your particular use case, so sometimes the only option is to implement that special need yourself. Nginx does quite well in this regard: it provides user-defined modules so users can add exactly the functionality they need.
Help beginners grasp the architecture of mature software cheaply. Take Nginx: the code is beautiful, but fully understanding its architecture and its custom data structures requires a considerable amount of reference material. Those structures are designed to improve the software's scalability and efficiency, are not essential to a high-concurrency server, and easily confuse beginners. Zaver shows, in the least code possible, what a high-concurrency server should look like. Its performance does not match Nginx's, but neither does it have Nginx's complexity: the server architecture is completely exposed to the reader.
The textbook server
When learning network programming, the first example is usually a TCP echo server. The idea: the server listens on a port, calls accept to wait for a client to connect; when a client connects, accept returns an fd (file descriptor); the server reads from the fd, writes the same data back to it, and then calls accept again. A very good code implementation can be found on the web (link here). If the program is unclear, download it, compile it locally, and test it with telnet: whatever you type in telnet is echoed straight back. If you have never been exposed to network programming before, you may suddenly realize that a browser fetching a URL and displaying information on screen works on exactly the same principle!
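The echo server described above can be sketched in C roughly as follows; the function names (`echo_connection`, `run_echo_server`) are mine, not from the implementation linked above:

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Echo everything read from fd back to fd until the peer closes.
   Returns total bytes echoed, or -1 on error. */
long echo_connection(int fd) {
    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) {                 /* write() may be partial */
            ssize_t w = write(fd, buf + off, n - off);
            if (w < 0) return -1;
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* The textbook accept loop: one client at a time. */
int run_echo_server(int port) {
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) return -1;
    if (listen(listenfd, 16) < 0) return -1;
    for (;;) {
        int connfd = accept(listenfd, NULL, NULL);
        if (connfd < 0) continue;
        echo_connection(connfd);          /* blocks here: no other client is served */
        close(connfd);
    }
}
```

Note the comment in the accept loop: while `echo_connection` runs, no other client can be accepted, which is exactly the limitation discussed below.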
Once you understand how the echo server works, extending it into a web server is very easy, because HTTP is built on top of TCP; all that is added is protocol parsing. After the TCP connection is established, the client sends an HTTP request header and (optionally) a body; the server parses the header after receiving the data and sends back the corresponding data according to the information in the header; the browser displays the information to the user; and one request is complete.
This is the standard routine some books use to teach network programming. For example, Computer Systems: A Programmer's Perspective (CSAPP) implements a simple server along these lines in its networking chapter (code implementation here). It is very short and worth reading; in particular, it serves both static and dynamic content. It is not efficient, but it achieves the teaching goal. The book later optimizes this server with the event-driven approach discussed below.
Although this program works properly, it is completely unusable in industry, because while the server is processing one client's request it cannot accept any other client. That means the program cannot serve two users who want the echo service at the same time, which is intolerable: imagine using the service and being told that someone else is using it and you must wait for them to finish before your turn.
An improved scheme was then proposed: fork after accept, with the parent process continuing to accept while the child process handles the fd. This is also a standard example in some textbooks (sample code here). On the surface, this program solves the single-client problem above, but it still cannot be used for high-concurrency industrial work, mainly for the following reasons:
Forking for every connection costs too much. Any operating-systems book will say that threads can be understood as lightweight processes, so where exactly does the extra weight of a process come from? Chapter 3 of Linux Kernel Development describes what the system does on a fork call. The address space is copy-on-write, so it does not add overhead. But one operation does copy: the parent process's page table. That is why creating a process on Linux is much more expensive than creating a thread, since all threads of a process share one page table. (Think about why the address space is COW but the page table is not.)
The scheduler comes under too much pressure. When concurrency rises, there are thousands of processes in the system, and a considerable amount of time is spent deciding which process runs next and on context switching, which is not worth it.
Under heavy load, many processes consume too much memory: with processes, each connection has its own address space; even with threads, each connection occupies an independent stack. In addition, parent and child processes need IPC, and under high concurrency the overhead of that IPC cannot be ignored.
Switching to threads solves the fork overhead, but the scheduler and memory problems remain. So processes and threads are essentially the same here; this is known as the process-per-connection model. Because it cannot handle high concurrency, the industry does not use it.
An obvious improvement is a thread pool: with a fixed number of threads, the problems above disappear. The basic architecture is a loop that accepts connections and assigns each one to a thread in the pool; when a thread finishes, it goes on to handle another connection. It looks like a very good solution, but in reality many connections are long-lived (multiple exchanges over one TCP connection). A thread receives a task, processes the first batch of data, and then calls read again, but who knows when the peer will send new data. The thread is now blocked in that read (fds are blocking by default: if there is no data on the fd, read blocks the thread) and can do nothing. With n threads, the (n+1)-th long connection cannot be handled at all.
What to do? We found that the problem is threads blocking in read, so the solution is to replace blocking I/O with non-blocking I/O. Now read returns data if any is available; if there is no readable data, it returns -1 and sets errno to EAGAIN, meaning "try the read again when data arrives next time" (see man 2 read).
This raises a question: how does the process know when data has arrived on an fd and it can read? This brings us to a key concept: event-driven programming, or the event loop.
Event-driven
What if there were a function that told us when an fd became readable, instead of us calling read over and over? Wouldn't that solve the problem above? This is called event-driven, and on Linux it can be implemented with the I/O multiplexing functions select/poll/epoll (man 7 epoll). Because we want to keep being told which fds are readable, we put this function in a loop, which is called the event loop. A sample looks like this:

```c
while (!done) {
    int timeout_ms = max(1000, getNextTimedCallback());
    int retval = epoll_wait(epds, events, maxevents, timeout_ms);
    if (retval < 0) {
        /* handle error */
    } else {
        /* handle expired timers */
        if (retval > 0) {
            /* handle I/O events */
        }
    }
}
```
Inside this while loop, the call to epoll_wait blocks the process until one of the events registered on the fds in epoll occurs.
There is a very good example of how epoll is used here.
Note that select/poll are not scalable: their complexity is O(n), while epoll's is O(1). On Linux, industrial practice is to use epoll (other platforms have their own APIs, e.g. kqueue on FreeBSD/macOS) to learn which fds have events. As for why epoll is more efficient than the other two, see the reference here.
Event-driven design is the key to implementing high-performance servers; Nginx, lighttpd, Tornado, and Node.js are all event-driven implementations.
Zaver
Combining the discussion above, we arrive at the solution: event loop + non-blocking I/O + thread pool, which is also Zaver's main architecture (a synchronous event loop with non-blocking I/O is also known as the Reactor model).
The event loop is used for event notification: if listenfd is readable, call accept and add the newly created fd to epoll; if it is an ordinary connection fd, put it on a producer-consumer queue for the worker threads to take.
The thread pool is used for computation: a worker takes an fd from the producer-consumer queue as its input, reads until EAGAIN, saves the current processing state (a state machine), and hands the fd back to the event loop to wait for the next read/write event.
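The producer-consumer queue between the event loop and the workers might look roughly like this pthreads sketch; the names and the bounded-array design are my assumptions, not Zaver's actual code:

```c
#include <pthread.h>

#define QCAP 1024

/* A minimal bounded producer-consumer queue of fds: the event loop
   pushes ready fds, worker threads pop them. */
typedef struct {
    int items[QCAP];
    int head, tail, count;
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
} fd_queue;

void fdq_init(fd_queue *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->mu, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

void fdq_push(fd_queue *q, int fd) {        /* called by the event loop */
    pthread_mutex_lock(&q->mu);
    while (q->count == QCAP)                /* wait while the queue is full */
        pthread_cond_wait(&q->not_full, &q->mu);
    q->items[q->tail] = fd;
    q->tail = (q->tail + 1) % QCAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

int fdq_pop(fd_queue *q) {                  /* called by worker threads */
    pthread_mutex_lock(&q->mu);
    while (q->count == 0)                   /* wait while the queue is empty */
        pthread_cond_wait(&q->not_empty, &q->mu);
    int fd = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->mu);
    return fd;
}
```

The condition variables make workers sleep when there is no work, instead of spinning.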
Problems encountered during development
The above describes how Zaver works; below I summarize some of the difficulties I encountered during development and their solutions.
Recording the difficulties you encounter during development is a very good habit. If you just Google a solution, copy it over, and keep no record and do no thinking, then the next time you hit the same problem you will repeat the whole search. We want to be creators of code, not its "porters". Keeping records and regularly reviewing the problems you have met makes you grow faster.
- If an fd is put on the producer-consumer queue and the worker that picks it up has not finished reading it, the fd is still readable because its data has not been consumed. The next event loop iteration then returns this fd again and hands it to another thread. How should this be handled?
A: epoll has two working modes, one called edge triggered (ET) and the other level triggered (LT). The names are quite vivid: ET notifies on a change of state (e.g. the rising edge from low to high), while LT notifies while in a state (e.g. as long as the level is low). Correspondingly, in epoll, ET means you are notified only when new data arrives (a state change), while LT keeps notifying you as long as there is data to read.
A concrete example: suppose 2 KB of data arrives on an fd and the application reads only 1 KB. Under ET, the next epoll_wait will not return this fd until new data arrives; under LT, every epoll_wait returns the fd as long as it still has data to read. So in Zaver we use epoll in ET mode, and the usage pattern is fixed: set the fd to non-blocking, and when epoll reports it readable, read in a loop until EAGAIN (if read returns 0, the remote end has closed the connection).
- When the server keeps a long connection with the browser and the browser is suddenly closed, how should the server handle the socket?
A: The fd shows up as a readable event in the event loop and is assigned to a thread; that thread's read returns 0, meaning the other side has closed the fd, so the server calls close as well.
- Since the socket fd is set to non-blocking, if some packets arrive late, read returns -1 with errno set to EAGAIN, and we wait for the next read. This is a problem blocking reads never have: we must save the data read so far and maintain a state recording whether more is needed. For example, suppose we have read this much of the request line:

GET /index.html HTT

With blocking I/O we could simply keep reading, but with non-blocking I/O we must maintain this state so that the next read continues with 'P'; otherwise the HTTP protocol parse goes wrong.
A: The solution is to maintain state machines: one for parsing the request header and another for parsing the body. Zaver's state machine borrows from Nginx's header-parsing implementation, with some streamlining and design changes of my own.
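To make the idea concrete, here is a toy incremental state machine that only detects the "\r\n\r\n" ending the request head, across arbitrarily split chunks; Zaver's real parser tracks far more states than this:

```c
#include <stddef.h>

/* States while scanning for the CRLFCRLF that terminates an HTTP
   request head. The state survives between chunks, so a head split
   across many non-blocking reads is still parsed correctly. */
typedef enum { S_START, S_CR, S_CRLF, S_CRLFCR, S_DONE } hs_state;

/* Feed one chunk of bytes; returns 1 once the full head has been seen. */
int feed_chunk(hs_state *st, const char *buf, size_t len) {
    for (size_t i = 0; i < len && *st != S_DONE; i++) {
        char c = buf[i];
        switch (*st) {
        case S_START:  *st = (c == '\r') ? S_CR : S_START; break;
        case S_CR:     *st = (c == '\n') ? S_CRLF : (c == '\r' ? S_CR : S_START); break;
        case S_CRLF:   *st = (c == '\r') ? S_CRLFCR : S_START; break;
        case S_CRLFCR: *st = (c == '\n') ? S_DONE : (c == '\r' ? S_CR : S_START); break;
        default: break;
        }
    }
    return *st == S_DONE;
}
```

Feeding "GET /index.html HTT" and later "P/1.1\r\n..." works because the enum value carries the progress between calls.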
- How to parse the headers cleanly?
A: There are many HTTP headers, so there will be many parsing functions; for example, the If-Modified-Since header and the Connection header are handled by two different functions. The design here should therefore be modular and easy to extend, so developers can easily modify or define parsing for different headers. Zaver borrows Nginx's method: define an array of structs, where each struct holds a key and the corresponding function-pointer hook; when the parsed header key matches a struct's key, call its hook. The definition looks like this:
```c
zv_http_header_handle_t zv_http_headers_in[] = {
    {"Host",              zv_http_process_ignore},
    {"Connection",        zv_http_process_connection},
    {"If-Modified-Since", zv_http_process_if_modified_since},
    ...
    {"",                  zv_http_process_ignore}
};
```
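Such a table is then consulted with a simple lookup loop. The sketch below uses hypothetical names and trivial handlers to show the shape of the dispatch; it is not Zaver's actual code:

```c
#include <string.h>
#include <strings.h>

/* Each entry pairs a header name with a handler (the "hook"). */
typedef void (*header_handler)(const char *value, int *out);

static void handle_ignore(const char *v, int *out)     { (void)v; *out = 0; }
static void handle_connection(const char *v, int *out) { (void)v; *out = 1; }

typedef struct {
    const char *name;
    header_handler handler;
} header_entry;

static header_entry handlers[] = {
    {"Host",       handle_ignore},
    {"Connection", handle_connection},
    {"",           handle_ignore},   /* sentinel: empty name ends the table */
};

/* Dispatch one parsed header; returns what the handler wrote,
   or -1 if no entry matched before the sentinel. */
int dispatch_header(const char *name, const char *value) {
    int out = -1;
    for (header_entry *e = handlers; e->name[0] != '\0'; e++) {
        if (strcasecmp(e->name, name) == 0) {   /* header names are case-insensitive */
            e->handler(value, &out);
            return out;
        }
    }
    return -1;
}
```

Adding support for a new header is then just one new function and one new table row.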
- How to store the parsed headers?
A: Zaver keeps all headers in a linked list. The linked-list implementation borrows from the Linux kernel's doubly linked list (list_head), which provides a generic doubly linked list data structure and is well worth reading; I made some simplifications and changes (code here).
- How to stress-test the server?
A: There are many mature tools for this, such as http_load, webbench, ab, and so on. I chose webbench for a simple reason: it uses fork to simulate clients and is only a few hundred lines of code, so when a problem shows up I can go straight to the webbench source and pinpoint exactly which operation made the server hang. Also, because of a problem mentioned below, I read the webbench source carefully, and I highly recommend it for C beginners: within a few hundred lines it covers command-line argument parsing, forking processes, parent-child communication over pipes, registering signal handlers, tricks for building HTTP protocol headers, and other programming techniques.
- With the webbench test, the server hangs at the end of the test
A: Strangely, no matter how long the test ran or how high the concurrency, the server always hung right at the end of the webbench run, so I guessed something must "happen" at that moment.
To locate the faulty code I started with printf-style logging, which the facts later proved was not a good approach here: in a multithreaded environment, locating an error by reading logs is quite difficult. The log output finally placed the error in the write that sends the requested file back over the socket; that is, the system call itself "hung". But shouldn't a failed write just return -1? The only explanation was that the process had received a signal, and the signal had terminated it. So I re-ran the test under strace, and the strace log revealed the problem: the process received SIGPIPE during write, and the default handler for SIGPIPE terminates the process. SIGPIPE is raised because the other side has already closed the socket while the process is still writing to it. So I guessed that when the test time is up, webbench simply closes all sockets without waiting for the server's data. Reading the webbench source confirmed it: webbench sets a timer; within the normal test time it reads the data the server returns and closes normally, but when the timer fires it closes all sockets directly without reading the server's response. That makes Zaver write to a socket already closed by the peer, and the system raises SIGPIPE.
The solution is also very simple: set the SIGPIPE signal handler to SIG_IGN, meaning the signal is ignored.
Shortcomings
At present, Zaver still has many things to improve, for example:
Memory is currently allocated with plain malloc; it should move to a memory pool.
Dynamic content is not supported; I will consider adding PHP support later.
HTTP/1.1 is fairly complex; currently only parsing of a few major headers (keep-alive, browser caching) is implemented.
Timing out inactive connections is not implemented yet.
...
Summary
This article introduced Zaver, a simply structured HTTP server that supports high concurrency.
The basic architecture is the event loop + non-blocking I/O + thread pool.
Zaver's code style follows Nginx's, so it is quite readable. In addition, Zaver provides configuration-file and command-line argument parsing, as well as a complete Makefile and source tree layout, which can also help any C beginner learn how a project is put together.
Currently, my wiki is served by Zaver.
Resources
[1] https://github.com/zyearn/zaver
[2] http://nginx.org/en/
[3] Linux Multithreaded Server Programming
[4] http://www.martinbroadhurst.com/server-examples.html
[5] http://berb.github.io/diploma-thesis/original/index.html
[6] RFC 2616
[7] https://banu.com/blog/2/how-to-use-epoll-a-complete-example-in-c/
[8] Unix Network Programming, Volume 1: The Sockets Networking API (3rd Edition)