Learn about coroutines & the 10 MB default thread stack

Source: Internet
Author: User
Tags: epoll

Read this article first: http://blog.csdn.net/qq910894904/article/details/41699541

Event-driven asynchronous servers, with Nginx as their representative, are sweeping the world. Does that mean the event-driven model is the final form of the server-side model?
We can take a closer look at the event-driven programming model.
The architecture of event-driven programming centers on a pre-designed event loop that constantly checks what information currently needs processing and runs a trigger function based on it. This external information may come from a file in a folder, from keyboard or mouse actions, or from a timer event. The trigger function can be a system default or a user-registered callback function.

Event-driven programming emphasizes flexibility and asynchrony. Many GUI frameworks (such as MFC on Windows and the Android GUI framework), ZooKeeper's watcher mechanism, and so on use event-driven mechanisms, and more event-driven designs will keep appearing.

Event-driven programming is a single-threaded way of thinking, characterized by asynchrony plus callbacks.

Coroutines are also single-threaded, but they let code that would otherwise have to be written in the inhuman asynchronous-plus-callback style be written in a seemingly synchronous way. The key is so-called non-preemptive cooperation: coroutines hand control back and forth in a push-pull fashion.
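As a minimal sketch of this "asynchronous under the hood, synchronous on the surface" idea, the following hypothetical Python snippet contrasts a callback-style fetch with the same flow written as a coroutine; the function names, URL, and delay are invented for illustration.

    import asyncio

    # Callback style: the continuation is passed in explicitly
    # (shown only for contrast; not invoked below).
    def fetch_with_callback(loop, url, on_done):
        loop.call_later(0.1, on_done, f"response from {url}")  # pretend the I/O takes 100 ms

    # Coroutine style: the same flow reads like synchronous code.
    async def fetch(url):
        await asyncio.sleep(0.1)   # suspends here instead of blocking the thread
        return f"response from {url}"

    async def main():
        body = await fetch("http://example.com")  # looks synchronous, runs asynchronously
        print(body)

    asyncio.run(main())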

Summary

Benefits of coroutines:

    • Cross-platform
    • Cross-architecture
    • No overhead for thread context switching
    • No locking or synchronization overhead needed for atomic operations
    • Easy switching of control flow and simplified programming model
    • High concurrency + high scalability + low cost: a single CPU can support tens of thousands of coroutines without trouble, which makes them well suited to high-concurrency workloads (see the sketch after this list).
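As a rough illustration of that low per-coroutine cost, the hypothetical sketch below spawns 10,000 asyncio tasks in one thread; the task count and sleep time are arbitrary.

    import asyncio

    async def worker(i):
        await asyncio.sleep(0.01)   # simulate a tiny amount of I/O wait
        return i

    async def main():
        # 10,000 concurrent coroutines in a single thread; each one only costs
        # a small Python object plus its suspended frame, not a kernel stack.
        results = await asyncio.gather(*(worker(i) for i in range(10_000)))
        print(len(results))

    asyncio.run(main())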

Disadvantages:

    • Cannot exploit multi-core resources: a coroutine is essentially single-threaded, so it cannot use multiple cores of a CPU at the same time; to exploit multiple CPUs, coroutines must be combined with processes. Of course, for most of the applications we write day to day this is unnecessary, except for CPU-intensive ones.
    • Blocking operations (such as blocking I/O) stall the entire program: as with event-driven programming, this can be solved by using asynchronous I/O.

https://www.zhihu.com/question/32218874

The convenience of coroutines is quite considerable, because "the state is represented by the code itself rather than by maintaining a data structure to represent it."

Tim Shen, link: https://www.zhihu.com/question/32218874/answer/56255707 (copyright belongs to the author, please contact the author for authorization). Let me tell a bit of history. The early Unix era advocated synchronous programming; nobody thought much about performance back then, the point was to get things working first. So the first approach was to spawn processes and wire them together with pipes: very intuitive, I push things into the pipe and you pull them out at the other end. Then somebody noticed that opening a thousand-odd processes would grind the machine to a halt. What to do? So people kept a single process holding a pile of open network connections and files and used the select() system call to monitor I/O events on all of them at once, handling whichever arrived first. But the performance was still mediocre, no good. People then looked for performance on both paths. On the process path they moved to multithreading, and it still was not enough: threads also eat kernel resources and do not scale. Linux finally gave in and added the epoll system call (I do not know whether it or BSD's kqueue came first), essentially a red-black-tree-based improvement on select. Now people were happy: it really is fast, opening a thousand-odd connections is trivial; you monitor events on all of them at once, determine the connection ID and the event type, and dispatch to different callback functions. Perhaps it was the market: the Internet started to boom right then, and better performance meant more money, so this efficient interface was bound to win. The pipe is not convenient with it? Never mind, we are hard-working, even a bit masochistic, and not afraid of maintaining state by hand. Open too many threads and in the end you also have to maintain state by hand. Developers unfamiliar with the operating system simply learned what to use and how to optimize multithreading. That optimized multithreading is now called coroutines; some variants come with a scheduler (such as Go's user-space scheduler) and some do not (such as yield), depending on how they are used. The memory overhead is still larger than asynchronous callbacks (one stack per coroutine, versus one event loop for asynchronous callbacks), but memory is cheap now and is no longer the bottleneck. What people care about most is "never block": do everything possible to keep the CPU spinning so that concurrency can go up. Spawning coroutines everywhere (because they are cheaper than threads) achieves exactly that: opening 100,000 coroutines does not blow up memory, and running 24 of them can saturate the CPU. So that advantage of asynchronous callbacks is gone.
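The "monitor events, find the connection, dispatch to a callback" pattern described above looks roughly like this minimal Python sketch built on select.epoll (Linux only); the address, port, and echo behaviour are made up for illustration.

    import select
    import socket

    # Minimal epoll dispatch loop: register sockets, wait for readiness
    # events, look up the connection by fd, and call its handler.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 8080))
    listener.listen(128)
    listener.setblocking(False)

    ep = select.epoll()
    ep.register(listener.fileno(), select.EPOLLIN)
    connections = {}                      # fd -> socket

    def on_readable(conn):
        data = conn.recv(4096)
        if data:
            conn.sendall(data)            # echo the data back
        else:                             # peer closed the connection
            ep.unregister(conn.fileno())
            del connections[conn.fileno()]
            conn.close()

    while True:
        for fd, events in ep.poll(1):
            if fd == listener.fileno():   # new connection: accept and register it
                conn, _ = listener.accept()
                conn.setblocking(False)
                connections[conn.fileno()] = conn
                ep.register(conn.fileno(), select.EPOLLIN)
            elif events & select.EPOLLIN:
                on_readable(connections[fd])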

I've noticed that people who work on application development and Web services may not be that familiar with the lower layers, and sometimes they take detours. Something as simple as kqueue/epoll has been dressed up by developers into a whole school of asynchronous event-callback programming, with books written about it and libraries built on it (recently "reactive" as well). Naturally the industry specializes, but if you spend a little effort on the underlying concepts, learning or inventing the upper-layer things becomes much quicker and less painful.

As for the trend: it is mostly fashion; these variations on the trick can be picked up in a few hours of study. If you worry about spending a lot of energy learning a "technology", fear that it may turn out useless, and therefore keep asking "is this the trend?", it probably means your foundations are weak; just go build the foundations.

I write C++, and I've found that the only place I use callbacks/closures/lambdas is when I have a function but want to leave a hole in it for the user to fill. It doesn't come up that often.
Note: before I wanted to learn Java I spent a lot of time figuring out whether it was the trend. Looking back now, that was really indecisive; you only recognize the trend after you have learned the thing, haha.

------Another person's answer---------

Sometimes it feels as if libraries such as gevent (a Python library) exist because they are forced on us by the Python GIL; if native thread support were good enough, the need for coroutines might not be that great.

Coroutines first saw successful use in the field of high-performance computing, where cooperative scheduling, compared with preemptive scheduling, trades fairness for throughput.

When the Internet industry faced the C10K problem, the threading approach could not carry that much concurrency, and the solution was the epoll()-style event loop. Nginx rode this wave and smoothly displaced Apache. The developer community, shocked by Nginx's performance, produced many application frameworks built on the event loop, such as Tornado and Node.js, which indeed posted higher benchmark numbers. The Python/Ruby communities, worn down by the GIL and with little real concurrency support, found the event loop to be a kind of liberation for concurrency.

However, the asynchronous control flow of the event loop is not developer-friendly. The MySQL/memcache calls that are ubiquitous in business code quickly balloon into a lump of callback hell. At this point the community rediscovered coroutines: implement context switching in user space and hide the epoll() event loop behind it, at a cost that is not high: each coroutine gets a user-space stack, which replaces manual state management. It seems to capture the benefits of both the event loop and thread-style synchronous control flow: the high performance of epoll() and the ease of development. Even old synchronous code can, via monkey patching, gain asynchronous performance almost seamlessly. It looks perfect.
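As a rough sketch of that combination (assuming gevent is installed; the URL is a placeholder), after monkey patching, plain blocking-style code in each greenlet is in fact multiplexed over gevent's event loop:

    from gevent import monkey
    monkey.patch_all()   # replace blocking stdlib calls (socket, time, ...) with cooperative ones

    import gevent
    from urllib.request import urlopen

    def fetch(url):
        # Reads like ordinary synchronous code, but the patched socket
        # yields to gevent's event loop whenever it would block.
        return len(urlopen(url).read())

    urls = ["http://example.com"] * 3
    jobs = [gevent.spawn(fetch, u) for u in urls]
    gevent.joinall(jobs)
    print([job.value for job in jobs])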

Note: on callback hell, see: http://www.infoq.com/cn/articles/nodejs-callback-hell/
When Node.js needs to execute asynchronous logic sequentially, it generally uses continuation-passing style: the subsequent logic is wrapped in a callback function passed as a parameter to the initiating function, nested layer by layer. This style improves CPU utilization and reduces waiting time, but when there are many sequential steps it hurts the readability of the code, and the result becomes very hard to modify and maintain. Because of how such code looks, it is usually called "callback hell" or the "pyramid of doom"; this article calls it the callback pit: the more nesting, the deeper the pit.

Note: on monkey patching, see: http://www.tuicool.com/articles/2aIZZb

A monkey patch is a dynamic replacement performed at run time, usually at startup. (It feels mostly like a Python trick.) Anyone who has used gevent knows this: calling gevent.monkey.patch_all() at the very beginning patches thread, socket, and other standard-library modules, so that sockets used later behave exactly as usual, with no code changes, yet become non-blocking. When I worked on a game server, many files contained import json; later I found that ujson is many times faster than the built-in json, so the question arose: do I really have to change import json to import ujson as json in dozens of files? No; you just monkey patch once at process startup. It affects the whole process space, and a module in the same process space is only executed once. The code is shown below. Running main.py prints 'ujson' for both main.py and sub.py, which shows that json imported afterwards has also been patched. Finally, be careful not to simply replace it with json = ujson.
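A cleaned-up version of that main.py / sub.py example (it assumes the ujson package is installed):

    # main.py
    import json
    import ujson

    def monkey_patch_json():
        # Patch the functions on the json module object rather than rebinding the name.
        json.__name__ = 'ujson'
        json.dumps = ujson.dumps
        json.loads = ujson.loads

    monkey_patch_json()
    print('main.py', json.__name__)

    import sub

    # sub.py
    import json
    print('sub.py', json.__name__)   # also prints 'ujson': the patch is process-wide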

However, after going around this whole loop, how much difference is there, really, between coroutines and native threads?

1. A user-space stack makes creating "lightweight threads" even lighter;
2. A cooperative user-space scheduler means fewer thread context switches;
3. Synchronization primitives such as mutexes are re-implemented in user space;

The creation cost of a coroutine is smaller, but creation cost can be bypassed entirely with a thread pool, and a thread pool offers finer-grained control; the advantage of coroutines here lies more in the development model than in performance. In addition, the name "lightweight thread" is somewhat misleading: a coroutine is a user-space thread, and the context it needs is almost the same as a system thread's. If what limits the scale of system threads is memory (a system thread's stack reserves roughly 10 MB of virtual memory, so the number of threads is bounded by the virtual address space), then a user-space thread whose stack is provisioned on the same scale needs the same amount of memory. Note: the default thread stack size can be viewed with ulimit -s.
$ ulimit -s
10240

That is 10 MB. It can be changed as follows:

Viewing and changing the default thread stack size on Linux:
    1. ulimit -s shows the default stack size; the default value 10240 (KB) is 10 MB.
    2. ulimit -s <value> changes the stack size temporarily for the current shell, e.g. ulimit -s 102400 sets it to 100 MB.
    3. Adding ulimit -s 102400 to /etc/rc.local applies the setting at boot.
    4. The stack size can also be set in /etc/security/limits.conf:
       #<domain> <type> <item> <value>
       *  soft  stack  102400
After the change, ulimit -s shows 102400, i.e. 100 MB.
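For comparison, in Python the stack size requested for newly created threads can be lowered per process with threading.stack_size(); a minimal sketch (the 256 KB figure is just an example):

    import threading

    def worker():
        print("running in", threading.current_thread().name)

    # Request a 256 KB stack for threads created after this call
    # (subject to platform minimums) instead of the system default reservation.
    threading.stack_size(256 * 1024)

    t = threading.Thread(target=worker)
    t.start()
    t.join()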
The advantage of cooperative scheduling over preemptive scheduling is that context switching is cheaper (though is the difference really significant?) and it is friendlier to the cache, but it also gives up the native thread's notion of priority: if there is a longer computation task, it delays the response of I/O tasks, whereas the kernel scheduler always prioritizes I/O tasks so that they respond as quickly as possible. Moreover, a single-threaded coroutine scheme cannot fundamentally prevent blocking: file operations, page faults, and the like still affect latency.

One advantage of the event-loop scheme is that it avoids locks, and locks are the root of all evil. The coroutine scheme, being built on top of the event loop, appears to inherit that lock-free advantage. But it does not: beyond the boundaries of context switches there is no critical-section guarantee, so the places that need locks still need locks (see the sketch below). There are differences, but the amount of state to maintain is not smaller. If the runtime's support for system threads is good enough, the overall benefit a business system gets from coroutines is not necessarily greater than from a thread pool. The "high concurrency" commonly talked about in our industry often only amounts to a few thousand QPS, and QPS measures throughput rather than concurrency (a concurrency of 1k means responding to 1k connections at the same time, while QPS counts requests per second, which can be queued and are not necessarily "concurrent"); that is nothing a thread pool cannot handle. For a GIL runtime like Python, though, coroutines do bring a marked performance improvement, but then the bottleneck is the GIL, not threads.
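To illustrate that coroutines still need locks whenever a critical section spans a suspension point, here is a minimal hypothetical asyncio sketch: without the lock, the read-modify-write of balance interleaves across the await and loses updates.

    import asyncio

    balance = 0
    lock = asyncio.Lock()

    async def deposit(amount):
        global balance
        async with lock:            # remove the lock and some deposits get lost
            current = balance
            await asyncio.sleep(0)  # suspension point: another coroutine may run here
            balance = current + amount

    async def main():
        await asyncio.gather(*(deposit(1) for _ in range(1000)))
        print(balance)              # 1000 with the lock; usually far less without it

    asyncio.run(main())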

As for businesses oriented toward sheer connection volume, they generally carry little per-connection state (push services, for example), so callback hell stays basically under control; coroutines there are still easier to program with than a raw event loop, but the benefit is not dramatic.

Finally, an attempt to summarize my personal view:
Coroutines are not a trend; they are a useful complement for dealing with problems that history has left behind.

Applicable scenarios:
    • High-performance computing, trading fairness for throughput;
    • I/O-bound tasks, reducing the idle time spent waiting on I/O, which is in fact consistent with the advantage in high-performance computing;
    • Generator-style streaming computation (see the sketch after this list);
    • Eliminating callback hell, using a synchronous model to cut development cost while keeping the benefit of more flexible control flow, such as issuing three requests at the same time; with frugal use of stacks, the "lightweight" advantage can be fully exploited;
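As a trivial illustration of generator-style streaming, here is a hypothetical Python pipeline in which each stage pulls lazily from the previous one:

    def read_numbers(limit):
        for i in range(limit):       # pretend these arrive from a file or socket
            yield i

    def squares(numbers):
        for n in numbers:
            yield n * n

    # Each value flows through the whole pipeline before the next is produced;
    # nothing is buffered, and no callbacks are involved.
    total = sum(squares(read_numbers(10)))
    print(total)   # 285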
But it's not a panacea:
    • If stack usage is unchecked, the memory consumed is the same as with system threads, and memory management can even be worse (a system thread can adjust its virtual memory dynamically, while the segmented-stack scheme for user threads has serious jitter problems, and a continuous-stack scheme that is not managed carefully also jitters; avoiding the jitter means trading space for time, and how much heuristic tuning has the kernel already accumulated in this area?);
    • I/O-bound tasks can be mitigated to a large extent simply by sizing the thread pool; the goal is to keep the CPU busy, and although the thread pool's performance may not be perfect, it is good enough for business logic;
    • Moreover, typical Python/Ruby workloads are not strictly I/O-bound: ORM object creation, template rendering, GC, and even the interpreter itself all consume plenty of CPU; subtract the Redis and database calls from a single request and is the remaining time really that small?
    • Long CPU computations hurt user-thread scheduling's ability to keep responses fast, so the latency of a single request can be worse than the average (admittedly the concurrency may be higher); in a GIL language like Python, however, this is not a disadvantage and may even beat the GIL's scheduling: gevent at least knows the priority of each I/O task, whereas GIL scheduling is de facto FIFO;

References

- Xiao-Feng Li: Thread mapping: 1:1 vs M:N
- http://thread.gmane.org/gmane.comp.lang.rust.devel/6479
