Erlang: RabbitMQ source code analysis 3. In-depth analysis of supervisor and supervisor2

Source: Internet
Author: User

supervisor is another common behaviour in Erlang/OTP. It is used to build a supervision tree for process monitoring and fault recovery.

RabbitMQ implements its own variant, supervisor2. We analyze the implementation of both, and the differences between them, from the source code perspective.


First, we introduce some basic concepts of the supervisor. Suppose node_manager_sup is a supervisor: its init/1 callback defines the parameters of the supervisor itself and of its children.

Parameters:

1. Restart Strategy:

Strategy must be one of simple_one_for_one, one_for_one, one_for_all, and rest_for_one.

simple_one_for_one means the supervisor does not start any children when it starts. Children are added dynamically and their number is unlimited, but they all share a single child specification, so only one type of child is possible.

one_for_one means the supervisor starts all children when it starts. When a child crashes, the supervisor restarts only that child, without affecting the others.

one_for_all means the supervisor starts all children when it starts. When a child crashes, the supervisor restarts all of its children.

rest_for_one means the supervisor starts all children when it starts. When a child crashes, the supervisor restarts that child and all children declared after it.


2. intensity and period: if more than intensity restarts occur within period seconds, the supervisor terminates all of its children and then itself.
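The two supervisor parameters above can be sketched as pure functions. This is an illustration only, not the OTP source: the module name sup_flags_sketch and both function names are made up for this example, and timestamps are plain integers in seconds.

```erlang
-module(sup_flags_sketch).
-export([to_restart/3, add_restart/4]).

%% Which children are restarted when Crashed exits.
%% Ids lists the child ids in start order.
to_restart(one_for_one, Crashed, _Ids) -> [Crashed];
to_restart(simple_one_for_one, Crashed, _Ids) -> [Crashed];
to_restart(one_for_all, _Crashed, Ids) -> Ids;
to_restart(rest_for_one, Crashed, Ids) ->
    %% The crashed child plus every child started after it.
    lists:dropwhile(fun(Id) -> Id =/= Crashed end, Ids).

%% Record a restart at time Now. Restarts holds the timestamps of
%% earlier restarts. Returns {ok, Recent} while the limit holds, or
%% shutdown once more than Intensity restarts fall within Period.
add_restart(Now, Restarts, Intensity, Period) ->
    Recent = [T || T <- [Now | Restarts], Now - T =< Period],
    case length(Recent) > Intensity of
        true  -> shutdown;
        false -> {ok, Recent}
    end.
```

With intensity 1 and period 5, a second restart two seconds after the first exceeds the limit, while a restart ten seconds later does not, because the older timestamp has fallen out of the window.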


Children parameters:

1. StartFunc: the function that starts the child, given as {Module, Function, Args}. On success it must return {ok, ChildPid} or {ok, ChildPid, Info}.

2. Restart: the restart policy of this child. permanent means the child is always restarted, temporary means it is never restarted, and transient means it is restarted only when it exits abnormally.

3. Shutdown: how the child process is stopped. brutal_kill means the child is killed unconditionally with exit(ChildPid, kill). A positive integer is a timeout in milliseconds: the supervisor sends exit(ChildPid, shutdown) and, if the child has not exited when the timeout expires, kills it. infinity means the supervisor sends the shutdown request and waits indefinitely without forcing a kill; it is generally used when the child is itself a supervisor.
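Putting the supervisor and child parameters together, an init/1 for node_manager_sup might look like the following minimal sketch. The child id node_manager, the numbers 5, 10 and 5000, and the stand-in worker (start_worker/0 and loop/0) are assumptions made so the example is self-contained; a real child would normally be a separate gen_server module.

```erlang
-module(node_manager_sup).
-behaviour(supervisor).

-export([start_link/0, init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Strategy, intensity, period: give up after more than
    %% 5 restarts within 10 seconds.
    SupFlags = {one_for_one, 5, 10},
    %% {Id, StartFunc, Restart, Shutdown, Type, Modules}
    Child = {node_manager,                 % hypothetical child id
             {?MODULE, start_worker, []},  % StartFunc as {M, F, A}
             permanent,                    % always restart
             5000,                         % shutdown timeout in ms
             worker,
             [?MODULE]},
    {ok, {SupFlags, [Child]}}.

%% Stand-in worker (an assumption for this sketch). It runs in the
%% supervisor process, so spawn_link links the child to it.
start_worker() ->
    Pid = spawn_link(fun loop/0),
    {ok, Pid}.

loop() ->
    receive
        stop -> ok
    end.
```

After node_manager_sup:start_link(), calling supervisor:which_children(node_manager_sup) should list the single node_manager child with its pid.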


The supervisor module is itself implemented as a gen_server.

Supervisor2 does not use gen_server2, but the original gen_server.


Since it is a gen_server, the supervisor's entry point start_link ends up in the module's internal gen_server init function, which does the following:

1. Check all parameters of the supervisor.

2. If it is not simple_one_for_one, start all children.


When a child process exits, the supervisor receives an 'EXIT' message which, following gen_server behaviour, is handled by handle_info({'EXIT', Pid, Reason}, State). The child is then restarted according to its Restart type.

Before restarting, the supervisor records the restart. If the number of restarts within period exceeds intensity, it terminates all children and then itself.

If the restart attempt itself fails, the supervisor tries again: it sends itself a restart request, which is handled in handle_cast.
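The decision of whether an exited child is restarted can be sketched as a small function of its Restart type and exit reason. This is a simplified illustration rather than the code in supervisor.erl; the module and function names are invented for the example. It relies on the documented OTP rule that normal, shutdown and {shutdown, Term} count as normal exits for transient children.

```erlang
-module(restart_policy_sketch).
-export([should_restart/2]).

%% should_restart(RestartType, ExitReason) -> boolean()
should_restart(permanent, _Reason)       -> true;   % always restart
should_restart(temporary, _Reason)       -> false;  % never restart
should_restart(transient, normal)        -> false;
should_restart(transient, shutdown)      -> false;
should_restart(transient, {shutdown, _}) -> false;
should_restart(transient, _Reason)       -> true.   % abnormal exit
```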


Finally, let's look at several of the supervisor's exported functions:

start_child is implemented as a gen_server:call.

1. With simple_one_for_one, the child specification comes from the single specification stored in the supervisor, because all simple_one_for_one children are identical and none are started together with the supervisor. The arguments passed to start_child are appended to the start arguments.

2. Otherwise, start_child takes a complete child specification and starts that child.

restart_child, delete_child, terminate_child, which_children, and count_children are likewise implemented as gen_server:call.
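The simple_one_for_one case of start_child can be illustrated with a small pool supervisor. Everything here is hypothetical (the module name worker_pool_sup, the stand-in worker, and the names passed in); the point is only that the list given to supervisor:start_child/2 is appended to the argument list in the stored child specification.

```erlang
-module(worker_pool_sup).
-behaviour(supervisor).

-export([start_link/0, init/1, start_worker/1]).

start_link() ->
    supervisor:start_link(?MODULE, []).

init([]) ->
    %% One template spec; no child is started at this point.
    Template = {worker,
                {?MODULE, start_worker, []},  % args extended later
                temporary, brutal_kill, worker, [?MODULE]},
    {ok, {{simple_one_for_one, 5, 10}, [Template]}}.

%% Called as start_worker(Name) because start_child(Sup, [Name])
%% appends [Name] to the empty argument list in the template.
start_worker(Name) ->
    Pid = spawn_link(fun() -> loop(Name) end),
    {ok, Pid}.

loop(Name) ->
    receive
        {whoami, From} -> From ! {name, Name}, loop(Name);
        stop -> ok
    end.
```

For example, after {ok, Sup} = worker_pool_sup:start_link(), each call such as supervisor:start_child(Sup, [alpha]) adds one more worker of the same type. For any other strategy, the second argument would instead be a full child specification.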


Supervisor2:

Supervisor2 does not use RabbitMQ's own gen_server2 but the standard gen_server. The reason is that a supervisor receives relatively few requests (only restart, start, stop, and the like) and does not need hibernation.

Supervisor2 does not make many changes to the supervisor:

1. intrinsic: the child's Restart setting gains an intrinsic type, which is similar to transient. The difference shows when the child exits normally: with transient, the supervisor simply does not restart the child and keeps running; with intrinsic, the supervisor itself also exits, shutting down all of its other children.

2. Delay: as mentioned above, if restarts exceed intensity within period, the supervisor terminates all of its children and then itself. In supervisor2, the child's Restart value can be written as {permanent, Delay} | {transient, Delay} | {intrinsic, Delay}. With a Delay, exceeding intensity within period no longer stops the supervisor; instead it waits for the Delay time and then tries to start the child again.
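The two supervisor2 additions can be summarized in a small decision sketch. This is not taken from supervisor2.erl: the module name, function names and returned atoms are invented for illustration, and treating normal, shutdown and {shutdown, _} alike as normal exits is an assumption of this sketch.

```erlang
-module(sup2_policy_sketch).
-export([on_exit/2, on_too_many_restarts/1]).

%% What the supervisor does when a child with the given Restart
%% value exits with Reason.
on_exit({Type, _Delay}, Reason) when Type =/= temporary ->
    on_exit(Type, Reason);
on_exit(permanent, _Reason) -> restart;
on_exit(temporary, _Reason) -> drop_child;
on_exit(Type, Reason) when Type =:= transient; Type =:= intrinsic ->
    case is_normal(Reason) of
        false -> restart;
        true when Type =:= transient -> drop_child;
        true -> stop_supervisor   % intrinsic child exited normally
    end.

%% What happens once restarts exceed intensity within period.
on_too_many_restarts({_Type, Delay}) -> {retry_after, Delay};
on_too_many_restarts(_Type) -> stop_supervisor.

is_normal(normal)        -> true;
is_normal(shutdown)      -> true;
is_normal({shutdown, _}) -> true;
is_normal(_)             -> false.
```

Under these assumptions, a child declared with {transient, 5} that keeps crashing would be retried after the delay instead of taking the supervisor down, whereas a plain transient child would cause the supervisor to stop.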



