An analysis of Storm's basic framework
The main question this article sets out to answer is: in a topology we can specify the degree of parallelism of the spouts and bolts, so how does Storm automatically distribute the spouts and bolts to each server, and how does it control the CPU, disk, and other resources of the service, when a topology is submitted?
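To make the question concrete, here is a minimal sketch, assuming Storm's Java API (package names as in Storm 1.x and later; older releases used backtype.storm), of how the degree of parallelism is declared when a topology is built. MySpout and MyBolt are hypothetical placeholders for the reader's own spout and bolt classes, so the snippet only compiles once they are supplied.

```java
import org.apache.storm.Config;
import org.apache.storm.topology.TopologyBuilder;

public class ParallelismSketch {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        // MySpout / MyBolt are placeholders for the reader's own components
        builder.setSpout("word-spout", new MySpout(), 2);   // parallelism hint: 2 executors
        builder.setBolt("count-bolt", new MyBolt(), 4)      // parallelism hint: 4 executors
               .shuffleGrouping("word-spout");

        Config conf = new Config();
        conf.setNumWorkers(2);   // spread the executors over 2 worker processes (JVMs)
        // conf and builder.createTopology() are what later get handed to StormSubmitter
        // (see the submission sketch further down).
    }
}
```

The rest of the article looks at what Storm does with these numbers once the topology is submitted.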
The relationship between worker, executor and task
In Nimbus, an available worker is called a worker-slot.
Nimbus is the control core of the whole cluster; it is responsible for topology submission, runtime status monitoring, load balancing, task reassignment, and so on.
Nimbus assignment information includes the path where the topology code is located (local to Nimbus), together with the task, executor, and worker information.
A worker is uniquely identified by node + port.
The Supervisor is responsible for actually synchronizing the workers; a supervisor is referred to as a node. Synchronizing workers means responding to the task scheduling and assignments made by Nimbus, and creating, scheduling, and destroying workers accordingly.
It downloads topology code from Nimbus to the local machine for task scheduling.
Task assignment information contains the mapping from task to worker (task -> node + port), so a worker node can determine which remote machines it needs to communicate with; a simplified sketch of this mapping follows below.
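The following is a simplified, illustrative model of the assignment information described above. It is not Storm's real classes (Storm has its own worker-slot and assignment structures internally); it only shows the idea that a worker slot is node + port and that the assignment maps each task id to the slot that should run it.

```java
import java.util.HashMap;
import java.util.Map;

public class AssignmentSketch {
    // node + port uniquely identifies a worker (a "worker-slot" in Nimbus terms)
    static final class WorkerSlot {
        final String node;   // supervisor / host identifier
        final int port;
        WorkerSlot(String node, int port) { this.node = node; this.port = port; }
        @Override public String toString() { return node + ":" + port; }
    }

    public static void main(String[] args) {
        // task id -> worker slot, i.e. the task -> node + port mapping that lets a
        // worker know which remote workers it must talk to
        Map<Integer, WorkerSlot> assignment = new HashMap<>();
        assignment.put(1, new WorkerSlot("supervisor-a", 6700));
        assignment.put(2, new WorkerSlot("supervisor-a", 6701));
        assignment.put(3, new WorkerSlot("supervisor-b", 6700));
        assignment.forEach((task, slot) -> System.out.println("task " + task + " -> " + slot));
    }
}
```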
The state machine of the cluster:
Cluster state management
The state of the cluster is described by a storm-cluster-state object.
It provides a number of functional interfaces, such as the following (a rough Java-style sketch of these operations is given below):
Basic ZooKeeper-related operations, such as create-node, set-data, remove-node, get-children, and so on.
Heartbeat interfaces, such as supervisor-heartbeat!, worker-heartbeat!, and so on.
Heartbeat information, such as executors-beats.
Starting, updating, and stopping a storm, such as update-storm!.
As shown in the following illustration:
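As a rough orientation, the following Java-style sketch paraphrases the kinds of operations listed above. It is not Storm's actual API: the real storm-cluster-state implementation is Clojure code inside Storm, and the method names here are illustrative only.

```java
import java.util.List;

// Illustrative only: a Java rendering of the operations a storm-cluster-state
// object exposes, as described in the list above.
public interface ClusterStateSketch {
    // basic ZooKeeper-style operations
    void createNode(String path, byte[] data);
    void setData(String path, byte[] data);
    void removeNode(String path);
    List<String> getChildren(String path);

    // heartbeat writers, e.g. supervisor-heartbeat! / worker-heartbeat!
    void supervisorHeartbeat(String supervisorId, byte[] info);
    void workerHeartbeat(String stormId, String node, int port, byte[] beat);

    // heartbeat readers, e.g. executors-beats
    List<byte[]> executorBeats(String stormId);

    // starting, updating and stopping a storm, e.g. update-storm!
    void updateStorm(String stormId, byte[] newStormBase);
}
```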
The basis of task scheduling
ZooKeeper is the core component for synchronizing and coordinating the state of the whole cluster.
Supervisors, workers, executors, and other components periodically write heartbeat information to ZooKeeper.
Topology information is synchronized to ZooKeeper when a topology error occurs or when a new topology is submitted to the cluster.
Nimbus regularly monitors the task assignment information (assignments) on ZooKeeper and synchronizes any reassigned schedule back to ZooKeeper.
Nimbus therefore reassigns tasks based on the heartbeats, the topology information, and the existing task assignments, as shown in the following illustration:
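A conceptual sketch of such a scheduling round is shown below. It is not Nimbus's real code; it only illustrates the idea that the worker slots known to be alive from heartbeats, together with the topology's task list, determine the new assignment, and that when a slot stops heartbeating its tasks are redistributed on the next round. The naive round-robin placement and all names are illustrative.

```java
import java.util.*;

public class ReassignmentSketch {

    // task id -> "node:port" of the worker slot that should run it
    static Map<Integer, String> assign(List<String> aliveSlots, List<Integer> taskIds) {
        Map<Integer, String> assignment = new LinkedHashMap<>();
        for (int i = 0; i < taskIds.size(); i++) {
            // spread tasks round-robin over the slots that still heartbeat
            assignment.put(taskIds.get(i), aliveSlots.get(i % aliveSlots.size()));
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<Integer> tasks = Arrays.asList(1, 2, 3, 4, 5, 6);
        // initially three slots are alive
        System.out.println(assign(Arrays.asList("node1:6700", "node1:6701", "node2:6700"), tasks));
        // node2 stops heartbeating -> its tasks are redistributed on the next round
        System.out.println(assign(Arrays.asList("node1:6700", "node1:6701"), tasks));
    }
}
```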
Timing of Task Scheduling
As shown in the state machine diagram above, rebalance and do-rebalance events (for example, triggered from a Web call) trigger mk-assignments, i.e. task (re)allocation.
In addition, after the Nimbus process starts, it calls mk-assignments periodically to perform load balancing and task assignment.
The client submits the topology with storm jar ... <topology>, which invokes Nimbus's submitTopology function through the Thrift interface; this starts the storm and triggers an mk-assignments call.
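For reference, the client-side call looks roughly like the following. StormSubmitter is Storm's Java entry point for submission and goes through the Nimbus Thrift interface under the hood; the topology name and the spout/bolt declarations are assumptions here and would come from a topology definition like the sketch near the top of the article.

```java
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // ... setSpout / setBolt calls as in the earlier sketch go here;
        //     a topology without spouts would be rejected as invalid ...
        Config conf = new Config();
        conf.setNumWorkers(2);
        // this is what `storm jar my.jar SubmitSketch` ends up invoking via Thrift
        StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
    }
}
```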
Topology Submission Process
A topology submission process:
In non-local mode, the client invokes the Nimbus interface via Thrift to upload the code to Nimbus and trigger the submit operation.
Nimbus performs the task assignment and synchronizes the information to ZooKeeper.
The Supervisor regularly fetches the task assignment information; if the topology code is missing, it downloads the code from Nimbus, and then synchronizes its workers according to the task assignment information.
Workers start multiple executor threads based on the assigned task information and instantiate the spouts, bolts, ackers, and so on, then wait for all connections (the network connections between this worker and workers on other machines) to come up. At that point the Storm cluster is in a working state.
Components such as spouts and bolts keep running until the topology is explicitly killed.
The main process is shown in the following illustration:
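To make "instantiate spout, bolt, Acker" a bit more tangible, here is a minimal bolt written against Storm's public Java API (BaseBasicBolt); the class and field names are just examples. One instance of such a class is one task, and each executor thread started by a worker drives one or more of these task instances.

```java
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class ExclaimBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        // called by the executor thread for every tuple routed to this task
        collector.emit(new Values(input.getString(0) + "!"));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
```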
The relationship between worker, executor and task in storm
Let's clarify the relationship between worker, executor, task, Supervisor, Nimbus, and ZK.
First, take a look at the following picture:
First of all, from a micro point of view: a worker is a process; that process contains one or more threads; each thread is an executor; each executor handles one or more tasks; and a task is a unit of work, namely an instance object of a spout or bolt class (a node of the topology).
A worker handles a subset of a topology, and the same subset can be handled simultaneously by multiple workers, but one worker serves one and only one topology: there is no worker that handles several nodes of topology1 and at the same time several nodes of topology2. An executor processes one node (one spout or bolt), but that node may have multiple instance objects, so the parallelism hint together with setNumTasks determines how many tasks an executor handles at the same time. By default an executor handles one task; if an executor is assigned multiple tasks, it iterates over them as it executes.
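A minimal sketch of that distinction, using the real TopologyBuilder API (MyBolt and the component ids are placeholders): the parallelism hint sets the number of executor threads, while setNumTasks sets the number of task instances those threads share.

```java
import org.apache.storm.topology.TopologyBuilder;

public class NumTasksSketch {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setBolt("split-bolt", new MyBolt(), 4)  // parallelism hint: 4 executor threads
               .setNumTasks(8)                          // 8 task instances -> 2 tasks per executor
               .shuffleGrouping("word-spout");
        // The task count (8) is now fixed for the lifetime of the topology;
        // only the executor count can be changed later, e.g. with storm rebalance.
    }
}
```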
So what is the use of having one executor handle multiple tasks? One way to understand it is that it makes scaling up later much easier. Keep in mind that once the topology code is submitted to Nimbus, the number of tasks is fixed and never changes, even if the topology is restarted; it only changes if you change the code and resubmit. Setting the degree of parallelism is different: you do not need to resubmit the code, and you can change the topology's parallelism at any time (for example with the storm rebalance command). But an executor can only work through whole tasks. Suppose we start with the default of 4 executors and 4 tasks, i.e. one executor per task, and then decide the parallelism is not enough, the processing speed cannot keep up, and we want to raise it to 8. There are still only 4 tasks, so the extra executors just sit idle, and raising the parallelism achieves nothing. It is like having 4 apples and 4 people: one person takes 5 minutes to eat an apple, and now the apples need to be eaten much faster, but the rule is that each apple can only be eaten by one person. With 4 people it takes 5 minutes, which clearly does not meet the target, so you raise the concurrency and call in 8 people; since each apple can only be eaten by one person, the other 4 just stand and stare, wasting resources. So, to make it easy to raise the parallelism later, set the task number higher up front.
Now take a look at Storm's architecture from a macro point of view. Just staring at the concepts is dry; it is much easier to follow a topology through the whole process from submission to running, which is exactly the submission flow walked through in the previous section.
In that flow, Nimbus is the commander, and ZK is the manager and the monitor.
In a word: Nimbus gives the orders (the assignments), ZK supervises their execution (the heartbeat monitoring; worker and supervisor heartbeats are all under its management), the Supervisor leads the troops (downloads the code) and recruits the soldiers (creates the workers and threads), and the workers and executors simply do the work. Frankly, it works much like everyday military management.
This is only a superficial analysis of the relationship between these components; load balancing and task scheduling have not been discussed at the code level yet and will be supplemented later. If there are mistakes, criticism and corrections are welcome!