Xu Mingming's Blog: Twitter Storm Source Code Analysis: The Directory Structure in ZooKeeper
We know that all of Twitter Storm's state information is stored in ZooKeeper. Nimbus assigns tasks by writing state information to ZooKeeper, and supervisors and tasks pick up their work by reading that state from ZooKeeper. Supervisors and tasks also periodically send heartbeat information to ZooKeeper, which lets Nimbus monitor the state of the whole Storm cluster and restart tasks that have died. ZooKeeper makes the whole Storm cluster very robust: it does not matter if any one worker machine goes down, because it only needs to restart and read its state back from ZooKeeper.
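To make the heartbeat idea concrete, here is a minimal sketch of how a monitor could spot dead tasks from heartbeat timestamps. It is not Storm's code: a plain dictionary stands in for ZooKeeper, the function names and the 30-second timeout are assumptions made up for illustration, and the path layout follows the taskbeats directory described later in this article.

```python
import time

# In-memory stand-in for ZooKeeper: maps a node path to its stored value.
# Illustration only; real Storm talks to ZooKeeper through a client library.
fake_zk = {}

HEARTBEAT_TIMEOUT_SECS = 30  # assumed value, not taken from Storm's config


def send_heartbeat(topology_id, task_id, now=None):
    """A task records the time of its latest heartbeat."""
    now = time.time() if now is None else now
    fake_zk[f"/taskbeats/{topology_id}/{task_id}"] = now


def find_dead_tasks(topology_id, now=None):
    """Return the ids of tasks whose last heartbeat is older than the timeout."""
    now = time.time() if now is None else now
    prefix = f"/taskbeats/{topology_id}/"
    return sorted(
        path[len(prefix):]
        for path, beat in fake_zk.items()
        if path.startswith(prefix) and now - beat > HEARTBEAT_TIMEOUT_SECS
    )
```

In this sketch the monitor only reports stale tasks; deciding how to restart or reassign them is left out on purpose.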
This article describes the directory structure of the data that Twitter Storm keeps in ZooKeeper. The relevant source code is mainly in backtype.storm.cluster.
One thing to be aware of: in many places in the code the author uses storm-id where topology-id is what is actually meant. I asked him on the mailing list, and he said that a topology used to be called a storm and the code was never changed.
Let's look directly at the structure:
/-{storm-zk-root}            -- Storm's root directory in ZooKeeper
  |
  |-/assignments             -- task assignment information for topologies
  |   |
  |   |-/{topology-id}       -- stores each topology's assignment
  |                             information, including: the corresponding
  |                             code directory on Nimbus, the start time
  |                             of every task, and the mapping of each
  |                             task to a machine and port
  |
  |-/tasks                   -- all tasks
  |   |
  |   |-/{topology-id}       -- this directory holds all the task-ids
  |       |                     belonging to the topology whose id is
  |       |                     {topology-id}
  |       |
  |       |-/{task-id}       -- this file stores the component-id of
  |                             the task: either a spout-id or a bolt-id
  |
  |-/storms                  -- this directory holds the ids of all
  |   |                         running topologies
  |   |
  |   |-/{topology-id}       -- this file stores information about the
  |                             topology, including its name, the time
  |                             it started running, and its status
  |                             (see the StormBase class)
  |
  |-/supervisors             -- this directory holds the heartbeat
  |   |                         information of all supervisors
  |   |
  |   |-/{supervisor-id}     -- this file stores the supervisor's
  |                             heartbeat information, including: the
  |                             heartbeat time, the host name, the port
  |                             numbers of the workers on this
  |                             supervisor, and the uptime
  |                             (see the SupervisorInfo class)
  |
  |-/taskbeats               -- heartbeats of all tasks
  |   |
  |   |-/{topology-id}       -- this directory holds the heartbeat
  |       |                     information of the tasks in this topology
  |       |
  |       |-/{task-id}       -- the task's heartbeat information,
  |                             including the heartbeat time, the task's
  |                             uptime, and some statistics
  |
  |-/taskerrors              -- error information produced by all tasks
      |
      |-/{topology-id}       -- this directory holds the error
          |                     information of each task in this topology
          |
          |-/{task-id}       -- error information for this task
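The layout above can be summarized as a handful of path-building helpers. This is only a sketch in Python, not the actual functions in backtype.storm.cluster: the helper names are hypothetical, and the "/storm" root is an assumed placeholder for {storm-zk-root}.

```python
# Hypothetical helpers that mirror the directory layout described above.
# The root value and every function name here are assumptions for illustration.
STORM_ZK_ROOT = "/storm"  # stands in for {storm-zk-root}


def zk_path(*parts, root=STORM_ZK_ROOT):
    """Join path segments under the Storm root directory."""
    return "/".join([root.rstrip("/"), *parts])


def assignment_path(topology_id):
    """Node holding the task assignment for one topology."""
    return zk_path("assignments", topology_id)


def task_path(topology_id, task_id):
    """Node holding the component-id (spout-id or bolt-id) of one task."""
    return zk_path("tasks", topology_id, str(task_id))


def storm_path(topology_id):
    """Node holding the name, start time, and status of a running topology."""
    return zk_path("storms", topology_id)


def supervisor_path(supervisor_id):
    """Node holding one supervisor's heartbeat information."""
    return zk_path("supervisors", supervisor_id)


def taskbeat_path(topology_id, task_id):
    """Node holding one task's heartbeat information."""
    return zk_path("taskbeats", topology_id, str(task_id))


def taskerror_path(topology_id, task_id):
    """Node holding the error information for one task."""
    return zk_path("taskerrors", topology_id, str(task_id))
```

For example, taskbeat_path("wordcount-1", 3) builds the path of task 3's heartbeat node under the topology "wordcount-1"; both the topology id and the task id in that call are made up.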