One. Description
Storm is a distributed real-time computation system. Storm is to real-time computation what Hadoop is to batch processing: for systems with strong real-time requirements, Storm is a good choice. Hadoop provides the map and reduce primitives, which make batch processing simple and elegant; in the same way, Storm provides a set of simple and elegant primitives for real-time computation.
Description of the terminology involved:
Nimbus: the master node of the Storm cluster. It is responsible for distributing code within the cluster, assigning tasks to the worker machines, and monitoring the running state of the cluster. Its process is named nimbus.
Supervisor: runs on the worker nodes and listens for the tasks assigned to it by Nimbus, starting or stopping the worker processes that execute those tasks. Its process is named supervisor.
Core: the Storm UI service process.
Preparation before installation and deployment:
1. Configure a static IP for each host (make sure every host can communicate with the others; to avoid unnecessary network transfer, it is recommended that they all be on the same network segment).
2. Modify the hostname of each host; every host in the Storm cluster needs to be changed.
3. Configure host mappings: edit the hosts file on each machine and add an IP-to-hostname mapping for every host (see the example after this list).
4. Open the required ports: the ports configured later in this document need to be open (or shut down the firewall).
5. Install Python 2.7 or above.
6. Make sure the Zookeeper cluster service is running properly. If you have already installed Hadoop or Zookeeper on CentOS, items 1-5 should basically be taken care of. For Zookeeper, see: http://www.cnblogs.com/wxisme/p/5178211.html.
7. The JDK and Storm versions used here are 1.8 and 0.9.5, respectively.
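As an example of items 2 and 3, the hosts file and the Python check might look like the following on every machine (node1/node2/node3 and the IP addresses are placeholder values; substitute your own):

# /etc/hosts on every host in the cluster (example addresses)
192.168.1.101   node1
192.168.1.102   node2
192.168.1.103   node3

# verify the Python version required in item 5
python -V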
Two. Installing and configuring the Storm cluster
1. Download the corresponding installation package from the Storm website and upload it to the cluster nodes; a sample download command is shown below.
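For example, version 0.9.5 can be pulled from the Apache archive with wget (the URL below is an assumed mirror path; check the Storm download page for the exact link):

wget https://archive.apache.org/dist/storm/apache-storm-0.9.5/apache-storm-0.9.5.tar.gz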
2. Unzip the installation package
tar -xvzf apache-storm-0.9.5.tar.gz
3. Modify the storm.yaml configuration file
vim conf/storm.yaml
The addresses of the Zookeeper cluster used by the Storm cluster; modify according to your actual situation.
storm.zookeeper.servers:
    - "node1"
    - "node2"
    - "node3"
If the Zookeeper port is not the default, this parameter needs to be configured:
" Modified Port "
The Nimbus and supervisor processes use a local disk directory to store a small amount of state, such as jars, confs, and so on; this directory needs to be created in advance (see the example below) and given sufficient access permissions.
storm.local.dir: "/usr/storm/data"
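For example, the directory can be created on every node like this (the path matches the value configured above; 755 is just one sufficiently permissive choice):

mkdir -p /usr/storm/data
chmod -R 755 /usr/storm/data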
The address of the Nimbus machine in the Storm cluster; each supervisor worker node needs to know which machine is Nimbus in order to download topology jars, confs, and other files. Modify according to your actual situation.
" Node3 "
For each supervisor worker node, you need to configure the number of workers that the node can run. Each worker occupies a separate port for receiving messages, and this configuration option defines which ports the workers may use. By default, each node can run 4 workers, on ports 6700, 6701, 6702, and 6703 respectively. Modify according to your actual situation.
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
DRPC provides remote access to the processing functions in the cluster. These are the DRPC server addresses of the Storm cluster; modify according to your actual situation. About DRPC, see: http://www.dataguru.cn/article-5572-1.html
drpc.servers:
    - "node3"
By default, when Storm starts a worker process, the JVM's maximum heap memory is 768M. Because a large amount of data was loaded in the bolts during use, 768M of memory did not meet the requirements and could cause memory overflow. Modify according to your actual situation.
" -xmx1024m "
Note: it is best not to leave blank lines or other stray whitespace characters between the configuration items above.
Three. Starting the Storm cluster
1. Start the Nimbus service on the master node
bin/storm nimbus >> /dev/null &
Check whether the Nimbus service has started:
jps
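A quick way to confirm the daemon is there (assuming jps is on the PATH) is to filter the jps output for the nimbus process:

jps | grep -i nimbus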
2. Start the Supervisor service on each worker node
bin/storm supervisor >> /dev/null &
3. Start the DRPC service
bin/storm drpc >> /dev/null &
4. Start the Storm UI service on the master node
bin/storm ui >> /dev/null &
Check whether the UI service has started:
jps
Accessing the Storm UI
http://nimbus:8080/
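For convenience, the daemons can also be started from a small wrapper script. The sketch below assumes Storm is unpacked under /usr/storm and that Nimbus, DRPC, and the UI all run on the master node, with the supervisor line used on the worker nodes; adjust paths and hosts to your environment.

#!/bin/bash
# start-storm.sh -- minimal sketch for starting the Storm daemons (assumed paths)
STORM_HOME=/usr/storm              # assumed installation directory

# on the master node: nimbus, drpc and the UI
nohup $STORM_HOME/bin/storm nimbus >> /dev/null 2>&1 &
nohup $STORM_HOME/bin/storm drpc   >> /dev/null 2>&1 &
nohup $STORM_HOME/bin/storm ui     >> /dev/null 2>&1 &

# on every worker node: use only the supervisor line instead
# nohup $STORM_HOME/bin/storm supervisor >> /dev/null 2>&1 &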
Four. Submitting a topology to the Storm cluster
Execute the following command to start a Storm topology:
bin/storm jar test.jar com.test.MyTopology arg1 arg2
Here test.jar is the jar package that contains the topology implementation code, the main method of com.test.MyTopology is the entry point of the topology, and arg1 and arg2 are the arguments passed to com.test.MyTopology when it is executed.
To stop a Storm topology:
bin/storm kill {toponame}
where {toponame} is the name of the topology specified when it was submitted to the Storm cluster.
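For example, assuming the topology was submitted under the hypothetical name mytopo, it can be killed immediately, or with a grace period during which the spouts are deactivated so in-flight tuples can finish (the -w option sets that wait time in seconds):

bin/storm kill mytopo
bin/storm kill mytopo -w 30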
A simple Storm cluster is now deployed, and you can start a pleasant Storm journey!