What is Storm?
Storm is an open-source, distributed real-time computation system from Twitter.
Use Cases:
Real-time data analysis, continuous computing, and distributed RPC.
Storm features (by analogy, Storm is like an escalator that runs continuously, while Hadoop is like an elevator that runs in separate trips and pauses in between):
- Distributed
- Scalable
- High reliability
- Simple programming model
- Efficient real-time processing
Common classes:
- BaseRichSpout (message producer)
- BaseBasicBolt (message processor)
- TopologyBuilder (topology builder)
- Config (topology configuration)
- StormSubmitter/LocalCluster (topology submitter)
Storm cluster deployment
Storm cluster architecture:
Storm clusters have two types of nodes: control nodes and worker nodes.
The control node runs a Nimbus process. Nimbus distributes code, assigns computing tasks, and monitors the status in the cluster.
Each worker node runs a Supervisor process.
The Supervisor listens for the tasks that Nimbus assigns to its node and starts or stops the worker processes that execute those tasks accordingly.
All coordination between Nimbus and the Supervisors goes through the ZooKeeper cluster.
Cluster planning (adjust to your own requirements):
Linux host name    Storm role    ZooKeeper
master             Nimbus        single-node ZooKeeper
slave01            Supervisor    -
slave02            Supervisor    -
Preparations:
Environment: CentOS 6.4
Software:
jzmq-master
storm-0.8.2
zeromq-2.1.7
zookeeper-3.4.5
Environment configuration:
Basic Linux configuration:
Set the host name
Set the IP address
Set the mapping between host names and IP addresses (/etc/hosts)
Disable the firewall
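As a rough illustration of the host-name mapping step, the sketch below appends an entry to a hosts file only when that name is not already present. The function name add_host_entry and the IP addresses are placeholders chosen for this example, not values from the cluster described here. The target file is a parameter so the function can be tried on a scratch copy before it is pointed at /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME  (hypothetical helper, for illustration only)
# Appends "IP<TAB>NAME" to FILE unless FILE already contains NAME as a whole word.
add_host_entry() {
    hosts_path=$1
    host_ip=$2
    host_name=$3
    # -F: treat NAME as a fixed string, -w: whole-word match, -q: no output.
    if ! grep -qwF "$host_name" "$hosts_path" 2>/dev/null; then
        printf '%s\t%s\n' "$host_ip" "$host_name" >> "$hosts_path"
    fi
}

# Demo on a scratch file; the addresses below are assumed, replace them with your own.
scratch_hosts=$(mktemp)
add_host_entry "$scratch_hosts" 192.168.1.10 master
add_host_entry "$scratch_hosts" 192.168.1.11 slave01
add_host_entry "$scratch_hosts" 192.168.1.12 slave02
cat "$scratch_hosts"
```

The same three entries would be needed on every machine so that master, slave01, and slave02 can resolve each other by name.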
Installation steps:
1. Install the JDK
2. Set up ZooKeeper (here only a single ZooKeeper instance is installed, on the master node)
Extract the archive.
Go to the ZooKeeper conf directory and copy the sample file under the new name: cp zoo_sample.cfg zoo.cfg
Leave the other settings unchanged for now.
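The ZooKeeper step can be sketched as a small script. This is only an illustration under assumptions: ZK_HOME is a variable introduced here for the extracted ZooKeeper directory, and prepare_zk_conf is a hypothetical helper name. The zkServer.sh script ships in ZooKeeper's bin directory; the text above does not show the start command, so treat that part as a suggestion.

```shell
#!/bin/sh
# prepare_zk_conf ZK_HOME  (hypothetical helper, for illustration only)
# Copies conf/zoo_sample.cfg to conf/zoo.cfg unless zoo.cfg already exists,
# so an edited configuration is never overwritten.
prepare_zk_conf() {
    zk_dir=$1
    if [ ! -f "$zk_dir/conf/zoo.cfg" ]; then
        cp "$zk_dir/conf/zoo_sample.cfg" "$zk_dir/conf/zoo.cfg"
    fi
}

# Runs only when ZK_HOME is set (assumed to point at the extracted zookeeper-3.4.5 directory).
if [ -n "${ZK_HOME:-}" ]; then
    prepare_zk_conf "$ZK_HOME"
    # Start the server, then ask it to report its state.
    "$ZK_HOME/bin/zkServer.sh" start
    "$ZK_HOME/bin/zkServer.sh" status
fi
```

ZooKeeper needs to be running before Nimbus and the Supervisors are started, since they coordinate through it.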
3. Install Storm's dependencies (zeromq, jzmq, and Python)
3.1 Install zeromq
Go to the zeromq-2.1.7/ directory and check the build environment with ./configure:
cd zeromq-2.1.7
./configure
# configure may fail with:
configure: error: Unable to find a working C++ compiler
# Fix: install the required RPM packages libstdc++-devel and gcc-c++
If the VM can access the Internet (this method is recommended):
yum install gcc-c++
If the VM cannot access the Internet:
First download the packages from
http://mirrors.163.com/centos/6.4/os/x86_64/Packages/
(the downloaded version must match the system version), then install them:
rpm -i libstdc++-devel-4.4.7-3.el6.x86_64.rpm
rpm -i gcc-c++-4.4.7-3.el6.x86_64.rpm
rpm -i libuuid-devel-2.17.2-12.9.el6.x86_64.rpm
Then run ./configure again
make (compiles the sources)
make install (completes the installation)
3.2 Compile and install jzmq
cd jzmq
./autogen.sh
(This generates the configure script, which does not exist by default.)
# Error: autogen.sh: error: could not find libtool.
libtool is required to run autogen.sh.
Cause: libtool is missing.
As before, if Internet access is available:
yum install libtool (Red Hat Enterprise Linux does not report these errors)
Or install the packages manually:
rpm -i autoconf-2.63-5.1.el6.noarch.rpm
rpm -i automake-1.11.1-4.el6.noarch.rpm
rpm -i libtool-2.2.6-15.5.el6.x86_64.rpm
./configure
make
make install
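Both failures above (no C++ compiler, no libtool) come from missing build tools, so it can help to check for them before running ./configure or ./autogen.sh. The following is a minimal sketch; missing_tools is a hypothetical helper name, and the tool list is an assumption based on the packages named in steps 3.1 and 3.2.

```shell
#!/bin/sh
# missing_tools TOOL...  (hypothetical helper, for illustration only)
# Prints the name of each TOOL that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools assumed to be needed for building zeromq and jzmq.
missing=$(missing_tools g++ make autoconf automake libtool)
if [ -n "$missing" ]; then
    echo "Missing build tools:"
    echo "$missing"
else
    echo "All build tools found"
fi
```

Any name it prints can then be installed with yum or from the RPM files as described above (g++ is provided by the gcc-c++ package).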
3.3 Compile and install Python
(First check the version that ships with your system. If it is 2.6.6 or later, you do not need to install it.)
tar -zxvf Python-2.6.6.tgz
cd Python-2.6.6
./configure
make
make install
3.4 Install Storm
Edit the storm.yaml configuration file (the same changes must be made on the worker nodes):
Set the host name(s) of the ZooKeeper server(s)
Set the host name of the master node
Notes:
3.4.1 The extracted Storm release directory contains a conf/storm.yaml file, which is used to configure Storm.
The default values are defined in defaults.yaml; options set in conf/storm.yaml override them.
The following options must be set in conf/storm.yaml:
storm.zookeeper.servers:
The addresses of the ZooKeeper cluster used by the Storm cluster, in the following format:
storm.zookeeper.servers:
  - "111.222.333.444"
  - "555.666.777.888"
If the ZooKeeper cluster does not use the default port, storm.zookeeper.port must be set as well.
3.4.2 storm.local.dir: a local disk directory in which the Nimbus and Supervisor processes store a small amount of state (such as jars and confs).
You need to create the directory in advance and grant sufficient access permissions.
Then set the directory in storm.yaml, for example:
storm.local.dir: "/usr/storm/workdir"
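Putting the options together, a storm.yaml for the planned cluster could be generated roughly as follows. This is a sketch under assumptions: write_storm_yaml is a hypothetical helper, the host name master comes from the cluster plan, and nimbus.host is the key that Storm 0.8.x uses for the master node's host name (the text above only says to set that host name, without naming the key).

```shell
#!/bin/sh
# write_storm_yaml FILE NIMBUS_HOST LOCAL_DIR ZK_HOST...  (hypothetical helper)
# Writes a minimal storm.yaml to FILE with the ZooKeeper servers,
# the Nimbus host, and the local state directory.
write_storm_yaml() {
    yaml_out=$1
    nimbus_host=$2
    local_dir=$3
    shift 3
    {
        echo "storm.zookeeper.servers:"
        for zk_host in "$@"; do
            printf '  - "%s"\n' "$zk_host"
        done
        printf 'nimbus.host: "%s"\n' "$nimbus_host"
        printf 'storm.local.dir: "%s"\n' "$local_dir"
    } > "$yaml_out"
}

# Demo: write to a scratch file and print it. On a real node the target would be
# conf/storm.yaml, and the local directory must exist first (mkdir -p /usr/storm/workdir).
scratch_yaml=$(mktemp)
write_storm_yaml "$scratch_yaml" master /usr/storm/workdir master
cat "$scratch_yaml"
```

Generating the file from one script makes it easier to keep the master and both worker nodes on identical settings.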
Start the daemons on the three machines in turn:
master: go to Storm's bin directory
./storm nimbus >/dev/null 2>&1 &
slave01: go to Storm's bin directory
./storm supervisor > ../logs/su.log 2>&1 &
slave02: go to Storm's bin directory
./storm supervisor > ../logs/su.log 2>&1 &
(This starts the process in the background and writes both standard output and standard error to the file.)
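The background-start pattern used above can be wrapped in a small helper. This is a variation, not the exact commands from this guide: start_bg is a hypothetical name, and it adds nohup and a stdin redirect so the daemon is more likely to survive the terminal closing.

```shell
#!/bin/sh
# start_bg LOGFILE COMMAND [ARG...]  (hypothetical helper, for illustration only)
# Runs COMMAND in the background, sends stdout and stderr to LOGFILE,
# and prints the PID of the background process.
start_bg() {
    bg_log=$1
    shift
    nohup "$@" < /dev/null > "$bg_log" 2>&1 &
    echo $!
}

# Demo with a harmless command; on a worker node the call might instead be
#   start_bg ../logs/su.log ./storm supervisor
demo_log=$(mktemp)
demo_pid=$(start_bg "$demo_log" sh -c 'echo started')
echo "started background process $demo_pid, log: $demo_log"
```

Recording the PID makes it possible to stop the daemon later with kill instead of searching for it in the process list.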
Start the UI on master:
./storm ui >/dev/null 2>&1 &
Open http://<master node IP>:8080 in a browser to observe the cluster's worker resource usage and the running status of the topologies.
At this point the Storm cluster is deployed and configured, and you can submit topologies to it.