Building a Twitter Storm Cluster


What is Storm?

Storm is an open-source, distributed, real-time computation system from Twitter.

Use Cases:

Real-time data analysis, continuous computing, and distributed RPC.

Storm features (by analogy, Storm is like an escalator that keeps running unless it breaks down, whereas Hadoop is like an elevator that stops after each trip):

  • Distributed
  • Scalable
  • High reliability
  • Simple programming model
  • Efficient real-time processing

Common classes:

  • BaseRichSpout (message producer)
  • BaseBasicBolt (message processor)
  • TopologyBuilder (topology builder)
  • Config (configuration)
  • StormSubmitter/LocalCluster (topology submitter)

Storm cluster deployment

Storm cluster architecture:


A Storm cluster has two types of nodes: control nodes and worker nodes.

The control node runs a Nimbus process. Nimbus distributes code, assigns computing tasks, and monitors the status in the cluster.

Each worker node runs a Supervisor process.

The Supervisor listens for the tasks that Nimbus assigns to it and starts or stops the worker processes that execute those tasks accordingly.

All coordination between Nimbus and Supervisor is completed through the Zookeeper cluster.

Cluster planning (adjust to your specific requirements):

Linux host name    Storm role    Zookeeper
Master             Nimbus        single-node zk
Slave01            Supervisor
Slave02            Supervisor

Preparations:

Environment: CentOS 6.4

Software:

  • jzmq-master
  • storm-0.8.2
  • zeromq-2.1.7
  • zookeeper-3.4.5

Environment Configuration:

Basic Linux configuration:

  • Modify the host name
  • Modify the IP address
  • Modify the mapping between host names and IP addresses
  • Disable the firewall
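The host name to IP address mapping can be sketched roughly as follows. This is only an illustration: the three IP addresses are placeholders rather than values from this setup, and the script writes to a scratch file (hosts.storm, an assumed name) so the entries can be reviewed before they are appended to /etc/hosts as root on every node.

```shell
#!/bin/sh
# Sketch: build the host name / IP mapping for the three planned nodes.
# The IP addresses below are placeholders (assumptions); replace them with your own.
HOSTS_FILE="${HOSTS_FILE:-./hosts.storm}"   # scratch file, merge into /etc/hosts after review

add_host() {
    # Append "IP<TAB>name" only if the name is not already mapped in the file.
    ip="$1"; name="$2"
    if ! grep -q "[[:space:]]${name}\$" "$HOSTS_FILE" 2>/dev/null; then
        printf '%s\t%s\n' "$ip" "$name" >> "$HOSTS_FILE"
    fi
}

add_host 192.168.1.10 Master
add_host 192.168.1.11 Slave01
add_host 192.168.1.12 Slave02

cat "$HOSTS_FILE"
```

Because add_host skips names that are already present, the script can be re-run without creating duplicate entries.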

Installation steps:

1. Install the JDK

2. Build a Zookeeper cluster (Here we only install one zk on the master node)

Extract the Zookeeper archive.

Go to zk's conf directory and run cp zoo_sample.cfg zoo.cfg (this copies the sample file under the name Zookeeper expects).

Leave the other settings unchanged for now.
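A minimal sketch of the copy step, assuming Zookeeper was extracted to /usr/local/zookeeper-3.4.5 (the ZK_HOME path is an assumption, not something this guide specifies). It refuses to overwrite an existing zoo.cfg and only prints the start command rather than running it.

```shell
#!/bin/sh
# Sketch: create zoo.cfg from the bundled sample, leaving every setting at its default.
# ZK_HOME is an assumed install location; adjust it to where the archive was extracted.
ZK_HOME="${ZK_HOME:-/usr/local/zookeeper-3.4.5}"

prepare_zk_conf() {
    conf_dir="$1/conf"
    if [ ! -f "$conf_dir/zoo_sample.cfg" ]; then
        echo "zoo_sample.cfg not found in $conf_dir" >&2
        return 1
    fi
    # Do not overwrite an existing zoo.cfg.
    [ -f "$conf_dir/zoo.cfg" ] || cp "$conf_dir/zoo_sample.cfg" "$conf_dir/zoo.cfg"
}

if prepare_zk_conf "$ZK_HOME"; then
    echo "zoo.cfg is ready; start the server with: $ZK_HOME/bin/zkServer.sh start"
fi
```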

3. Install the Storm dependencies (zeromq, jzmq, and python)

3.1 Install zeromq: go to the zeromq-2.1.7/ directory and check the build environment with ./configure

cd zeromq-2.1.7

./configure

# ./configure may fail with:

configure: error: Unable to find a working C++ compiler

# Install the required rpm packages: libstdc++-devel and gcc-c++

If the VM can access the Internet (this method is recommended):

yum install gcc-c++

If the VM cannot access the Internet:

First download the packages from

http://mirrors.163.com/centos/6.4/os/x86_64/Packages/

(The downloaded versions must match the system version.)

rpm -i libstdc++-devel-4.4.7-3.el6.x86_64.rpm

rpm -i gcc-c++-4.4.7-3.el6.x86_64.rpm

rpm -i libuuid-devel-2.17.2-12.9.el6.x86_64.rpm

Then run ./configure again.

make (compiles the source)

make install (completes the installation)
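Both this step and the jzmq build that follows fail when a build tool is absent, so it can help to check for the tools up front. The sketch below is one possible way to do that; the tool list is an assumption derived from the errors described in this guide, not an official prerequisite list.

```shell
#!/bin/sh
# Sketch: report which build tools are missing before running ./configure or ./autogen.sh.
# The tool list is an assumption based on the errors described in this guide.
missing_tools() {
    # Print the name of every argument that is not an executable on PATH.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

MISSING="$(missing_tools gcc g++ make libtool autoconf automake)"
if [ -n "$MISSING" ]; then
    # Unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $MISSING
else
    echo "All build tools found"
fi
```

Any name it reports can then be installed with yum or from the rpm packages, as described in these steps.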

3.2 Compile and install jzmq:

cd jzmq

Run ./autogen.sh

(This generates the configure script, which does not exist by default.)

# Possible error:

autogen.sh: error: could not find libtool.

libtool is required to run autogen.sh.

# libtool is missing

As before, if Internet access is available:

yum install libtool (Red Hat Enterprise Linux does not report these errors)

Or install the packages manually:

rpm -i autoconf-2.63-5.1.el6.noarch.rpm

rpm -i automake-1.11.1-4.el6.noarch.rpm

rpm -i libtool-2.2.6-15.5.el6.x86_64.rpm

./configure

make

make install

3.3 Compile and install Python

(First check the version that ships with your system. If it is 2.6.6 or later, you do not need to install it.)

tar -zxvf Python-2.6.6.tgz

cd Python-2.6.6

./configure

make

make install
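The version check mentioned at the start of this step can be sketched as below. The helper name version_ge is made up for illustration, and the comparison assumes plain dot-separated numeric versions such as 2.6.6.

```shell
#!/bin/sh
# Sketch: decide whether the system Python already satisfies the 2.6.6 minimum.
# version_ge A B succeeds when version A >= version B (dot-separated numeric fields).
version_ge() {
    # Sort the two versions field by field; if B comes out first, then A >= B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)" = "$2" ]
}

REQUIRED="2.6.6"
# "python -V" prints e.g. "Python 2.6.6" (on stderr for Python 2), so merge the streams.
if command -v python >/dev/null 2>&1; then
    CURRENT="$(python -V 2>&1 | awk '{print $2}')"
else
    CURRENT="0"
fi

if version_ge "$CURRENT" "$REQUIRED"; then
    echo "Python $CURRENT found; no need to build from source"
else
    echo "Python $REQUIRED or later not found; build it from source"
fi
```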

3.4 Install Storm

Modify the storm.yaml configuration file (on the Supervisor nodes as well):

Set the host name of the zk server

Set the host name of the master node

Notes:

3.4.1 The extracted Storm release directory contains a conf/storm.yaml file, which is used to configure Storm. Options set in conf/storm.yaml override the defaults in defaults.yaml.

The following configuration options must be set in conf/storm.yaml:

storm.zookeeper.servers: the addresses of the Zookeeper cluster used by the Storm cluster, in the following format:

    storm.zookeeper.servers:
      - "111.222.333.444"
      - "555.666.777.888"

If the Zookeeper cluster does not use the default port, storm.zookeeper.port must also be set.

3.4.2 storm.local.dir: a local disk directory in which the Nimbus and Supervisor processes store a small amount of state, such as jars and confs. Create the directory in advance and grant it sufficient access permissions, then configure it in storm.yaml, for example:

    storm.local.dir: "/usr/storm/workdir"
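Putting the options above together, a minimal storm.yaml for the planned cluster might be generated as in the sketch below. It assumes that both Zookeeper and Nimbus run on the host named Master, and it uses nimbus.host, the key Storm 0.8.x uses for the master node's host name. The output path storm.yaml.generated is a made-up scratch name so the result can be reviewed before it is copied to conf/storm.yaml on every node.

```shell
#!/bin/sh
# Sketch: write a minimal storm.yaml for the planned cluster.
# Assumptions: Zookeeper and Nimbus both run on the host named Master;
# CONF_FILE is a scratch path, to be copied to conf/storm.yaml after review.
CONF_FILE="${CONF_FILE:-./storm.yaml.generated}"
ZK_HOST="${ZK_HOST:-Master}"
NIMBUS_HOST="${NIMBUS_HOST:-Master}"
LOCAL_DIR="${LOCAL_DIR:-/usr/storm/workdir}"

cat > "$CONF_FILE" <<EOF
storm.zookeeper.servers:
  - "$ZK_HOST"
nimbus.host: "$NIMBUS_HOST"
storm.local.dir: "$LOCAL_DIR"
EOF

cat "$CONF_FILE"
```

The directory named by storm.local.dir still has to be created on each node beforehand, as noted above.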

Start the daemons on the three machines in turn.

Master: go to Storm's bin directory and run

./storm nimbus >/dev/null 2>&1 &

Slave01: go to Storm's bin directory and run

./storm supervisor > ../logs/su.log 2>&1 &

Slave02: go to Storm's bin directory and run

./storm supervisor > ../logs/su.log 2>&1 &

(Each command starts the process in the background and redirects both standard output and standard error to the given file.)

Start the UI management interface on Master:

./storm ui >/dev/null 2>&1 &

Open (master node IP):8080 in a browser to observe the cluster's worker resource usage and the running status of topologies.
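The start-up commands above could be wrapped in a small helper, sketched below. STORM_HOME and the per-daemon log file names are assumptions for illustration; unlike the commands above, this variant keeps the output of every daemon, including nimbus and ui, in a log file.

```shell
#!/bin/sh
# Sketch: start a Storm daemon in the background and keep its output in a log file.
# STORM_HOME is an assumed install location; the log file names are illustrative.
STORM_HOME="${STORM_HOME:-/usr/local/storm-0.8.2}"

start_daemon() {
    # $1 is the daemon name: nimbus, supervisor, or ui.
    daemon="$1"
    mkdir -p "$STORM_HOME/logs"
    # nohup keeps the daemon alive after the shell exits; 2>&1 merges stderr into the log.
    nohup "$STORM_HOME/bin/storm" "$daemon" > "$STORM_HOME/logs/$daemon.out" 2>&1 &
    echo "started $daemon (pid $!)"
}

# On Master:            start_daemon nimbus; start_daemon ui
# On Slave01 / Slave02: start_daemon supervisor
```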

The Storm cluster is now deployed and configured, and you can submit topologies to it for execution.
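Submission goes through the storm jar command. The sketch below wraps it in a helper; the jar path, main class, and topology name in the example are hypothetical placeholders, and the helper assumes the class's main method submits the topology itself (for example through StormSubmitter).

```shell
#!/bin/sh
# Sketch: submit a packaged topology with the "storm jar" command.
# The jar path, main class, and topology name are hypothetical placeholders.
STORM_BIN="${STORM_BIN:-storm}"

submit_topology() {
    jar="$1"; main_class="$2"; topology_name="$3"
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    # "storm jar <jar> <class> [args...]" runs the class's main method, which is
    # expected to call StormSubmitter to upload the topology to Nimbus.
    "$STORM_BIN" jar "$jar" "$main_class" "$topology_name"
}

# Example (run on a machine whose storm.yaml points at the Nimbus host):
# submit_topology ./my-topology.jar com.example.MyTopology my-topology
```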

