Apache Storm Cluster Environment Setup

Source: Internet
Author: User
Tags: message queue, ZooKeeper

Apache Storm is an open-source big data processing system from Twitter. Unlike other systems, Storm is designed for distributed real-time processing and is language-independent. The usage scenarios the author knows of include real-time log analysis, real-time analysis of website user behavior, and real-time computation. Many companies now use Storm as part of their big data architecture to handle real-time business processing.

I believe most readers share my view that technology choices today are project-driven: there is no best technology, only the technology that best suits your own project. Here are a few simple observations about Storm:

Benefits of Storm:

1. Simple programming model. Just as MapReduce reduces the complexity of parallel batch processing, Storm reduces the complexity of real-time processing.

2. Support for multiple languages. You can use Java, Ruby, Clojure, and Python with Storm. To add support for a new language, you only need to implement a simple Storm communication protocol.

3. Fault tolerance. Storm manages failures of worker processes and nodes.

4. Horizontal scalability. Computation runs in parallel across multiple threads, processes, and servers.

5. Reliable message processing. Storm guarantees that each message is processed at least once; when a task fails, Storm replays the message from the source.

6. Fast. The system design ensures that messages are processed quickly, and Storm uses ZeroMQ as its underlying message queue (the 0.9.x releases also provide a Netty-based transport).

7. Local mode. Developers can develop and test in a locally simulated cluster environment, which is a great convenience.

Storm's current problems:

1. The current open-source version has only a single Nimbus node. If it goes down, the only remedy is an automatic restart, which poses a certain risk to the system; you can consider a two-node Nimbus layout.

2. Part of Storm's core code is written in Clojure, a dynamic functional programming language that runs on the JVM. Its strength lies in process-style computation; although this improves performance considerably, it also raises maintenance costs.

That is enough introduction. The following is a simple walkthrough of building a Storm cluster environment:

Environment preparation: at least three Linux servers (the author uses five Red Hat Linux cloud servers).

Cluster Construction:

Step one: Install the JDK/JRE

Step two: Install ZooKeeper; you can refer to my other blog post:

http://bigcat2013.iteye.com/blog/2175538

Step three: Download Apache Storm: http://apache.arvixe.com/storm/

A previous project used version 0.9.1; because Kafka is now required, the latest version, 0.9.3, was chosen.

Step four: Upload the downloaded archive to the server (for example, with WinSCP).

Step five: Extract the archive with "tar -xzvf apache-storm-0.9.3.tar.gz"

The directory structure after decompression:


Step six: Modify the Storm configuration file (conf/storm.yaml under the Storm directory)

The properties you basically need to configure are storm.zookeeper.servers, nimbus.host, storm.local.dir, ui.port, and supervisor.slots.ports. Note that Nimbus does not need the supervisor.slots.ports property and a Supervisor does not need the ui.port property: Nimbus is the master node, which hosts the UI but no workers, while a Supervisor is a worker node, which hosts only workers and no UI:

Common

Nimbus


Supervisor (the number of workers per Supervisor can be adjusted by increasing or decreasing the number of slot ports):



Note: Do not start the configuration entries flush against the left margin; otherwise startup fails with an error saying the property value cannot be found.
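As a rough illustration of the Nimbus/Supervisor split described in this step, the following shell sketch writes a storm.yaml for either role. Only the property names come from the text; the host names (zk1, zk2, zk3, nimbus1), the local directory /data/storm, the UI port 8080, the four slot ports, and the function name write_storm_yaml are placeholder assumptions. Every entry keeps the same single leading space, which YAML accepts as long as the indentation is consistent.

```shell
#!/bin/sh
# Sketch: write a storm.yaml for one role ("nimbus" or "supervisor").
# All host names, paths, and ports below are placeholders (assumptions).
write_storm_yaml() {
    role="$1"      # "nimbus" or "supervisor"
    out="$2"       # path of the storm.yaml file to write

    # Settings shared by every node in the cluster.
    cat > "$out" <<'EOF'
 storm.zookeeper.servers:
     - "zk1"
     - "zk2"
     - "zk3"
 nimbus.host: "nimbus1"
 storm.local.dir: "/data/storm"
EOF

    case "$role" in
        nimbus)
            # The Nimbus node also hosts the UI, so only it sets ui.port.
            cat >> "$out" <<'EOF'
 ui.port: 8080
EOF
            ;;
        supervisor)
            # One worker slot per port; add or remove ports to resize.
            cat >> "$out" <<'EOF'
 supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703
EOF
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
}
```

For example, "write_storm_yaml nimbus conf/storm.yaml" on the Nimbus server and "write_storm_yaml supervisor conf/storm.yaml" on each Supervisor server.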

Step seven: Copy the configured Storm directory to the other servers with "scp -r" (Note: if the current server is configured as Nimbus, configure storm.yaml on the other servers using the Supervisor configuration described above).
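The copy in this step could be scripted roughly as follows. This is only a sketch: the function name copy_storm_dir, the SCP override variable, and the host names and target directory in the usage line are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: copy an extracted Storm directory to each of the other servers.
# SCP defaults to the real scp; set SCP=echo to print the commands instead.
copy_storm_dir() {
    src="$1"       # local Storm directory, e.g. apache-storm-0.9.3
    dest="$2"      # parent directory on the remote servers
    shift 2
    for host in "$@"; do
        # -r copies the directory recursively.
        ${SCP:-scp} -r "$src" "$host:$dest" || return 1
    done
}
```

A call such as "copy_storm_dir apache-storm-0.9.3 /opt supervisor1 supervisor2" would copy the directory to two hypothetical Supervisor hosts; remember to switch storm.yaml on those hosts to the Supervisor settings afterwards.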




Step eight: Create Storm's local directory on each server in the cluster (corresponding to the storm.local.dir setting in storm.yaml)

Step nine: Start the ZooKeeper cluster, then start the Storm cluster

Start Storm Nimbus: sudo nohup ./bin/storm nimbus >/dev/null &

Start the Storm UI: sudo nohup ./bin/storm ui >/dev/null &

Start a Supervisor: sudo nohup ./bin/storm supervisor >/dev/null &
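The start commands above can be wrapped in a small helper so that each server only needs to know its role. This is a sketch under assumptions: STORM_HOME is a hypothetical variable pointing at the extracted Storm directory, start_storm is an illustrative name, and sudo is left out (prefix the call with it if your installation needs it).

```shell
#!/bin/sh
# Sketch: start the Storm daemons for one role in the background.
# STORM_HOME is an assumed variable pointing at the extracted directory.
start_storm() {
    role="$1"                           # "nimbus" or "supervisor"
    storm="${STORM_HOME:-.}/bin/storm"

    case "$role" in
        nimbus)     daemons="nimbus ui" ;;   # master node also runs the UI
        supervisor) daemons="supervisor" ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac

    for d in $daemons; do
        # nohup keeps the daemon alive after logout; console output is
        # discarded, as in the commands shown in this step.
        nohup "$storm" "$d" > /dev/null 2>&1 &
    done
}
```

Run "start_storm nimbus" on the Nimbus server and "start_storm supervisor" on every Supervisor server, after the ZooKeeper cluster is up.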

You can then open the Storm UI at the Nimbus address plus the configured ui.port value and monitor the running state of the Storm cluster.
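A quick reachability check for the UI might look like the sketch below. The function names and the host name in the usage line are assumptions; the fallback port 8080 is Storm's default ui.port, and the check relies on curl being installed on the machine that runs it.

```shell
#!/bin/sh
# Sketch: build the Storm UI address from the Nimbus host and ui.port.
storm_ui_url() {
    host="$1"
    port="${2:-8080}"   # falls back to Storm's default ui.port
    printf 'http://%s:%s/\n' "$host" "$port"
}

# Report whether the UI answers an HTTP request within five seconds.
check_storm_ui() {
    url=$(storm_ui_url "$@")
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
        echo "Storm UI reachable at $url"
    else
        echo "Storm UI not reachable at $url" >&2
        return 1
    fi
}
```

For instance, "check_storm_ui nimbus1 8080" tests the UI on a hypothetical host named nimbus1.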

With the simple configuration above, the Storm cluster is set up. Readers can add further configuration according to their own project needs. In addition, log output and automatic cleanup rules can be configured in logback/cluster.xml.

      • This article is from: Linux Tutorial Network
