Storm on YARN Installation and Configuration

Source: Internet
Author: User
Tags hadoop fs

1. Background Knowledge

The goal is to run Storm on YARN without modifying any Storm source code. The simplest approach is to run the Storm service components (including Nimbus and Supervisor) together as a separate application on YARN. The well-known "Storm on YARN" project, open-sourced by Yahoo!, basically implements this. Its details are as follows:
(1) yarn-storm client
A set of shell commands is provided for controlling the Storm services running on YARN. For example, the command for launching a Storm cluster is:
storm-yarn launch <storm-yarn-config>
Here <storm-yarn-config> is the Storm configuration file, which includes settings such as the number of supervisors to start and the memory used by the Storm ApplicationMaster.
After Storm is started, you can use the following command to control it:
storm-yarn [command] -appId [appId] -output [file] [-supervisors [n]]

Command                            Description
setStormConfig                     Reset the cluster configuration and restart the cluster
getStormConfig                     Obtain the current cluster configuration in JSON format
addSupervisors                     Increase the number of supervisors
startNimbus/stopNimbus             Start or stop Nimbus
startUI/stopUI                     Start or stop the web UI
startSupervisors/stopSupervisors   Start or stop all supervisors
shutdown                           Shut down the cluster
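
The commands in the table share one invocation pattern. As a minimal sketch, the function below only prints the command line it would run, so it can be tried without a cluster. storm_yarn_cmd is a hypothetical helper and not part of storm-yarn, and the -appId spelling is an assumption that should be checked against the usage output of your installation.

```shell
# Hypothetical helper: prints (does not execute) a storm-yarn command line.
# The accepted command names are the ones listed in the table above.
storm_yarn_cmd() {
    local cmd="$1" app_id="$2"
    case "$cmd" in
        setStormConfig|getStormConfig|addSupervisors|startNimbus|stopNimbus|\
        startUI|stopUI|startSupervisors|stopSupervisors|shutdown)
            # -appId spelling is an assumption, verify it locally
            echo "storm-yarn $cmd -appId $app_id"
            ;;
        *)
            echo "unknown storm-yarn command: $cmd" >&2
            return 1
            ;;
    esac
}

# Example: show the command that would stop the web UI
storm_yarn_cmd stopUI application_1402648970753_0025
```

Printing instead of executing keeps the sketch safe to experiment with; pipe the output to sh only once the printed line looks right.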

(2) yarn-storm ApplicationMaster

When the Storm ApplicationMaster is initialized, it starts the Storm Nimbus and Storm web UI services in the same container and requests resources from the ResourceManager according to the number of supervisors to be started. In the current implementation, the ApplicationMaster requests all the resources of a node and then starts the supervisor service there. In other words, each supervisor occupies its node exclusively and does not share node resources with other services, which prevents other services from interfering with the Storm cluster.
In addition to running Storm Nimbus and the web UI, the Storm ApplicationMaster also starts a Thrift server to handle requests from the yarn-storm client.

 

2. Installation Environment

A. hadoop 2.2.0
B. jdk1.7.0_60
C. apache-maven-3.0.5

 

3. Storm on YARN Installation Preparation

Note: Storm must be installed on all nodes. Storm on YARN itself only needs to be installed on one client machine.

A. Download Storm on YARN from GitHub
wget https://github.com/yahoo/storm-yarn/archive/master.zip
B. Storm on YARN needs to be compiled

unzip storm-yarn-master.zip

cd storm-yarn-master

C. Edit pom.xml and change the Hadoop version number to the version you are running.

<properties>
        <storm.version>0.9.0-wip21</storm.version>

D. Compile with Maven

mvn package -DskipTests

After compilation, decompress storm-yarn-master/lib/storm-0.9.0-wip21.zip to get the storm-0.9.0-wip21 directory.
Move the storm-0.9.0-wip21 directory to the same level as storm-yarn-master:

/home/ebupt/storm/
|-- storm-0.9.0-wip21
`-- storm-yarn-master
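
Before moving on, it can help to confirm that both directories really sit side by side as shown above. A minimal sketch follows; check_storm_layout is a hypothetical helper name, and the base directory is whatever you used in place of /home/ebupt/storm.

```shell
# Hypothetical helper: succeeds only if both expected directories
# exist directly under the given base directory.
check_storm_layout() {
    local base="$1" missing=0 dir
    for dir in storm-0.9.0-wip21 storm-yarn-master; do
        if [ ! -d "$base/$dir" ]; then
            echo "missing: $base/$dir" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the layout used in this article
check_storm_layout "$HOME/storm" && echo "layout OK"
```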


4. Configure the Storm Working Environment

A. Add the bin directories of storm-0.9.0-wip21 and storm-yarn-master to the PATH environment variable
vi ~/.bash_profile

export STORM_HOME=$HOME/storm
export PATH=$PATH:$STORM_HOME/storm-yarn-master/bin:$STORM_HOME/storm-0.9.0-wip21/bin

B. Upload the additional jar package required by Storm (storm-0.9.0-wip21/lib/storm.zip) to the designated HDFS directory. This step is very important: the cluster obtains its working environment from storm.zip in HDFS.

cd storm-0.9.0-wip21/lib/
hadoop fs -put storm.zip /lib/storm/0.9.0-wip21/

 

5. Install and Run Storm

A. Modify the storm.yaml file
vi storm-0.9.0-wip21/conf/storm.yaml

storm.zookeeper.servers:
    - "eb170"
    - "eb171"
storm.zookeeper.port: 2182
storm.local.dir: "/home/ebupt/storm/localstorm"
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
    - 6704

B. Submit and run Storm on YARN to obtain an application ID.
storm-yarn launch ~/storm/storm-0.9.0-wip21/conf/storm.yaml

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ebupt/eb/storm-yarn/storm-0.9.0-wip21/lib/logback-classic-1.0.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/07/04 15:37:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/04 15:37:45 INFO client.RMProxy: Connecting to ResourceManager at eb170/10.1.69.170:8032
14/07/04 15:37:46 INFO yarn.StormOnYarn: Copy App Master jar from local filesystem and add to local environment
14/07/04 15:37:47 INFO yarn.StormOnYarn: Set the environment for the application master
14/07/04 15:37:47 INFO yarn.StormOnYarn: YARN CLASSPATH COMMAND = [[yarn, classpath]]
14/07/04 15:37:47 INFO yarn.StormOnYarn: YARN CLASSPATH = [/home/ebupt/eb/hadoop-2.2.0/etc/hadoop:/home/ebupt/eb/hadoop-2.2.0/etc/hadoop:/home/ebupt/eb/hadoop-2.2.0/etc/hadoop:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/common/lib/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/common/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/hdfs:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/hdfs/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/yarn/lib/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/yarn/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/mapreduce/*:/home/ebupt/hadoop/contrib/capacity-scheduler/*.jar:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/yarn/*:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/yarn/lib/*]
14/07/04 15:37:47 INFO yarn.StormOnYarn: Using JAVA_HOME = [/home/ebupt/eb/jdk1.7.0_60]
14/07/04 15:37:47 INFO yarn.StormOnYarn: Setting up app master command:[/home/ebupt/eb/jdk1.7.0_60/bin/java, -Dstorm.home=./storm/storm-0.9.0-wip21/, -Dlogfile.name=<LOG_DIR>/master.log, com.yahoo.storm.yarn.MasterServer, 1><LOG_DIR>/stdout, 2><LOG_DIR>/stderr]
14/07/04 15:37:47 INFO impl.YarnClientImpl: Submitted application application_1402648970753_0025 to ResourceManager at eb170/10.1.69.170:8032
application_1402648970753_0025

Note: Because Storm runs on the cluster as a YARN application, an application ID is generated. In the output above it is application_1402648970753_0025.
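
The later storm-yarn commands all need this application ID, so it is convenient to capture it in a variable instead of copying it by hand. A small sketch follows; extract_app_id is a hypothetical helper, and it assumes the ID has the application_<number>_<number> form seen in the launch log and that your grep supports the -o option.

```shell
# Hypothetical helper: reads launch output on stdin and prints the
# last application ID that appears in it.
extract_app_id() {
    grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | tail -n 1
}

# Intended use (requires a working storm-yarn client):
#   APP_ID=$(storm-yarn launch ~/storm/storm-0.9.0-wip21/conf/storm.yaml 2>&1 | extract_app_id)

# Self-contained demonstration on two lines copied from the log above
printf '%s\n' \
    '14/07/04 15:37:47 INFO impl.YarnClientImpl: Submitted application application_1402648970753_0025 to ResourceManager at eb170/10.1.69.170:8032' \
    'application_1402648970753_0025' | extract_app_id
```

Taking the last match rather than the first follows the log layout above, where the bare ID is printed on the final line.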

 

6. Submit a Task to Storm

A. Obtain the cluster configuration

storm-yarn getStormConfig -appId application_1402648970753_0025 -output ~/storm.yaml
B. Obtain the Nimbus host
cat ~/storm.yaml | grep nimbus.host

C. Submit the topology
storm jar lib/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.WordCountTopology -c nimbus.host=<your nimbus host>
D. Monitor the topology
View the Storm UI at http://<your nimbus host>:7070


E. Kill the topology
storm kill [topology_name]
F. Shut down the Storm on YARN cluster
storm-yarn shutdown -appId [applicationId]

G. View the Storm process status: nimbus, supervisor, core, logviewer, and worker.

[[email protected] ~]$ jps
8700 JournalNode
8939 NodeManager
8805 DFSZKFailoverController
31802 worker
8501 NameNode
5189 Jps
31616 supervisor
8592 DataNode
28865 logviewer
31793 worker
31475 MasterServer
31795 worker
5841 HRegionServer
31509 nimbus
31510 core
31577 QuorumPeerMain

 

7. Lessons Learned

A. Sometimes the supervisor cannot be started because memory resources are insufficient. Pay particular attention to this in virtual machine environments.

B. nimbus.host: After you submit Storm to YARN, YARN decides where Nimbus runs, so you have to look up its address yourself.
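
For tip B, the lookup can be scripted once the configuration has been saved with getStormConfig as in section 6. The sketch below assumes the saved file contains a YAML-style line of the form nimbus.host: value (quoted or unquoted); if your file is single-line JSON, this pattern will not match. nimbus_host is a hypothetical helper name.

```shell
# Hypothetical helper: prints the value of the first nimbus.host entry
# in the given file, with surrounding double quotes removed.
nimbus_host() {
    sed -n 's/^nimbus\.host:[[:space:]]*"\{0,1\}\([^"[:space:]]*\).*$/\1/p' "$1" | head -n 1
}

# Intended use (requires a working storm client):
#   storm jar lib/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.WordCountTopology \
#       -c nimbus.host="$(nimbus_host ~/storm.yaml)"

# Self-contained demonstration with a throwaway file
demo=$(mktemp)
printf 'storm.zookeeper.port: 2182\nnimbus.host: "eb170"\n' > "$demo"
nimbus_host "$demo"
rm -f "$demo"
```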

 

8. References

https://github.com/yahoo/storm-yarn

http://dongxicheng.org/mapreduce-nextgen/storm-on-yarn/

http://blog.csdn.net/jiushuai/article/details/26693311

http://ghost-face.iteye.com/blog/2017374

 

9. Problems Encountered and Solutions

① The Storm cluster cannot be launched

[[email protected] conf]$ storm-yarn launch storm.yaml
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ebupt/eb/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ebupt/eb/storm-yarn/storm-0.9.0-wip21/lib/logback-classic-1.0.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" java.lang.UnsupportedClassVersionError: backtype/storm/utils/Utils : Unsupported major.minor version 51.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at com.yahoo.storm.yarn.Config.readStormConfig(Config.java:48)
        at com.yahoo.storm.yarn.LaunchCommand.process(LaunchCommand.java:59)
        at com.yahoo.storm.yarn.Client.execute(Client.java:142)
        at com.yahoo.storm.yarn.Client.main(Client.java:148)

Cause: the Java version is wrong. Class file version 51 corresponds to Java SE 7, but the lab environment runs JDK 1.6, while Storm on YARN requires JDK 1.7.

Solution: Upgrade the JDK to version 1.7.
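
The number in the error message is the class file major version, and mapping it to a Java release is simple arithmetic: Java SE N uses major version N + 44, so 50 is JDK 1.6 and 51 is JDK 1.7. A small sketch of that mapping follows; class_major_to_java is a hypothetical helper name.

```shell
# Hypothetical helper: converts a class file major version (for example
# the 51 in "Unsupported major.minor version 51.0") to the Java SE
# release number N, where N corresponds to JDK 1.N.
class_major_to_java() {
    case "$1" in
        ''|*[!0-9]*)
            echo "not a number: $1" >&2
            return 1
            ;;
    esac
    if [ "$1" -lt 45 ]; then
        echo "no Java release uses major version $1" >&2
        return 1
    fi
    echo $(( $1 - 44 ))
}

# Example: the version reported in the stack trace above
class_major_to_java 51
```

If the number printed here is higher than the JDK reported by java -version on the machine, the classes were compiled for a newer Java than the one running them.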

 

② Storm on YARN is submitted to YARN, but the application fails. The log is as follows.

User:    huangq
Name:    Storm-on-Yarn
Application Type:    YARN
State:    FAILED
FinalStatus:    FAILED
Started:    4-Jul-2014 10:14:15
Elapsed:    4 sec
Tracking URL:    History
Diagnostics:    Application application_1402648970753_0013 failed 2 times due to AM Container for appattempt_1402648970753_0013_000002 exited with exitCode: 126 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
.Failing this attempt.. Failing the application.

Cause: the JDK version on the machine that YARN assigned to run Storm on YARN was not 1.7. After the JDK version on that machine was changed, the error no longer occurred.

Solution: All nodes in the YARN cluster must run the same JDK version, 1.7.
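
One way to spot a mismatched node is to compare the version banner that java -version prints on each machine. The sketch below only parses that banner; jdk_version_from_banner is a hypothetical helper, and it assumes the banner contains a line such as java version "1.7.0_60". The ssh loop in the comment is an illustration that reuses the host names eb170 and eb171 from this article's configuration and assumes passwordless SSH to each node.

```shell
# Hypothetical helper: reads `java -version` output on stdin and prints
# the leading "major.minor" part of the quoted version string.
jdk_version_from_banner() {
    sed -n 's/^.* version "\([0-9][0-9]*\.[0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Intended use across nodes (java -version writes to stderr, hence 2>&1):
#   for host in eb170 eb171; do
#       echo "$host: $(ssh "$host" java -version 2>&1 | jdk_version_from_banner)"
#   done

# Self-contained demonstration on a sample banner line
printf 'java version "1.7.0_60"\n' | jdk_version_from_banner
```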

 

③ After the Storm cluster is launched with storm-yarn launch ~/storm/storm-0.9.0-wip21/conf/storm.yaml, no supervisor process appears in the Storm UI.

Cause: After Storm starts, the nimbus and core processes are loaded, but the supervisor process cannot be loaded.

This is because "the supervisor exclusively occupies its node and does not share node resources with other services". The lab's Hadoop 2.0 test cluster has only two machines, so no node is available for the supervisor to occupy exclusively, and it fails to start.

Solution: add more nodes to the cluster. A compromise for testing is to start the supervisor manually with: storm supervisor &. The drawback is that you must manage the supervisor process yourself, including killing it and releasing its resources.

 

④ During Maven compilation, an error reports that a domain name cannot be resolved and the POM download fails. Compilation succeeds on machine 240, but the problem occurs on cluster machine 170. Not solved yet.

Cause: The first suspicion was a DNS problem, with the domain name of the Maven repository not being resolved. A second attempt copied the local repository from machine 240 to machine 170, but the problem persisted on machine 170.

Solution: None yet. The Maven tool needs to be studied more thoroughly when time permits.
