Download
Download the Storm-yarn source from GitHub
https://github.com/yahoo/storm-yarn
Compiling
Prerequisites: install the JDK and Maven. Unzip storm-yarn-master.zip and modify the Storm and Hadoop versions in pom.xml:
<properties>
    <storm.version>0.9.0</storm.version>
    <hadoop.version>2.5.0-cdh5.3.0</hadoop.version>
</properties>
Note: the https://clojars.org/repo/storm repository only publishes the storm-core and storm-netty dependencies up to version 0.9.0.1, so the Storm version here was dropped to 0.9.0.
Because I'm using Cloudera's Hadoop here, I also add Cloudera's Maven repository:
<repository>
    <id>cdh.repo</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
    <name>Cloudera Repositories</name>
</repository>
Use Maven to compile the source code:
$ mvn clean package -DskipTests
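If the build succeeds, the packaged jar should appear under target/ (Maven's default output directory), and lib/ should already contain the storm.zip used in the next step; a quick sanity check, assuming you are in the storm-yarn-master directory:
$ ls target/*.jar
$ ls lib/storm.zip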
Deployment
Unzip storm.zip, which is in the storm-yarn-master/lib directory. This storm.zip is actually just a symbolic link pointing to storm-0.9.0-wip21.zip.
$ unzip storm.zip
Add the bin directories of storm-yarn-master and storm-0.9.0-wip21 to the system environment variables:
# STORM_YARN_HOME
export STORM_YARN_HOME=/home/hadoop/compile/storm-yarn-master
export PATH=$PATH:$STORM_YARN_HOME/bin:$STORM_YARN_HOME/lib/storm-0.9.0-wip21/bin
Remember to make the configured environment variables take effect:
source /etc/profile
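To verify that the PATH change took effect, check that both launcher scripts resolve (a simple sanity check; the resolved paths depend on where you unpacked everything):
$ which storm-yarn
$ which storm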
Add the additional jar packages that your Storm topologies need (for example, the MySQL driver jar) to storm-0.9.0-wip21/lib, then re-compress the storm.zip file and upload it to the designated directory in HDFS. This is very important: nodes in the cluster obtain their working environment by fetching storm.zip from HDFS. Go to the /home/hadoop/compile/storm-yarn-master/lib directory:
$ zip -r storm.zip storm-0.9.0-wip21
$ hdfs dfs -mkdir -p /lib/storm/0.9.0-wip21/
$ hdfs dfs -put storm.zip /lib/storm/0.9.0-wip21/
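To confirm the upload, list the HDFS directory:
$ hdfs dfs -ls /lib/storm/0.9.0-wip21/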
Run
Modify storm.yaml in the "/opt/modules/storm-yarn-master/lib/storm-0.9.0-wip21/conf" directory; only the ZooKeeper addresses need to be changed:
storm.zookeeper.servers:
  - "hadoop-yarn01.dimensoft.com.cn"
  - "hadoop-yarn02.dimensoft.com.cn"
  - "hadoop-yarn03.dimensoft.com.cn"
Submit Storm to YARN:
$ storm-yarn launch conf/storm.yaml
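Once submitted, the application (and the application ID needed by the storm-yarn commands later) can also be checked with the standard YARN CLI:
$ yarn application -list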
Viewing the job through the YARN web interface, it fails with the following error:
15/10/13 16:22 ERROR auth.ThriftServer: ThriftServer is being stopped due to: org.apache.thrift7.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9000.
org.apache.thrift7.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9000.
    at org.apache.thrift7.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java)
    at org.apache.thrift7.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java)
    at org.apache.thrift7.transport.TNonblockingServerSocket.<init>(TNonblockingServerSocket.java)
    at backtype.storm.security.auth.SimpleTransportPlugin.getServer(SimpleTransportPlugin.java:47)
    at backtype.storm.security.auth.ThriftServer.serve(ThriftServer.java:52)
    at com.yahoo.storm.yarn.MasterServer.main(MasterServer.java:175)
This is because the node where storm-yarn is deployed also runs the CM server and the CM server database, which already occupy port 9000. When the Storm ApplicationMaster initializes, it starts the Nimbus and Storm Web UI services in its own container, then requests resources according to the number of supervisors to be launched. In the current implementation, the ApplicationMaster requests all of a node's resources before starting the Supervisor service, so each supervisor monopolizes its node and does not share resources with other services; this keeps other services from interfering with the Storm cluster. Besides running Storm Nimbus and the Web UI, the Storm ApplicationMaster also launches a Thrift server to handle requests from the storm-yarn client. To resolve the port conflict, change the Thrift port number directly in storm.yaml:
master.thrift.port: 9002
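Before resubmitting, you can confirm that port 9000 is indeed occupied on that node and that the replacement port is free; a quick check, assuming netstat is available:
$ netstat -tlnp | grep ':9000'
$ netstat -tlnp | grep ':9002'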
Submit again; this time it succeeds. View the YARN Web UI.
From YARN's Web UI you can see directly which node the job is running on; that node hosts the Storm cluster's UI. For example, if it is node 192.168.100.154, then the Storm UI is:
http://192.168.100.154:7070
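A quick way to confirm the Storm UI is reachable from the command line (the IP is just the example node above):
$ curl -s http://192.168.100.154:7070 | head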
Available commands
storm-yarn [command] --appId [appId] --output [file] [--supervisors [n]]
For the specific commands, see the table below. The --appId parameter is the application ID of the launched Storm instance, and --supervisors is the number of supervisor services to add, which is only valid for the addSupervisors command.
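As a hedged illustration of the syntax above (the application ID is a placeholder; addSupervisors is the subcommand mentioned above, and getStormConfig is assumed here to be the subcommand in this storm-yarn build that downloads the running cluster's storm.yaml):
# Placeholder appId; use the one reported by "yarn application -list"
$ storm-yarn getStormConfig --appId application_1444000000000_0001 --output my-storm.yaml
$ storm-yarn addSupervisors --appId application_1444000000000_0001 --supervisors 2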