Java invokes shell commands and scripts dedicated to Hadoop/spark clusters


Objective

Note that this post builds on the following articles, adapted here for my big data work:

http://kongcodecenter.iteye.com/blog/1231177

http://blog.csdn.net/u010376788/article/details/51337312

http://blog.csdn.net/arkblue/article/details/7897396

The first type: common practice

First, write the WordCount.scala program.

   Then, package it into a jar named wc.jar. For example, here I export it to the Windows desktop.

   Next, upload it to the Linux desktop; the input file spark.txt goes into the HDFS / directory.

   Finally, under the bin directory of the Spark installation, run:

spark-submit \
> --class cn.spark.study.core.WordCount \
> --master local[1] \
> /home/spark/desktop/wc.jar \
> hdfs://sparksinglenode:9000/spark.txt \
> hdfs://sparksinglenode:9000/wcout

The second type: the advanced practice

Sometimes when we run Java programs on Linux, we need to invoke shell commands and scripts. The Runtime.getRuntime().exec() method provides this capability; Runtime.getRuntime() exposes several exec() overloads (taking a command string or a string array, with optional environment variables and working directory).
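As a minimal illustration of how Runtime.getRuntime().exec() behaves (a sketch, assuming a Linux machine with echo on the PATH; the class name ExecDemo is made up for this example):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ExecDemo {
    // Runs a command line and returns the first line it prints (null if none).
    static String firstLine(String commandLine) throws Exception {
        // The String overload splits the command on whitespace before launching it.
        Process p = Runtime.getRuntime().exec(commandLine);
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine(); // read the output first, then wait for exit
            p.waitFor();
            return line;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstLine("echo hello")); // prints "hello"
    }
}
```

Because the String overload tokenizes on whitespace, arguments containing spaces need the String[] overload instead.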

Without further ado, let's dive right in.

  Step one: for the sake of convention, name it JavaShellUtil.java, and write it locally.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;


public class JavaShellUtil {
    public static void main(String[] args) throws Exception {

        String cmd = "hdfs://sparksinglenode:9000/spark.txt";
        InputStream in = null;

        try {
            // Note the spaces: "sh <script> <arg>" must be whitespace-separated.
            Process pro = Runtime.getRuntime().exec("sh /home/spark/test.sh " + cmd);
            pro.waitFor();
            in = pro.getInputStream();
            BufferedReader read = new BufferedReader(new InputStreamReader(in));
            String result = read.readLine();
            System.out.println("INFO:" + result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
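One caveat with the pattern above: calling waitFor() before draining the stream can deadlock if the script prints more output than the pipe buffer holds. A sketch of a more robust variant using ProcessBuilder (the class name PbDemo and the demo command are assumptions for this example, not part of the original utility):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class PbDemo {
    // Runs a command and collects every line of its combined stdout/stderr.
    static List<String> runAndCollect(String... cmd) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);              // merge stderr into stdout
        Process p = pb.start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);                   // drain output BEFORE waitFor()
            }
        }
        p.waitFor();
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // e.g. runAndCollect("sh", "/home/spark/test.sh", inputPath)
        runAndCollect("sh", "-c", "echo a; echo b").forEach(System.out::println);
    }
}
```

Passing the command as separate strings also avoids the whitespace-splitting surprises of exec(String).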

package cn.spark.study.core

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
 * @author Administrator
 */
object WordCount {

  def main(args: Array[String]) {
    if (args.length < 2) {
      println("arguments must be at least 2")
      System.exit(1)
    }
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setMaster("local") // local mode, not distributed, i.e. works under both Windows and Linux
    val sc = new SparkContext(conf)

    val inputPath = args(0)
    val outputPath = args(1)

    val lines = sc.textFile(inputPath, 1)
    val words = lines.flatMap { line => line.split(" ") }
    val pairs = words.map { word => (word, 1) }
    val wordCounts = pairs.reduceByKey(_ + _)
    wordCounts.collect().foreach(println)
    wordCounts.repartition(1).saveAsTextFile(outputPath)
  }
}

  Step two: write the test.sh script

spark@sparksinglenode:~$ cat test.sh
#!/bin/sh
/usr/local/spark/spark-1.5.2-bin-hadoop2.6/bin/spark-submit \
--class cn.spark.study.core.WordCount \
--master local[1] \
/home/spark/desktop/wc.jar \
$1 hdfs://sparksinglenode:9000/wcout

  Step three: upload JavaShellUtil.java and the packaged wc.jar

spark@sparksinglenode:~$ pwd
/home/spark
spark@sparksinglenode:~$ ls
Desktop Downloads Pictures Templates Videos
Documents Music Public test.sh
spark@sparksinglenode:~$ cd Desktop/
spark@sparksinglenode:~/Desktop$ ls
JavaShellUtil.java wc.jar
spark@sparksinglenode:~/Desktop$ javac JavaShellUtil.java
spark@sparksinglenode:~/Desktop$ java JavaShellUtil
INFO:(hadoop,1)
spark@sparksinglenode:~/Desktop$ cd /usr/local/hadoop/hadoop-2.6.0/

  Step four: view the output results

spark@sparksinglenode:/usr/local/hadoop/hadoop-2.6.0$ bin/hadoop fs -cat /wcout/par*
(hadoop,1)
(hello,5)
(storm,1)
(spark,1)
(hive,1)
(hbase,1)
spark@sparksinglenode:/usr/local/hadoop/hadoop-2.6.0$

Success!

About passing arguments to shell scripts, see:

http://www.runoob.com/linux/linux-shell-passing-arguments.html
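To make the argument passing concrete, here is a self-contained sketch (the class name ArgDemo and the temporary script are hypothetical stand-ins for test.sh; Java 11+ is assumed for Files.writeString) showing how the string appended after the script path in exec() arrives in the script as $1:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ArgDemo {
    // Writes a tiny script that echoes its first positional parameter,
    // then invokes it the same way JavaShellUtil invokes test.sh.
    static String passArg(String arg) throws Exception {
        Path script = Files.createTempFile("argdemo", ".sh");
        Files.writeString(script, "#!/bin/sh\necho \"got: $1\"\n");
        Process p = Runtime.getRuntime().exec("sh " + script + " " + arg);
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine();
            p.waitFor();
            Files.delete(script);
            return line;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(passArg("hdfs://sparksinglenode:9000/spark.txt"));
    }
}
```

This mirrors the structure of test.sh above: whatever path the Java side appends to the command line becomes $1 inside the script.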

One last word: this technique is not limited to the example above; it can be woven into our future production work wherever a call out to the shell is needed, and it is very practical!
