Objective
Note that this blog post draws on the following posts, adapted here for my big data work:
http://kongcodecenter.iteye.com/blog/1231177
http://blog.csdn.net/u010376788/article/details/51337312
http://blog.csdn.net/arkblue/article/details/7897396
The first method: the common practice
First, write the WordCount.scala program.
Then, package it into a jar named wc.jar. For example, here I export it to the Windows desktop.
Next, upload it to the Linux desktop; the input file spark.txt goes to the HDFS / directory.
Finally, under the bin directory of the Spark installation, run:
spark-submit \
--class cn.spark.study.core.WordCount \
--master local[1] \
/home/spark/desktop/wc.jar \
hdfs://sparksinglenode:9000/spark.txt \
hdfs://sparksinglenode:9000/wcout
The second method: the advanced practice
Sometimes, when running Java programs on Linux, we need to invoke shell commands and scripts. The Runtime.getRuntime().exec() method gives us exactly this capability; Runtime.getRuntime() provides several exec() overloads for it.
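For reference, java.lang.Runtime defines six exec() overloads, taking either a whole command string or a pre-split argument array, plus an optional environment and working directory. Here is a minimal sketch using the simplest overload, exec(String); the `echo hello` command and the class name ExecDemo are just for illustration:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ExecDemo {
    // Run a command via the simplest overload, exec(String), which splits
    // the command string on whitespace, and return its first output line.
    // The other overloads are:
    //   exec(String[] cmdarray)
    //   exec(String command, String[] envp)
    //   exec(String[] cmdarray, String[] envp)
    //   exec(String command, String[] envp, File dir)
    //   exec(String[] cmdarray, String[] envp, File dir)
    static String firstLine(String command) throws Exception {
        Process p = Runtime.getRuntime().exec(command);
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine();
            p.waitFor();
            return line;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstLine("echo hello"));  // prints "hello"
    }
}
```

Note that exec(String) splits only on whitespace, so it is fine for simple commands like the one below, but arguments containing spaces need the String[] overloads.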
Without further ado, let's get right into it.
Step one: for the sake of convention, name it JavaShellUtil.java, and write it locally.
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

public class JavaShellUtil {
    public static void main(String[] args) throws Exception {
        // HDFS input path, passed to the script as its first argument
        String cmd = "hdfs://sparksinglenode:9000/spark.txt";
        InputStream in = null;
        try {
            // Note the spaces around the script path, so the command
            // becomes: sh /home/spark/test.sh hdfs://...
            Process pro = Runtime.getRuntime().exec("sh /home/spark/test.sh " + cmd);
            pro.waitFor();
            in = pro.getInputStream();
            BufferedReader read = new BufferedReader(new InputStreamReader(in));
            String result = read.readLine();
            System.out.println("INFO:" + result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
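As an aside, because exec(String) splits the command on whitespace, arguments containing spaces get mangled. A more robust alternative is java.lang.ProcessBuilder, which takes each argument separately and can merge stderr into stdout. A minimal sketch (the `echo` command stands in for the real script, and ProcessBuilderUtil is an illustrative name):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ProcessBuilderUtil {
    // Run a command given as separate arguments, print each output line,
    // and return the exit code. redirectErrorStream(true) merges stderr
    // into stdout, so error messages are not silently dropped.
    static int run(String... cmd) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);
        Process p = pb.start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println("INFO:" + line);
            }
        }
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // For the real job this would be:
        // run("sh", "/home/spark/test.sh", "hdfs://sparksinglenode:9000/spark.txt");
        int code = run("echo", "hello world");
        System.out.println("exit code: " + code);
    }
}
```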
For reference, here is the WordCount.scala program that was packaged into wc.jar:

package cn.spark.study.core

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
 * @author Administrator
 */
object WordCount {
  def main(args: Array[String]) {
    if (args.length < 2) {
      println("arguments must be at least 2")
      System.exit(1)
    }
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setMaster("local") // local mode, i.e. not distributed; works under both Windows and Linux
    val sc = new SparkContext(conf)
    val inputPath = args(0)
    val outputPath = args(1)
    val lines = sc.textFile(inputPath, 1)
    val words = lines.flatMap { line => line.split(" ") }
    val pairs = words.map { word => (word, 1) }
    val wordCounts = pairs.reduceByKey(_ + _)
    wordCounts.collect().foreach(println)
    wordCounts.repartition(1).saveAsTextFile(outputPath)
  }
}
Step two: write the test.sh script
spark@sparksinglenode:~$ cat test.sh
#!/bin/sh
/usr/local/spark/spark-1.5.2-bin-hadoop2.6/bin/spark-submit \
--class cn.spark.study.core.WordCount \
--master local[1] \
/home/spark/desktop/wc.jar \
$1 hdfs://sparksinglenode:9000/wcout
Step three: upload JavaShellUtil.java and the packaged wc.jar
spark@sparksinglenode:~$ pwd
/home/spark
spark@sparksinglenode:~$ ls
Desktop  Downloads  Pictures  Templates  Videos
Documents  Music  Public  test.sh
spark@sparksinglenode:~$ cd Desktop/
spark@sparksinglenode:~/Desktop$ ls
JavaShellUtil.java  wc.jar
spark@sparksinglenode:~/Desktop$ javac JavaShellUtil.java
spark@sparksinglenode:~/Desktop$ java JavaShellUtil
INFO:(hadoop,1)
spark@sparksinglenode:~/Desktop$ cd /usr/local/hadoop/hadoop-2.6.0/
Step four: view the output results
spark@sparksinglenode:/usr/local/hadoop/hadoop-2.6.0$ bin/hadoop fs -cat /wcout/par*
(hadoop,1)
(hello,5)
(storm,1)
(spark,1)
(hive,1)
(hbase,1)
spark@sparksinglenode:/usr/local/hadoop/hadoop-2.6.0$
Success!
About passing parameters to shell scripts, see
http://www.runoob.com/linux/linux-shell-passing-arguments.html
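To see concretely how the HDFS path appended in Java arrives in the script as $1, here is a small self-contained sketch. It writes a throwaway script (a stand-in for test.sh, since the real one needs a Spark installation) and invokes it the same way JavaShellUtil does; the class name ShellArgsDemo is illustrative:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.InputStreamReader;

public class ShellArgsDemo {
    // Write a tiny script that echoes its first positional parameter,
    // run it with the HDFS path appended (as in JavaShellUtil), and
    // return the line the script prints.
    static String firstArgSeenBy(String hdfsPath) throws Exception {
        File script = File.createTempFile("argdemo", ".sh");
        try (FileWriter w = new FileWriter(script)) {
            w.write("#!/bin/sh\necho \"first arg: $1\"\n");
        }
        Process p = Runtime.getRuntime()
                .exec("sh " + script.getAbsolutePath() + " " + hdfsPath);
        String line;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            line = r.readLine();
        }
        p.waitFor();
        script.delete();
        return line;
    }

    public static void main(String[] args) throws Exception {
        // The value appended after the script name becomes $1 inside it.
        System.out.println(firstArgSeenBy("hdfs://sparksinglenode:9000/spark.txt"));
        // prints "first arg: hdfs://sparksinglenode:9000/spark.txt"
    }
}
```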
One last word: this technique is not limited to this example, and it can be woven into our future production work. Wherever such a call is needed, it is very practical!
Java invoking shell commands and scripts, applied to Hadoop/Spark clusters