Start by creating a new Maven project in Eclipse Java EE. Click Finish to create the project, then change the default JDK from 1.5 to 1.8.
Then edit pom.xml and add the spark-core dependency:
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.2.1</version>
</dependency>
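For context, this dependency sits inside a minimal pom.xml roughly like the following sketch. The groupId/artifactId values match the book's example project; the maven-compiler-plugin settings are my assumption, added to pin the JDK 1.8 source level mentioned above:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.oreilly.learningsparkexamples.mini</groupId>
    <artifactId>learning-spark-mini-example</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.1</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <!-- Assumption: pin the compiler to JDK 1.8 instead of the default 1.5 -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>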
Then copy the sample source code from the book. The book targets Spark 1.2, while my environment runs Spark 2.2.1, so the code must be modified for the newer Spark API. In particular, in Spark 2.x the Java FlatMapFunction's call() method returns an Iterator instead of an Iterable:
JavaRDD<String> words = input.flatMap(
    new FlatMapFunction<String, String>() {
        public Iterator<String> call(String x) {
            return Arrays.asList(x.split(" ")).iterator();
        }
    });
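The splitting logic inside call() is plain Java and can be checked on its own, outside Spark. The class and method names below are mine, for illustration only; the body mirrors the Spark 2.x snippet above:

```java
import java.util.Arrays;
import java.util.Iterator;

public class Tokenize {
    // Same logic as the call() body above: split the line on single
    // spaces and expose the tokens as an Iterator<String>, which is
    // what Spark 2.x's FlatMapFunction#call must return (Spark 1.x
    // expected an Iterable<String> instead).
    public static Iterator<String> tokenize(String line) {
        return Arrays.asList(line.split(" ")).iterator();
    }

    public static void main(String[] args) {
        Iterator<String> it = tokenize("hello spark world");
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```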
Then run mvn install, go to E:\developtools\eclipse-jee-neon-3-win32\workspace\learning-spark-mini-example\target, find learning-spark-mini-example-0.0.1-SNAPSHOT.jar, and upload it to a directory on the Linux machine running Spark 2.2.1.
Then execute the following command on Linux:
# spark-submit \
>   --class com.oreilly.learningsparkexamples.mini.java.WordCount \
>   learning-spark-mini-example-0.0.1-SNAPSHOT.jar \
>   /opt/spark-2.2.1-bin-hadoop2.7/README.md wordcounts