Apache Spark Source Code Reading 9 -- Spark Source Code Compilation

You are welcome to reprint this article; please credit the source, huichiro.

Summary

There is usually not much to say about compiling source code: for a Java project, a simple Maven or Ant command does the job. With Spark, however, things are not so simple; following the official Spark documentation, the build always fails with one compilation error or another, which is annoying.

Today I had some spare time, so I tried again, and this time it worked. I am recording the steps here for later reference.

Preparation

My build machine runs Arch Linux, with the following software installed:

  1. Scala 2.11
  2. Maven
  3. Git
Download source code

The first step is to download the source code from GitHub:

git clone https://github.com/apache/spark.git
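
If you prefer to build a released version rather than the tip of master, you can check out the corresponding tag after cloning. A minimal sketch; the tag name below is only an example, list the actual tags with git tag:

cd spark
git tag                  # list the available release tags
git checkout v1.0.0      # example tag; pick whichever release you want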
Source code compilation

Instead of invoking Maven or SBT directly, you can use the build script that ships with Spark, make-distribution.sh:

export SCALA_HOME=/usr/share/scala
cd $SPARK_HOME
./make-distribution.sh

If everything goes well, the assembly jar is generated under the $SPARK_HOME/assembly/target/scala-2.10 directory, for example:

assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar
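
make-distribution.sh also accepts options for choosing the Hadoop version and packaging the result. A sketch based on the flags in the Spark 1.0-era script; check the usage notes at the top of the script for your version:

# Build against Hadoop 2.2.0 with YARN support and package the result as a .tgz
./make-distribution.sh --hadoop 2.2.0 --with-yarn --tgz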
Compile with SBT

The main reason SBT compilation fails is that some jar files cannot be downloaded because of the GFW. The solution is to add a proxy.

There are several ways to set a proxy, and it is hard to say in advance which one will work, so try them one by one against the latest Spark.

Method 1: set the http_proxy environment variable by running the following command.

export http_proxy=http://proxy-server:port
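
Some of the repositories the build pulls from are served over HTTPS, so exporting https_proxy as well may help (an assumption on my part; adjust to your environment):

export https_proxy=http://proxy-server:port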

Method 2: set JAVA_OPTS.

JAVA_OPTS="-Dhttp.proxyHost=proxy-server -Dhttp.proxyPort=portNumber"
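
With the proxy in place, the SBT build is launched from the top of the source tree. A sketch, assuming the sbt launcher script bundled with the Spark source (sbt/sbt):

export JAVA_OPTS="-Dhttp.proxyHost=proxy-server -Dhttp.proxyPort=portNumber"
cd $SPARK_HOME
sbt/sbt assembly    # builds the assembly jar through SBT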
Run Test Cases

Now that the jar can finally be compiled, the natural next step is to modify a line or two of code and see the effect. The best way to verify that your change has actually taken effect is to run the test cases.

Assume that some source code under $SPARK_HOME/core has been modified; to recompile, run the following commands:

export SCALA_HOME=/usr/share/scala
mvn package -DskipTests
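
Rebuilding the whole tree after a small change in core is slow. Maven's reactor options can restrict the build to a single module; a sketch using the standard -pl and -am flags (-am also rebuilds the in-tree modules core depends on):

# Rebuild only the core module plus the modules it depends on
mvn package -DskipTests -pl core -am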

If you want to run the RandomSamplerSuite test suite under the $SPARK_HOME/core directory, run the following commands.

export SPARK_LOCAL_IP=127.0.0.1
export SPARK_MASTER_IP=127.0.0.1
mvn -Dsuites=org.apache.spark.util.random.RandomSamplerSuite test
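
The same suite can also be run through SBT. A sketch, assuming the bundled sbt launcher and the core project name used by the Spark 1.0-era SBT build:

export SPARK_LOCAL_IP=127.0.0.1
export SPARK_MASTER_IP=127.0.0.1
sbt/sbt "core/test-only org.apache.spark.util.random.RandomSamplerSuite"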
