This article describes how to deploy Apache Spark to Hadoop 2.2.0. If your Hadoop is another version, such as CDH4, you can follow the official documentation directly.
There are two points to note: (1) Hadoop must be from the 2.0 series, such as 0.23.x, 2.0.x, 2.x.x, CDH4, or CDH5. Running Spark on Hadoop essentially means running Spark on Hadoop YARN, because Spark itself provides only job management; resource scheduling is delegated to a third-party system such as YARN or Mesos. (2) YARN is chosen here rather than Mesos because YARN has strong community support and has gradually become the de facto standard for resource management systems.
Note that the 0.8.1 release is now official, so you can choose and download the appropriate version directly; if you are using Hadoop 2.2.0 or CDH5, a matching build can also be downloaded directly.
Deploying Spark to Hadoop 2.2.0 requires the following steps:
Step 1: Prepare the basic software
Step 2: Download and compile Spark 0.8.1 or later
Step 3: Run the Spark instance
The following sections describe these steps in detail.
Step 1: Prepare the basic software
(1) Basic software
This includes a Linux operating system, Hadoop 2.2.0 or higher, and Maven 3.0.4 (or the latest 3.0.x release). Hadoop 2.2.0 only needs the simplest installation; for details, see my article "Hadoop YARN installation and deployment". Installing Maven is straightforward: download a binary release from http://maven.apache.org/download.cgi, extract it, and configure the MAVEN_HOME and PATH environment variables (you can find guides online, such as "Linux installation maven"). Note that the version must be 3.0.x; Spark has strict requirements on the Maven version.
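A minimal sketch of the Maven environment setup described above. The install path is an assumption; point MAVEN_HOME at wherever you actually extracted the Maven 3.0.x tarball.

```shell
#!/bin/sh
# Sketch: configure Maven after extracting the binary tarball.
# /usr/local/apache-maven-3.0.5 is an assumed path -- adjust to your install.
export MAVEN_HOME=/usr/local/apache-maven-3.0.5
export PATH="$MAVEN_HOME/bin:$PATH"

# To verify, run: mvn -version
# (it should report a 3.0.x version)
```

Adding these two exports to your shell profile (e.g. ~/.bashrc) makes the setting persistent across sessions.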
(2) Hardware preparation
Spark 0.8.1 introduced a new-yarn profile to support Hadoop 2.2.0. Because of incompatible API changes in Hadoop 2.2.0, Spark must be compiled and packaged separately with Maven. The compilation is very slow (about 2 hours on an ordinary machine) and uses a lot of memory, so you need a build machine that meets the following conditions:
Condition 1: Network access. On the first compilation, Maven needs to download many jar packages from the Internet, which is slow; if your network connection is unreliable, it is best not to attempt the build.
Condition 2: More than 2 GB of memory.
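The conditions above can be checked before kicking off the long build. A small pre-flight sketch (Linux-specific, since it reads /proc/meminfo; the 2 GB threshold mirrors Condition 2):

```shell
#!/bin/sh
# Pre-flight check before starting the Spark build (sketch).
# Reads total RAM in MB from /proc/meminfo (Linux only).
mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
if [ "$mem_mb" -lt 2048 ]; then
  echo "Warning: only ${mem_mb} MB RAM; the Spark build may fail." >&2
else
  echo "Memory check passed: ${mem_mb} MB available."
fi
```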
Step 2: Download and compile Spark 0.8.1 or later
You can fetch Spark 0.8.1 with Git, or download it directly with wget:
wget https://github.com/apache/incubator-spark/archive/v0.8.1-incubating.zip
Note that versions before 0.8.1 did not support Hadoop 2.2.0; support was added in 0.8.1.
After downloading, unzip it:
unzip v0.8.1-incubating.zip
Then enter the extracted directory and run the following commands:
cd incubator-spark-0.8.1-incubating
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
mvn -Dyarn.version=2.2.0 -Dhadoop.version=2.2.0 -Pnew-yarn -DskipTests package
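Once the build finishes, it is worth confirming that an assembly jar was actually produced before moving on. A sketch of such a check; the exact jar location is an assumption (for Spark 0.8.1 it typically lands under assembly/target/scala-2.9.3/):

```shell
#!/bin/sh
# Sketch: verify the Maven build produced a Spark assembly jar.
# The jar name/location is an assumption for Spark 0.8.1.
check_build() {
  dir=${1:-.}
  jar=$(find "$dir" -name 'spark-assembly*.jar' 2>/dev/null | head -n 1)
  if [ -n "$jar" ]; then
    echo "Build artifact found: $jar"
    return 0
  else
    echo "No assembly jar found under $dir; the build may have failed." >&2
    return 1
  fi
}

# Usage, from the source tree root after the mvn command above:
#   check_build .
```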
This generally takes a long time. After compilation completes, package the Spark kernel into a standalone jar with the following command: