[Spark Asia Pacific Research Institute Series] The Path to Spark Practice - Chapter 1: Building a Spark Cluster (Step 3) (1)

Step 1: Software required by the Spark cluster

We build the Spark cluster on top of the Hadoop cluster built from scratch in Articles 1 and 2. We use Spark 1.0.0, released on May 30, 2014 and the latest version at the time of writing. The software required to build a cluster based on Spark 1.0.0 is as follows:

 

1. Spark 1.0.0. The author uses spark-1.0.0-bin-hadoop1.tgz here; the download URL is http://d3kbcqa49mib13.cloudfront.net/spark-1.0.0-bin-hadoop1.tgz

The author stores the downloaded package on the master node.

2. The Scala version corresponding to Spark 1.0.0. The official requirement is that Scala must be 2.10.x.

The author downloaded Scala 2.10.4 from the official download page http://www.scala-lang.org/download/2.10.4.html and saved it on the master node; a download sketch is shown below.
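The original screenshots are not reproduced here, so the following is a minimal sketch of fetching both packages on the master node. The Spark URL is the one given above; the ~/downloads directory and the direct Scala archive URL are assumptions for illustration (the official page linked above provides the actual download).

```bash
# Download Spark 1.0.0 (pre-built for Hadoop 1) and Scala 2.10.4 on the master node.
# ~/downloads and the direct Scala archive URL are assumptions for illustration.
mkdir -p ~/downloads && cd ~/downloads
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.0.0-bin-hadoop1.tgz
wget http://www.scala-lang.org/files/archive/scala-2.10.4.tgz
```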

Step 2: Install the software

Install Scala

  1. Open a terminal and create a new directory "/usr/lib/Scala".

2. Decompress the Scala package.

Move the extracted Scala directory into the newly created "/usr/lib/Scala"; a command sketch for these two steps follows.
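Since the screenshots are not included, here is a minimal sketch of these two steps, assuming the downloaded scala-2.10.4.tgz is in the current directory:

```bash
# Create the target directory (root privileges are needed under /usr/lib).
sudo mkdir -p /usr/lib/Scala

# Decompress the Scala package and move the extracted directory into /usr/lib/Scala.
tar -zxvf scala-2.10.4.tgz
sudo mv scala-2.10.4 /usr/lib/Scala/
```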

3. Modify environment variables:

Open the configuration file ~/.bashrc with the vim editor:

Press "I" to enter the insert mode and add the scala environment compiling information, as shown in:

From the configuration file we can see that "SCALA_HOME" is set and the Scala bin directory is added to "PATH"; a sketch of the entries follows.
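The exact lines in the author's screenshot are not visible here; under the layout created above, the entries in ~/.bashrc would look roughly like this:

```bash
# Scala environment variables appended to ~/.bashrc
export SCALA_HOME=/usr/lib/Scala/scala-2.10.4
export PATH=${SCALA_HOME}/bin:$PATH
```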

Press the "ESC" key to return to normal mode, save and exit the configuration file:

Run the following command to make the changes to the configuration file take effect:
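A sketch of that command, assuming the changes were made in ~/.bashrc:

```bash
# Reload ~/.bashrc so SCALA_HOME and the updated PATH take effect in the current shell.
source ~/.bashrc
```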

4. Display the installed Scala version in the terminal.

We found that the version is "2.10.4", which is what we expect.
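Since the screenshot is not reproduced, the check looks roughly like this (the output line is approximate):

```bash
# Print the installed Scala version.
scala -version
# Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL
```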

When we enter the "scala" command, we go directly into the Scala interactive command-line interface (the REPL):

Here, enter the expression "9*9":

Scala correctly calculates the result for us.
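A sketch of the interactive session (prompts and output abbreviated):

```bash
# Start the Scala REPL and evaluate a simple expression.
scala
# scala> 9*9
# res0: Int = 81
```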

Now we have installed Scala on the master.

 

Because Spark runs on the master, slave1, and slave2 machines, we need the same Scala installation on slave1 and slave2. Use the scp command to copy the Scala installation directory and "~/.bashrc" from the master to the same locations on slave1 and slave2 (a sketch is shown below). Of course, you can also repeat the manual installation on slave1 and slave2, just as on the master.
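A minimal sketch of the copy, run from the master. The root user is an assumption (use whichever account owns the installation on the slaves), and writing under /usr/lib requires appropriate privileges:

```bash
# Copy the Scala installation and the updated ~/.bashrc from the master to both slaves.
scp -r /usr/lib/Scala root@slave1:/usr/lib/
scp -r /usr/lib/Scala root@slave2:/usr/lib/
scp ~/.bashrc root@slave1:~/
scp ~/.bashrc root@slave2:~/
```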

 

After Scala is installed on slave1 and slave2, the same version check succeeds on both machines.
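The screenshots of those tests are not included; run from the master, the check on each slave looks roughly like this (the absolute path avoids depending on the slave's login environment):

```bash
# Check the Scala version installed on each slave from the master node.
ssh slave1 /usr/lib/Scala/scala-2.10.4/bin/scala -version
ssh slave2 /usr/lib/Scala/scala-2.10.4/bin/scala -version
```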

So far, Scala has been successfully deployed on the master, slave1, and slave2 machines.
