Spark and SparkR Deployment

Source: Internet
Author: User

1. Configuring the Java Environment

tar -zxvf jdk-8u77-linux-x64.tar.gz -C /opt/java/

vi /etc/profile

export JAVA_HOME=/opt/java/jdk1.8.0_77
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

source /etc/profile
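After reloading the profile, it can help to confirm that the variable points at a real JDK. The following is a minimal sketch, not part of the original steps; `check_java_home` is a hypothetical helper name, and the only thing it tests is that `bin/java` exists and is executable under the given directory.

```shell
# Report whether a directory looks like a usable JDK home.
# Succeeds (exit status 0) only if <dir>/bin/java exists and is executable.
check_java_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $dir"
}

# Check the current shell's setting; print a hint rather than aborting.
check_java_home "${JAVA_HOME:-}" || echo "fix the exports, then re-run 'source /etc/profile'" >&2
```

If the check passes, `java -version` should also report the 1.8.0_77 release installed above.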

2. Install R

To write Spark programs in the R language, install the R interpreter locally.

Add a package source to the APT sources list (this machine runs Ubuntu 12.04; the trusty repository was selected):

deb http://mirror.bjtu.edu.cn/cran/bin/linux/ubuntu trusty/

sudo apt-get install r-base-core=3.1.3-1trusty
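Adding the repository line by hand is easy to repeat by accident. The sketch below shows one way to append it only once; `add_line_once` and `SOURCES_FILE` are hypothetical names, and the default target is a scratch file rather than the real `/etc/apt/sources.list` so the snippet can be tried without root.

```shell
# Append a line to a file only if that exact line is not already present.
add_line_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Assumption: the real target would be /etc/apt/sources.list (needs root).
# A scratch file is used by default so nothing on the system is changed.
SOURCES_FILE="${SOURCES_FILE:-$(mktemp)}"
CRAN_LINE="deb http://mirror.bjtu.edu.cn/cran/bin/linux/ubuntu trusty/"

add_line_once "$CRAN_LINE" "$SOURCES_FILE"
add_line_once "$CRAN_LINE" "$SOURCES_FILE"   # second call is a no-op
```

When the real sources list has been edited, `sudo apt-get update` needs to run before the install command so that the new repository is picked up.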

3. Install Hadoop

wget http://apache.claz.org/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

After the download completes, extract the archive and edit the configuration files: /etc/profile, hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml. Once the changes are in place, format the NameNode, start the daemons, and list the running Java processes:

./bin/hdfs namenode -format

./sbin/start-all.sh

jps
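The process listing is easier to read if it is compared against the daemons that are expected to be running. A rough sketch follows; `missing_daemons` is a hypothetical helper, and the list of five daemon names assumes a single-node setup in which start-all.sh brings up both HDFS and YARN.

```shell
# Read `jps` output on stdin and print every expected daemon that is absent.
# Returns 0 when nothing is missing, 1 otherwise.
missing_daemons() {
    output="$(cat)"
    missing=0
    # Assumption: single-node HDFS + YARN; adjust the list for other layouts.
    for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <main class>"; compare the second field exactly
        if ! printf '%s\n' "$output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Only run the check when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons || echo "some Hadoop daemons are not running" >&2
fi
```

A missing entry usually points to the matching log file under the Hadoop logs directory.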

4. Install Scala

Extract the archive and configure the environment variables. Once the installation is complete, check the version information to confirm it.
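The original gives no commands for this step, so the following is only a sketch of what the environment setup might look like. The directory `/opt/scala` is a placeholder rather than a path from the source, and `prepend_path` is a hypothetical helper that avoids adding the same directory to PATH twice.

```shell
# Prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Assumption: /opt/scala stands in for the real extraction directory.
SCALA_HOME="${SCALA_HOME:-/opt/scala}"
prepend_path "$SCALA_HOME/bin"
export SCALA_HOME PATH

# Print the version if the launcher is reachable, otherwise say so.
if command -v scala >/dev/null 2>&1; then
    scala -version
else
    echo "scala not found on PATH; check SCALA_HOME" >&2
fi
```

To make the setting permanent, the same exports would go into /etc/profile next to the Java ones.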

5. Install Spark

After extracting the archive, set the environment variables:

    export SPARK_HOME=/opt/spark-1.4.1-bin-hadoop2.6
    export PATH=$SPARK_HOME/bin:$PATH

In the conf directory, copy spark-env.sh.template, rename the copy to spark-env.sh, and then add the following to it:

    export JAVA_HOME=your JAVA home
    export SCALA_HOME=your SCALA home
    export SPARK_MASTER_IP=tmaster
    export SPARK_WORKER_MEMORY=4g
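The copy-and-edit step above could be scripted roughly as follows. This is a sketch under assumptions: `init_spark_env` is a hypothetical helper, it takes the conf directory and the two home directories as arguments, and it leaves an existing spark-env.sh untouched so that running it twice does not duplicate the settings.

```shell
# Create spark-env.sh from the shipped template and append the settings
# described above. Arguments: <conf dir> <java home> <scala home>.
init_spark_env() {
    conf_dir="$1"
    target="$conf_dir/spark-env.sh"
    if [ -f "$target" ]; then
        echo "$target already exists; leaving it untouched"
        return 0
    fi
    cp "$conf_dir/spark-env.sh.template" "$target" || return 1
    {
        echo "export JAVA_HOME=$2"
        echo "export SCALA_HOME=$3"
        echo "export SPARK_MASTER_IP=tmaster"
        echo "export SPARK_WORKER_MEMORY=4g"
    } >> "$target"
}

# Only act when all three locations are known on this machine.
if [ -n "${SPARK_HOME:-}" ] && [ -d "$SPARK_HOME/conf" ] &&
   [ -n "${JAVA_HOME:-}" ] && [ -n "${SCALA_HOME:-}" ]; then
    init_spark_env "$SPARK_HOME/conf" "$JAVA_HOME" "$SCALA_HOME"
fi
```

The host name tmaster and the 4g worker memory are the values from the text; they should be adapted to the actual master host and available RAM.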

Start and test:

./sbin/start-all.sh

./bin/run-example SparkPi
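The example job writes a lot of log output, so the result is easy to miss. The sketch below filters it down to the estimate; `extract_pi` is a hypothetical helper, and it assumes the example prints a line of the form "Pi is roughly <number>" on standard output.

```shell
# Pull the estimate out of SparkPi's output, read from stdin.
# Assumption: the result line has the form "Pi is roughly <number>".
extract_pi() {
    sed -n 's/^Pi is roughly \([0-9.][0-9.]*\).*$/\1/p'
}

# Run the example only when this is executed from the Spark directory.
if [ -x ./bin/run-example ]; then
    # Log messages are discarded here; drop 2>/dev/null to see them.
    ./bin/run-example SparkPi 2>/dev/null | extract_pi
fi
```

An empty result suggests the job failed, in which case the discarded log output is the first place to look.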


6. Start SparkR

./bin/sparkR


7. Count the lines of a file with SparkR

lines <- SparkR:::textFile(sc, "README.md")

count(lines)
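The number returned by count(lines) can be cross-checked from an ordinary shell. This is a minimal sketch with a hypothetical helper `count_lines`; it assumes README.md is read from the local Spark directory rather than from HDFS.

```shell
# Count the lines of a text file. The empty pattern matches every line,
# so grep -c '' reports the total number of lines in the file.
count_lines() {
    grep -c '' "$1"
}

# Run from the Spark directory, where README.md is expected to live.
if [ -f README.md ]; then
    count_lines README.md
fi
```

If the two numbers differ, the most likely cause is that SparkR resolved the relative path against a different file system.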






