(Revised to reflect the latest versions)
Without a doubt, Spark has become the hottest big data tool. This article details how to install SparkR so that you can be using it locally within 5 minutes.
Environment requirements: Java 7+, R, RStudio, and Rtools (https://cran.r-project.org/bin/windows/Rtools/)
Step One: Download Spark
Open http://spark.apache.org/ in your browser and click the green "Download Spark" button on the right.
You will see the following page:
Follow steps 1 to 3 to create the download link.
In "2. Choose a package type", select a pre-built type. Because we are going to run locally under Windows, choose "Pre-built for Hadoop 2.6 and later".
In "3. Choose a download type", select "Direct Download".
Once these are selected, the download link appears at "4. Download Spark".
Download the archive to your computer.
Step Two: Unzip the installation file
Extract it to the path "C:/apache/spark-1.4.1".
Step Three: Run from the command line (this step only works after R and the other environment variables have been configured; skip it if you do not need a command-line window)
Open a command-line window (Start → search box → enter cmd) and change to the Spark directory:
Enter the command ".\bin\sparkR".
After it starts you will see some log output; after about 15 seconds, if all goes well, you will see "Welcome to SparkR!"
Set Environment variables:
Right-click on "My Computer" and select "Properties":
Select "Advanced system settings".
Click "Environment Variables", find Path under "System variables", and append "C:\ProgramData\Oracle\Java\javapath;".
Step Four: Running in RStudio
" c:/apache/spark-1.6.1 " ). Libpaths (C (File.path (sys.getenv("spark_home""R") "lib"),. libpaths ()))
#注意把spark -1.6.1 directory under the R directory Sparkr into the library of R, or can not directly install SPARKR package?
R's library paths can be viewed with .libPaths(); new packages are installed into the first listed path (the default) unless told otherwise.
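As a sketch of the step above (the Spark install path is an assumption carried over from the earlier steps; adjust it to your own machine), the following base-R snippet prepends the SparkR library directory only if it actually exists:

```r
# Show R's library search paths; new packages install into the first one
print(.libPaths())

# Assumed Spark install location from the earlier steps -- adjust to yours
spark_home <- "C:/apache/spark-1.6.1"
sparkr_lib <- file.path(spark_home, "R", "lib")

# Prepend the SparkR library directory only if it exists, so this is
# safe to run even before Spark has been unpacked
if (dir.exists(sparkr_lib)) {
  .libPaths(c(sparkr_lib, .libPaths()))
}
```

The existence check avoids silently adding a dead path to the library search list.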
# Load the SparkR library
library(SparkR)

# Create a Spark context and a SQL context
sc <- sparkR.init(master = "local")
sqlContext <- sparkRSQL.init(sc)

# Create a SparkR DataFrame
df <- createDataFrame(sqlContext, faithful)
head(df)

# Create a simple local data.frame
localDF <- data.frame(name = c("John", "Smith", "Sarah"), age = c(19, 23, 18))

# Convert the local data frame to a SparkR DataFrame
df <- createDataFrame(sqlContext, localDF)

# Print its schema
printSchema(df)
# root
#  |-- name: string (nullable = true)
#  |-- age: double (nullable = true)

# Create a DataFrame from a JSON file
path <- file.path(Sys.getenv("SPARK_HOME"), "examples/src/main/resources/people.json")
peopleDF <- jsonFile(sqlContext, path)
printSchema(peopleDF)

# Register this DataFrame as a table
registerTempTable(peopleDF, "people")

# SQL statements can be run by using the sql method provided by sqlContext
teenagers <- sql(sqlContext, "SELECT name FROM people WHERE age >= 13 AND age <= 19")

# Call collect to get a local data.frame
teenagersLocalDF <- collect(teenagers)

# Print the teenagers in our dataset
print(teenagersLocalDF)

# Stop the SparkContext now
sparkR.stop()
# Another example: wordcount
# Source: http://www.cnblogs.com/hseagle/p/3998853.html
sc <- sparkR.init(master = "local", "RwordCount")
lines <- textFile(sc, "README.md")
Note: the textFile function can no longer be used after SparkR 1.4; SparkR must load data through the sqlContext instead, as follows:
people <- read.df(sqlContext, "./examples/src/main/resources/people.json", "json")
In addition, CSV, Parquet, Hive data, and so on are supported.
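As an illustration (a sketch only: it needs a running SparkR session with the sqlContext created as above, and the file paths are hypothetical placeholders), reading Parquet and CSV sources looks like this; in SparkR 1.x, CSV support comes from the external spark-csv package, passed in when starting SparkR:

```r
# Parquet: the source type is passed as the third argument to read.df
peopleParquet <- read.df(sqlContext, "./people.parquet", "parquet")

# CSV: requires starting SparkR with the spark-csv package, e.g.
#   sc <- sparkR.init(master = "local",
#                     sparkPackages = "com.databricks:spark-csv_2.10:1.3.0")
peopleCSV <- read.df(sqlContext, "./people.csv", "com.databricks.spark.csv")
```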
words <- flatMap(lines, function(line) { strsplit(line, " ")[[1]] })
wordCount <- lapply(words, function(word) { list(word, 1L) })
counts <- reduceByKey(wordCount, "+", 2L)
output <- collect(counts)
for (wordcount in output) {
  cat(wordcount[[1]], ": ", wordcount[[2]], "\n")
}
Original article: http://www.r-bloggers.com/installing-and-starting-sparkr-locally-on-windows-os-and-rstudio/
Resources
1. Installation: http://blog.csdn.net/jediael_lu/article/details/45310321
2. Installation: http://thinkerou.com/2015-05/How-to-Build-Spark-on-Windows/
3. hseagle's blog: http://www.cnblogs.com/hseagle/p/3998853.html
4. Learning: http://www.r-bloggers.com/a-first-look-at-spark/
5. Learning: http://www.danielemaasit.com/getting-started-with-sparkr/
6. Error resolution: http://stackoverflow.com/questions/10077689/r-cmd-on-windows-7-error-r-is-not-recognized-as-an-internal-or-external-comm
7. SparkR official guide: http://spark.apache.org/docs/latest/sparkr.html#from-local-data-frames (Chinese version: http://www.iteblog.com/archives/1385)
"Go + fix" is installed locally under windows and Rstudio Sparkr