1. First install R and RStudio on Mac OS. This is relatively simple, so it is skipped here.
2. Download a prebuilt Spark package (spark-2.0.0-bin-hadoop2.6.tgz is used here) from the Spark website, choosing the version you need.
Extract Spark into the target directory:
$ tar -zxvf spark-2.0.0-bin-hadoop2.6.tgz -C ~/
After extraction, my Spark directory is /Users/hduser/spark-2.0.0-bin-hadoop2.6.
3. Open RStudio, install the required package, and load SparkR from the Spark directory:
> install.packages("rJava")
> Sys.setenv(SPARK_HOME = "/Users/hduser/spark-2.0.0-bin-hadoop2.6")
> .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
> library(SparkR)
Attaching package: 'SparkR'
The following objects are masked from 'package:stats':
    cov, filter, lag, na.omit, predict, sd, var, window
The following objects are masked from 'package:base':
    as.data.frame, colnames, colnames<-, drop, endsWith,
    intersect, rank, rbind, sample, startsWith, subset,
    summary, transform, union
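If library(SparkR) fails with "there is no package called 'SparkR'", a likely cause is that SPARK_HOME does not point at the extracted directory. The following is a minimal sketch of a check that can run before loading the package. It assumes the prebuilt distribution keeps the package under R/lib/SparkR, and the helper name sparkr_lib_dir is hypothetical, not part of SparkR:

```r
# Hypothetical helper: return the SparkR library directory under a Spark
# installation, or stop with a readable error if the layout is unexpected.
sparkr_lib_dir <- function(spark_home = Sys.getenv("SPARK_HOME")) {
  if (!nzchar(spark_home)) {
    stop("SPARK_HOME is not set; call Sys.setenv(SPARK_HOME = ...) first")
  }
  lib_dir <- file.path(spark_home, "R", "lib")
  # Assumption: the prebuilt Spark package ships SparkR in R/lib/SparkR.
  if (!dir.exists(file.path(lib_dir, "SparkR"))) {
    stop("No SparkR package found under ", lib_dir)
  }
  lib_dir
}
```

With this helper defined, the library path can be set with .libPaths(c(sparkr_lib_dir(), .libPaths())), which stops early with a clear message instead of failing later inside library().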
> sc <- sparkR.init(master = "local")
Launching java with spark-submit command /Users/hduser/spark-2.0.0-bin-hadoop2.6/bin/spark-submit sparkr-shell /var/folders/gc/vp7dhzpx6573t0fy46ysmpwr0000gp/T//RtmpYADAOX/backend_port4ee21b15c06c
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/11 19:52:32 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Warning message:
'sparkR.init' is deprecated.
Use 'sparkR.session' instead.
See help("Deprecated")
> sqlContext <- sparkRSQL.init(sc)
Warning message:
'sparkRSQL.init' is deprecated.
Use 'sparkR.session' instead.
See help("Deprecated")
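Both deprecation warnings above point to sparkR.session. The following is a minimal sketch of the same kind of workflow using that entry point. It assumes Spark 2.0 or later, SPARK_HOME set as in step 3, and the SparkR library directory already on .libPaths(). The function name faithful_via_session and the appName value are illustrative and not from the original:

```r
# Sketch: one SparkSession replaces the separate sc and sqlContext objects.
# Assumes SPARK_HOME is set and $SPARK_HOME/R/lib is on .libPaths().
# faithful_via_session is a hypothetical helper name.
faithful_via_session <- function(master = "local") {
  SparkR::sparkR.session(master = master, appName = "sparkr-demo")
  # Stop the session (and its JVM backend) when the function exits.
  on.exit(SparkR::sparkR.session.stop(), add = TRUE)

  # No sqlContext argument: the active session is picked up implicitly.
  df <- SparkR::createDataFrame(faithful)

  # take() returns the first rows as an ordinary local data.frame.
  SparkR::take(df, 6L)
}
```

Calling faithful_via_session() should return the first six rows of faithful without triggering the deprecation warnings. In an interactive session it may be more convenient to call sparkR.session() once and keep it open instead of stopping it each time.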
Use sqlContext to load R's built-in faithful dataset:
> df <- createDataFrame(sqlContext, faithful)
Warning message:
'createDataFrame(sqlContext...)' is deprecated.
Use 'createDataFrame(data, schema = NULL, samplingRatio = 1.0)' instead.
See help("Deprecated")
> head(df)
  eruptions waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
5     4.533      85
6     2.883      55
> print(df)
SparkDataFrame[eruptions:double, waiting:double]
Test reading JSON data:
> people <- read.df(sqlContext, "/Users/hduser/people.json", "json")
Warning message:
'read.df(sqlContext...)' is deprecated.
Use 'read.df(path = NULL, source = NULL, schema = NULL, ...)' instead.
See help("Deprecated")
> head(people)
  age    name
1 NA Michael
2 Andy
3 Justin
> print(people)
SparkDataFrame[age:bigint, name:string]
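Spark's "json" source reads JSON Lines by default, that is, one complete JSON object per line, not a single JSON array. To reproduce the test without the original people.json, a small input file can be generated as sketched below. The helper name write_people_json is hypothetical, and the age values are illustrative assumptions; only the names come from the output above:

```r
# Hypothetical helper: write a tiny JSON Lines file, one JSON object per line.
write_people_json <- function(path = tempfile(fileext = ".json")) {
  records <- c(
    '{"name":"Michael"}',          # no age field, so age is read back as NA
    '{"name":"Andy", "age":30}',   # ages here are illustrative values
    '{"name":"Justin", "age":19}'
  )
  writeLines(records, path)
  path  # return the path so it can be passed straight to read.df()
}
```

With a Spark session running, read.df(write_people_json(), source = "json") should produce a SparkDataFrame with the same age and name columns as above.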
Next, test displaying SparkR results in a web interface (Shiny):
Shiny Server: SparkR web presentation interface (II)