R Language and Big data

Source: Internet
Author: User
Tags: sparkr

# Installing R
Installing R 3.3 ran into a variety of "missing .so file" errors; falling back to R 3.1 gave a smooth installation.
Before installing R, install the Chinese fonts (without them, Chinese characters in charts are rendered as boxes) and the Tcl/Tk packages (without them, sqldf cannot be installed).
sudo yum install fonts-chinese tcl tcl-devel tclx tk tk-devel -y
After installing the Chinese fonts, reload the xfs service (on one machine the reload always failed, and rebooting the machine fixed it).
sudo service xfs reload
Some packages depend on rgl, which calls the OpenGL libraries, so install the OpenGL packages as well, followed by the compilers and development headers needed for the build.
sudo yum install mesa-libGLU mesa-libGLU-devel -y
sudo yum install gcc-gfortran gcc gcc-c++ readline-devel libXt-devel -y

wget --no-check-certificate https://stat.ethz.ch/CRAN/src/base/R-3/R-3.1.0.tar.gz
tar xvf R-3.1.0.tar.gz
cd R-3.1.0
./configure --enable-R-shlib=yes --enable-BLAS-shlib=yes --with-lapack --with-libpng --with-x=no --with-tcltk
sudo sh -c "make"
sudo sh -c "make install"
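A failed configure or make run is often caused by a missing compiler or tool, so it can help to check for them first. The following is a minimal sketch, not part of the original procedure: the helper name missing_tools and the list of tools checked are assumptions for illustration.

```shell
# Print each command from the argument list that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of R or yum.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed list of tools used by the download and build steps.
missing=$(missing_tools gcc g++ gfortran make wget tar)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```

If any tool is reported missing, rerun the yum commands before starting the build.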

sudo R CMD javareconf JAVA_HOME=$JAVA_HOME
Start R and install rJava:
install.packages('rJava')
At the CRAN mirror prompt, select 22.

Install DBI:
install.packages("DBI")
At the CRAN mirror prompt, select 22.

Install RSQLite:
install.packages("RSQLite")
At the CRAN mirror prompt, select 22.
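The interactive prompt can be avoided by passing a repos argument to install.packages, which makes the step scriptable. The sketch below only builds and prints the R expression, and runs it when RUN_INSTALL=1 is set. The helper r_install_expr, the RUN_INSTALL switch, and the mirror URL are assumptions for illustration; substitute the mirror you would have picked from the prompt.

```shell
# Assumed mirror; replace with the CRAN mirror you normally select.
CRAN_MIRROR="${CRAN_MIRROR:-https://cloud.r-project.org}"

# Build an install.packages() call for the given package names.
# (r_install_expr is a hypothetical helper.)
r_install_expr() {
    pkgs=""
    for p in "$@"; do
        # Append each name as a quoted, comma-separated R string.
        pkgs="${pkgs:+$pkgs, }\"$p\""
    done
    printf 'install.packages(c(%s), repos = "%s")\n' "$pkgs" "$CRAN_MIRROR"
}

expr=$(r_install_expr rJava DBI RSQLite)
echo "$expr"

# Opt-in execution, so that reading or testing the script installs nothing.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
    sudo Rscript -e "$expr"
fi
```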

# Installing RStudio
R installed on a work laptop is limited by memory and can only be used to analyze very small data sets. A better approach is to set up R together with rstudio-server on a Linux machine and then use R directly through a web browser.

Download the rstudio-server RPM package and install it:
wget http://download2.rstudio.org/rstudio-server-0.97.551-x86_64.rpm
rpm -ivh --nodeps rstudio-server-0.97.551-x86_64.rpm
Start command:
sudo rstudio-server start

Pitfall: startup can fail without printing any error message. The error messages are written to /var/log/messages.
If the installation reports that libR.so is missing, uninstall R with make uninstall, then rebuild and reinstall it with --enable-R-shlib; the error then goes away.
ps aux shows whether /usr/lib/rstudio-server/bin/rserver is running.
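The two checks above (reading /var/log/messages and looking for the rserver process) can be combined into a short script. This is a sketch under assumptions: the helper name rstudio_log_tail and the search pattern are illustrative, and the log line format varies between systems.

```shell
# Print the last N lines (default 20) of a log file that mention
# rstudio or rserver. (rstudio_log_tail is a hypothetical helper.)
rstudio_log_tail() {
    log_file=$1
    count=${2:-20}
    grep -iE 'rstudio|rserver' "$log_file" | tail -n "$count"
}

LOG=/var/log/messages
if [ -r "$LOG" ]; then
    rstudio_log_tail "$LOG"
else
    echo "Cannot read $LOG (try running as root)"
fi

# Check for the rserver process; the [r] keeps grep from matching itself.
ps aux | grep '[r]server' || echo "rserver is not running"
```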

The configuration file is /etc/rstudio/rserver.conf. Each entry below shows a setting followed by what it does:
www-port=80    Port to listen on; 80 is the standard HTTP port (RStudio Server itself listens on 8787 by default)
rsession-ld-library-path=/opt/local/lib:/opt/local/someapp/lib    Additional library search paths
rsession-which-r=/usr/local/bin/R    Location of the R executable
auth-required-user-group=rstudio_users    Restrict login to members of this group
rsession-memory-limit-mb=4000    Maximum memory a session may use
rsession-stack-limit-mb=10    Maximum stack size
rsession-process-limit=100    Maximum number of processes
session-timeout-minutes=30    Session timeout in minutes
r-libs-user=~/R/packages    Default user library path for R packages
limit-file-upload-size-mb=100    Maximum upload file size
r-cran-repos=http://cran.case.edu/    Default CRAN mirror
Note that RStudio Server's documentation places the session-level settings (session-timeout-minutes, r-libs-user, limit-file-upload-size-mb, and the CRAN mirror setting) in /etc/rstudio/rsession.conf rather than in rserver.conf.
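As an illustration of the key=value format, the sketch below writes a small sample file to a scratch location and reads one value back. The scratch path and the conf_get helper are assumptions for illustration, and the values are the examples from this article rather than recommendations. Review the file before copying it to /etc/rstudio/rserver.conf.

```shell
# Scratch location for the sample (hypothetical path, not read by RStudio).
SAMPLE_CONF="${TMPDIR:-/tmp}/rserver.conf.sample"

# One key=value pair per line, no spaces around the equals sign.
cat > "$SAMPLE_CONF" <<'EOF'
www-port=8787
rsession-which-r=/usr/local/bin/R
auth-required-user-group=rstudio_users
rsession-memory-limit-mb=4000
EOF

# Print the value stored under a key in a key=value file.
# (conf_get is a hypothetical helper.)
conf_get() {
    sed -n "s/^$1=//p" "$2"
}

echo "port: $(conf_get www-port "$SAMPLE_CONF")"
```

After changing the real configuration file, restart the server so the new settings take effect.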

# Logging in

Open http://hostip:8787/ in a browser.
Enter your login account and password: the account is your domain account name on the machine, and the password is your password on that machine. If you do not have an account, apply for access to the machine.
In addition, the SHELL environment variable must be set for access to work. Run the following command in R to set it for the current session:
Sys.setenv(SHELL = "/bin/bash")


# Recent developments
R traditionally loads data onto the local machine for computation, an approach that is somewhat outdated in the big data era. R can now be combined with Hadoop and related systems through RHadoop, RHive, RHBase, SparkR, and others. RHive and RODPS take a similar approach and access the backend through a library interface. SparkR goes further and modifies both the API and the runtime. With the DataFrame API, R and Python get almost the same performance as Scala.
