Installing R on CentOS and integrating it with Hadoop: RHive configuration and installation manual


RHive is a package that extends R's computing capability with Hive's high-performance query engine. It lets you call HQL easily from the R environment, and also lets you use R objects and functions inside Hive. In principle, data processing capacity can scale out on the Hive platform; combined with the R environment, this makes an excellent setup for big-data analysis and mining.


Install

The installation of Hadoop and Hive is skipped here. This section describes how to install the R language on CentOS and how to integrate RHive with Hadoop.

This experiment uses eight nodes, so R and the related modules must be installed on every node. First, let's look at how to install R.

Download R-3.2.0.tar.gz from the package and unpack it.

Before compiling, install the following build dependencies by running:

yum install gcc-gfortran gcc-c++ libXt-devel openssl-devel readline-devel

RHive depends on Rserve, so when compiling and installing R we pass the --disable-nls --enable-R-shlib options (the latter builds R as a shared library, which Rserve requires):

cd R-3.2.0/
./configure --disable-nls --enable-R-shlib
make
make install
cd ../

Use R CMD INSTALL to install the rJava, Rserve, and RHive modules:

R CMD INSTALL rJava_0.9-6.tar.gz
R CMD INSTALL Rserve_1.8-3.tar.gz
R CMD INSTALL RHive_2.0-0.2.tar.gz

Note: if you have multiple nodes, install the above modules on the master and on every node.
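On a multi-node cluster, the per-node steps above can be sketched as a small ssh loop. This is only a dry-run sketch: RUN=echo makes it print each command instead of executing it (unset RUN to run for real), and the hostnames cloud02 through cloud08 are placeholders for your own slave nodes.

```shell
# Dry run: RUN=echo prints each command; unset RUN to really execute.
RUN=echo
# Placeholder hostnames -- replace with your cluster's slave nodes.
NODES="cloud02 cloud03 cloud04 cloud05 cloud06 cloud07 cloud08"

for node in $NODES; do
  # Build prerequisites for compiling R
  $RUN ssh "$node" "yum install -y gcc-gfortran gcc-c++ libXt-devel openssl-devel readline-devel"
  # Ship the sources, then build R with the shared-library flag Rserve needs
  $RUN scp R-3.2.0.tar.gz rJava_0.9-6.tar.gz Rserve_1.8-3.tar.gz RHive_2.0-0.2.tar.gz "$node:/tmp/"
  $RUN ssh "$node" "cd /tmp && tar xzf R-3.2.0.tar.gz && cd R-3.2.0 && ./configure --disable-nls --enable-R-shlib && make && make install"
  # Install the R packages RHive depends on
  $RUN ssh "$node" "R CMD INSTALL /tmp/rJava_0.9-6.tar.gz /tmp/Rserve_1.8-3.tar.gz /tmp/RHive_2.0-0.2.tar.gz"
done
```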

After the installation is complete, go to the Environment configuration section.

Configuration

1. Create a new RHive data storage path (on the local filesystem, not HDFS). Here it is /www/store/rhive/data.

2. Create an Rserv.conf file, write "remote enable" in it, and save it to a directory of your choice. Here it is stored as /www/cloud/R/Rserv.conf.

3. On the master and every node, modify /etc/profile to add the environment variable:

export RHIVE_DATA=/www/store/rhive/data

4. Upload all files in the lib directory under the R installation directory to the /rhive/lib directory in HDFS (create the directory first if it does not exist):

cd /usr/local/lib64/R/lib
hadoop fs -mkdir /rhive/lib
hadoop fs -put ./* /rhive/lib

Start

1. Run:

R CMD Rserve --RS-conf /www/cloud/R/Rserv.conf

telnet cloud01 6311

Then, from the master node, telnet to every slave node in turn. If Rsrv0103QAP1 is displayed, the connection is successful.
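The same handshake can be checked without an interactive telnet session. A minimal sketch, assuming bash (for the /dev/tcp redirection) and Rserve's default port 6311; the helper names here are my own, not part of Rserve:

```shell
# Rserve greets every new connection with an ID string that begins
# with the protocol banner "Rsrv0103" (the QAP1 suffix names the protocol).
is_rserve_banner() {
  case $1 in
    Rsrv0103*) return 0 ;;
    *)         return 1 ;;
  esac
}

# Read the first 8 bytes from host:6311 (bash's /dev/tcp) and report
# whether the Rserve banner came back.
probe_rserve() {
  reply=$( (exec 3<>"/dev/tcp/$1/6311" && head -c 8 <&3) 2>/dev/null )
  if is_rserve_banner "$reply"; then
    echo "$1: Rserve OK"
  else
    echo "$1: no Rserve banner"
  fi
}
```

From the master, something like `for n in cloud01 cloud02; do probe_rserve "$n"; done` then checks every slave in one pass.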

2. Start the Hive remote service. RHive connects to HiveServer through Thrift, so the background Thrift service must be running; that is, start the Hive remote service on the Hive client. If it is already running, skip this step.

nohup hive --service hiveserver &
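Before testing from R, it can be worth confirming that something is listening on the Thrift port at all (10000 by default). A minimal sketch, again assuming bash's /dev/tcp; the function name is my own:

```shell
# Succeeds if a TCP connection to $1 on the default Hive Thrift
# port (10000) can be opened, fails otherwise.
hive_thrift_up() {
  ( exec 3<>"/dev/tcp/$1/10000" ) 2>/dev/null
}
```

For example, `hive_thrift_up master && echo "Thrift service is up"`.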

RHive Test

library(RHive)
rhive.connect("master", 10000, hiveServer2 = TRUE)

Finished!

The RHive documentation is available at https://github.com/nexr/RHive/wiki/User-Guide


