CDH5 Hadoop Red Hat local repository configuration
The CDH5 repository is located at:
http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/
Pointing a RHEL 6 machine at this repo is easy. Just take the file:
http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo
Download it and store it locally as:
/etc/yum.repos.d/cloudera-cdh5.repo
If the machines are offline and no network connection is available, you need to mirror the entire repository locally and then point cloudera-cdh5.repo at the mirror. I wrote a script to download the whole site. wget alone could do the job, but I wrote my own to practice shell scripting. The basic idea is to parse each index page, find the resource links, and save the files to a local directory. In the script, PATH_MUST_BE_EXSITED must point to an existing local directory. Here is the code:
#!/bin/bash
#
# @file
#   cdh5_rhel6-downloads.sh
#
# @date
#   2014-12-18
#
# @author
#   cheungmine
#
# @version
#   0.0.1pre
#
# downloads all from CDH_URL_PREFIX:
#   http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/
#
################################################################################
#
# specify where you want to save downloaded packages here:
#
PATH_MUST_BE_EXSITED="../libs/cdh"

# get real path from relative path
function real_path() {
    \cd "$1" && /bin/pwd
}

# server dist resources:
#
CDH_URL_PREFIX="http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh"

CDH_GPGKEY=$CDH_URL_PREFIX"/RPM-GPG-KEY-cloudera"
CDH_REPO=$CDH_URL_PREFIX"/cloudera-cdh5.repo"

CDH5_REPODATA=$CDH_URL_PREFIX"/5/repodata/"
CDH5_RPMS_NOARCH=$CDH_URL_PREFIX"/5/RPMS/noarch/"
CDH5_RPMS_X86_64=$CDH_URL_PREFIX"/5/RPMS/x86_64/"

# source packages not used:
CDH5_SRPMS=$CDH_URL_PREFIX"/5/SRPMS/"

# get local absolute path for storing the downloaded:
CDH5_LOCALPATH=$(real_path "$PATH_MUST_BE_EXSITED")
if [ -z "$CDH5_LOCALPATH" ]; then
    # stop here instead of writing files relative to "/"
    echo "error: directory not found: $PATH_MUST_BE_EXSITED" >&2
    exit 1
fi
echo "**** downloaded packages will be stored in folder: $CDH5_LOCALPATH"

# first we get index pages:
#
repodata_html=$CDH5_LOCALPATH"/.repodata.index.html"
x86_64_html=$CDH5_LOCALPATH"/.x86_64.index.html"
noarch_html=$CDH5_LOCALPATH"/.noarch.index.html"

wget -c "$CDH5_REPODATA" -P "$CDH5_LOCALPATH" -O "$repodata_html"
wget -c "$CDH5_RPMS_NOARCH" -P "$CDH5_LOCALPATH" -O "$noarch_html"
wget -c "$CDH5_RPMS_X86_64" -P "$CDH5_LOCALPATH" -O "$x86_64_html"

wget -c "$CDH_GPGKEY" -P "$CDH5_LOCALPATH"
wget -c "$CDH_REPO" -P "$CDH5_LOCALPATH"

# download repodata
#   CDH5_REPODATA
repodata_dir=$CDH5_LOCALPATH"/5/repodata"
mkdir -p "$repodata_dir"

echo -e "process file: '$repodata_html'"
while read -r line
do
    # start with: <td><a href="
    a=`echo "$line" | sed -n '/<td><a href="/p'`
    if [ -n "$a" ]; then
        b=`echo "$a" | sed -n '/Parent Directory/p'`
        # do not include: Parent Directory
        if [ -z "$b" ]; then
            # end with: </a></td>
            b=`echo "$a" | sed -n '/<\/a><\/td>/p'`
            if [ -n "$b" ]; then
                a=`echo "$a" | sed -e 's/.*<td><a href="//;s/">.*//'`
                url=$CDH5_REPODATA$a
                echo -e "download: $url"
                wget -c "$url" -P "$repodata_dir" -O "$repodata_dir/$a"
            fi
        fi
    fi
done < "$repodata_html"

# download noarch
#   CDH5_RPMS_NOARCH
noarch_dir=$CDH5_LOCALPATH"/5/RPMS/noarch"
mkdir -p "$noarch_dir"

echo -e "process file: '$noarch_html'"
while read -r line
do
    # start with: <td><a href="
    a=`echo "$line" | sed -n '/<td><a href="/p'`
    if [ -n "$a" ]; then
        b=`echo "$a" | sed -n '/Parent Directory/p'`
        # do not include: Parent Directory
        if [ -z "$b" ]; then
            # end with: </a></td>
            b=`echo "$a" | sed -n '/<\/a><\/td>/p'`
            if [ -n "$b" ]; then
                a=`echo "$a" | sed -e 's/.*<td><a href="//;s/">.*//'`
                url=$CDH5_RPMS_NOARCH$a
                echo -e "download: $url"
                wget -c "$url" -P "$noarch_dir" -O "$noarch_dir/$a"
            fi
        fi
    fi
done < "$noarch_html"

# download x86_64
#   CDH5_RPMS_X86_64
x86_64_dir=$CDH5_LOCALPATH"/5/RPMS/x86_64"
mkdir -p "$x86_64_dir"

echo -e "process file: '$x86_64_html'"
while read -r line
do
    # start with: <td><a href="
    a=`echo "$line" | sed -n '/<td><a href="/p'`
    if [ -n "$a" ]; then
        b=`echo "$a" | sed -n '/Parent Directory/p'`
        # do not include: Parent Directory
        if [ -z "$b" ]; then
            # end with: </a></td>
            b=`echo "$a" | sed -n '/<\/a><\/td>/p'`
            if [ -n "$b" ]; then
                a=`echo "$a" | sed -e 's/.*<td><a href="//;s/">.*//'`
                url=$CDH5_RPMS_X86_64$a
                echo -e "download: $url"
                wget -c "$url" -P "$x86_64_dir" -O "$x86_64_dir/$a"
            fi
        fi
    fi
done < "$x86_64_html"

# TODO: do we need to check all packages?

# remove index pages:
rm -f "$repodata_html" "$x86_64_html" "$noarch_html"

echo "download all packages successfully."
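The link-extraction step can be tried on its own, without touching the network. The following is a minimal sketch, assuming an Apache-style index page in which each entry looks like `<td><a href="NAME">NAME</a></td>`. The function names `extract_links` and `sample_index` and the sample HTML are made up for illustration; a real listing may use different markup.

```shell
#!/bin/bash
# Hypothetical helper: print the href targets of the file entries in an
# Apache-style directory index read from stdin.
extract_links() {
    # keep only table cells that hold a complete link, drop the parent
    # directory entry, then strip everything around the href value
    sed -n '/<td><a href="[^"]*">.*<\/a><\/td>/p' \
        | grep -v 'Parent Directory' \
        | sed -e 's/.*<td><a href="//' -e 's/">.*//'
}

# Made-up sample page, for illustration only.
sample_index() {
    cat <<'EOF'
<tr><td><a href="/cdh5/redhat/6/x86_64/cdh/5/">Parent Directory</a></td></tr>
<tr><td><a href="repomd.xml">repomd.xml</a></td><td>2014-12-17</td></tr>
<tr><td><a href="primary.xml.gz">primary.xml.gz</a></td><td>2014-12-17</td></tr>
EOF
}

# Prints one file name per line.
sample_index | extract_links
```

Each printed name can then be appended to the directory URL and passed to wget, which is what the loops in the script do.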
The download script can be run multiple times without downloading files again, because wget -c resumes partial files and skips complete ones. All CDH5 content is saved under PATH_MUST_BE_EXSITED. Finally, upload everything in PATH_MUST_BE_EXSITED to the local FTP server and make sure it is reachable at:
ftp://your-server-ip/pub/libs/cdh/
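Before uploading, it can help to confirm that the mirror has the layout yum will expect. This is a rough sketch, assuming the directory structure the download script creates (`5/repodata`, `5/RPMS/noarch`, `5/RPMS/x86_64`, and the GPG key at the top level). The function name `check_mirror` is hypothetical; `repomd.xml` is the standard index file of a yum `repodata` directory.

```shell
#!/bin/bash
# Hypothetical helper: report whether a local CDH5 mirror directory has
# the files and folders that the repo definition will point at.
# Returns 0 when everything is present, 1 otherwise.
check_mirror() {
    local root="$1" missing=0 item
    for item in RPM-GPG-KEY-cloudera 5/repodata/repomd.xml \
                5/RPMS/noarch 5/RPMS/x86_64; do
        if [ ! -e "$root/$item" ]; then
            echo "missing: $root/$item"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the default download directory used by the script.
if check_mirror "../libs/cdh"; then
    echo "mirror layout looks complete"
fi
```

This only checks that the expected paths exist. It does not verify package contents or signatures.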
Then add a repo file on each RHEL 6 machine that needs to use the repository. Mine is:
# /etc/yum.repos.d/cdh5.repo
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=ftp://your-server-ip/pub/libs/cdh/5/
gpgkey=ftp://your-server-ip/pub/libs/cdh/RPM-GPG-KEY-cloudera
gpgcheck=1
enabled=1
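If several machines need the same file, it can be generated rather than typed by hand. The following is a minimal sketch, assuming the FTP layout used in this post. The function `write_cdh5_repo` and its two arguments (server address, output path) are illustrative names, not part of CDH or yum.

```shell
#!/bin/bash
# Hypothetical helper: write a cdh5.repo file that points at a local
# FTP mirror. $1 = server host or IP, $2 = path of the file to write.
write_cdh5_repo() {
    local server="$1" out="$2"
    cat > "$out" <<EOF
[cloudera-cdh5]
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=ftp://$server/pub/libs/cdh/5/
gpgkey=ftp://$server/pub/libs/cdh/RPM-GPG-KEY-cloudera
gpgcheck=1
enabled=1
EOF
}

# Example: write to a scratch file first and inspect it before copying
# it to /etc/yum.repos.d/cdh5.repo as root.
write_cdh5_repo "your-server-ip" "/tmp/cdh5.repo.example"
cat /tmp/cdh5.repo.example
```

Once the file is in place, running `yum clean all` and then `yum repolist` should list the `cloudera-cdh5` repository, provided the FTP server is reachable.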
The local CDH repository is now ready. Any machine that has /etc/yum.repos.d/cdh5.repo and can reach the FTP server can install the CDH Hadoop packages. For example, to install a ZooKeeper server:
# Installing the ZooKeeper Base Package
$ yum install zookeeper
# Installing the ZooKeeper Server Package
$ yum install zookeeper-server
# initialize zookeeper-server
$ service zookeeper-server init --myid=1
Using myid of 1
ZooKeeper is installed in /usr/lib/zookeeper. Go to its bin directory to start and stop it:
$ zkServer.sh start
$ zkServer.sh stop