CDH5 Hadoop on Red Hat: configuring a local repository
The CDH5 repository is published at:
http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/
Configuring RHEL6 to use this repo is very simple. Just download:
http://archive-primary.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo
and save it locally as:
/etc/yum.repos.d/cloudera-cdh5.repo
However, if the target machines are offline and have no network connection, the entire repository has to be mirrored locally, and the repo file must then point at the mirror. I wrote a script to download the whole site. A single wget command could do the job, but I wrote the script to practice shell scripting. The basic idea is to parse the index pages, extract the resource links, and save each resource to a local directory. In the script, PATH_MUST_BE_EXSITED must point to a local directory that already exists. Without further ado, here is the code:
#!/bin/bash
#
# @file
#   cdh5_rhel6-downloads.sh
#
# @date
#   2014-12-18
#
# @author
#   cheungmine
#
# @version
#   0.0.1pre
#
# downloads all from CDH_URL_PREFIX:
#   http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/
#
################################################################################
# specify where you want to save downloaded packages here:
#
PATH_MUST_BE_EXSITED="./libs/cdh"

# get real path from relative path
function real_path() {
    \cd "$1"
    /bin/pwd
}

# server dist resources:
#
CDH_URL_PREFIX="http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh"

CDH_GPGKEY=$CDH_URL_PREFIX"/RPM-GPG-KEY-cloudera"
CDH_REPO=$CDH_URL_PREFIX"/cloudera-cdh5.repo"

CDH5_REPODATA=$CDH_URL_PREFIX"/5/repodata/"
CDH5_RPMS_NOARCH=$CDH_URL_PREFIX"/5/RPMS/noarch/"
CDH5_RPMS_X86_64=$CDH_URL_PREFIX"/5/RPMS/x86_64/"

# source packages not used:
CDH5_SRPMS=$CDH_URL_PREFIX"/5/SRPMS/"

# get local absolute path for storing the downloads:
CDH5_LOCALPATH=$(real_path $PATH_MUST_BE_EXSITED)
echo "**** downloaded packages will be stored in folder: "$CDH5_LOCALPATH

# first we get index pages:
#
repodata_html=$CDH5_LOCALPATH"/.repodata.index.html"
x86_64_html=$CDH5_LOCALPATH"/.x86_64.index.html"
noarch_html=$CDH5_LOCALPATH"/.noarch.index.html"

wget -c $CDH5_REPODATA -P $CDH5_LOCALPATH -O $repodata_html
wget -c $CDH5_RPMS_NOARCH -P $CDH5_LOCALPATH -O $noarch_html
wget -c $CDH5_RPMS_X86_64 -P $CDH5_LOCALPATH -O $x86_64_html

wget -c $CDH_GPGKEY -P $CDH5_LOCALPATH
wget -c $CDH_REPO -P $CDH5_LOCALPATH

# download repodata
#   CDH5_REPODATA
repodata_dir=$CDH5_LOCALPATH"/5/repodata"
mkdir -p $repodata_dir

echo -e "process file: '$repodata_html'"
while read line
do
    # start with: <td><a href="
    a=`echo $line | sed -n '/<td><a href="/p'`
    if [ -n "$a" ]; then
        b=`echo $a | sed -n '/Parent Directory/p'`
        # not including: Parent Directory
        if [ -z "$b" ]; then
            # end with: </a></td>
            b=`echo $a | sed -n '/<\/a><\/td>/p'`
            if [ -n "$b" ]; then
                a=`echo $a | sed -e 's/.*<td><a href="//;s/">.*//'`
                url=$CDH5_REPODATA$a
                echo -e "download: $url"
                wget -c $url -P $repodata_dir -O $repodata_dir/$a
            fi
        fi
    fi
done < $repodata_html

# download noarch
#   CDH5_RPMS_NOARCH
noarch_dir=$CDH5_LOCALPATH"/5/RPMS/noarch"
mkdir -p $noarch_dir

echo -e "process file: '$noarch_html'"
while read line
do
    # start with: <td><a href="
    a=`echo $line | sed -n '/<td><a href="/p'`
    if [ -n "$a" ]; then
        b=`echo $a | sed -n '/Parent Directory/p'`
        # not including: Parent Directory
        if [ -z "$b" ]; then
            # end with: </a></td>
            b=`echo $a | sed -n '/<\/a><\/td>/p'`
            if [ -n "$b" ]; then
                a=`echo $a | sed -e 's/.*<td><a href="//;s/">.*//'`
                url=$CDH5_RPMS_NOARCH$a
                echo -e "download: $url"
                wget -c $url -P $noarch_dir -O $noarch_dir/$a
            fi
        fi
    fi
done < $noarch_html

# download x86_64
#   CDH5_RPMS_X86_64
x86_64_dir=$CDH5_LOCALPATH"/5/RPMS/x86_64"
mkdir -p $x86_64_dir

echo -e "process file: '$x86_64_html'"
while read line
do
    # start with: <td><a href="
    a=`echo $line | sed -n '/<td><a href="/p'`
    if [ -n "$a" ]; then
        b=`echo $a | sed -n '/Parent Directory/p'`
        # not including: Parent Directory
        if [ -z "$b" ]; then
            # end with: </a></td>
            b=`echo $a | sed -n '/<\/a><\/td>/p'`
            if [ -n "$b" ]; then
                a=`echo $a | sed -e 's/.*<td><a href="//;s/">.*//'`
                url=$CDH5_RPMS_X86_64$a
                echo -e "download: $url"
                wget -c $url -P $x86_64_dir -O $x86_64_dir/$a
            fi
        fi
    fi
done < $x86_64_html

# TODO: do we need to check all packages?

# remove index pages:
rm -f $repodata_html $x86_64_html $noarch_html

echo "download all packages successfully."
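The three download loops in the script share the same link-extraction logic. As a rough sketch only (not the author's code), that logic could be folded into one helper. The function names extract_links and download_dir are hypothetical, and the sketch assumes the index pages are Apache-style listings in which each file sits on its own line inside a <td><a href="..."> cell, as the script's sed patterns imply. Unlike the script, download_dir lets wget pick the output file name through -P instead of passing -O.

```shell
#!/bin/bash

# Hypothetical helper (not part of the original script): read an index page
# on stdin and print the href of every file link, one per line.
extract_links() {
    # Keep only rows that contain <td><a href="NAME">...</a></td>,
    # drop the "Parent Directory" row, then strip everything except NAME.
    sed -n '/<td><a href="/{
        /Parent Directory/d
        /<\/a><\/td>/!d
        s/.*<td><a href="//
        s/">.*//
        p
    }'
}

# Hypothetical helper: download every file listed in one remote directory.
#   $1 - remote directory URL, with a trailing slash
#   $2 - local directory that receives the files
download_dir() {
    local url=$1 dir=$2 index name
    mkdir -p "$dir"
    index=$(mktemp)
    # Fetch the index page; give up on this directory if that fails.
    wget -q -O "$index" "$url" || { rm -f "$index"; return 1; }
    extract_links < "$index" | while read -r name; do
        echo "download: $url$name"
        # -c resumes partial files, -P selects the target directory.
        wget -c -P "$dir" "$url$name"
    done
    rm -f "$index"
}
```

With these helpers, each of the three sections would reduce to one call such as: download_dir "$CDH5_REPODATA" "$CDH5_LOCALPATH/5/repodata"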
The script can be run multiple times; because wget is called with -c, packages that are already fully downloaded are not fetched again. When it finishes, PATH_MUST_BE_EXSITED contains all of the CDH5 content. Finally, upload the entire contents of PATH_MUST_BE_EXSITED to your local FTP server and make sure it is reachable at:
ftp://your-server-ip/pub/libs/cdh/
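Before uploading, it may be worth checking that the mirror directory has the pieces yum will look for. The sketch below is only an illustration: check_mirror_layout is a hypothetical name, and the list of expected entries assumes the upstream file and directory names (RPM-GPG-KEY-cloudera, 5/RPMS/...) plus repomd.xml, the standard index file of a yum repodata directory. It does not verify individual packages.

```shell
#!/bin/bash

# Hypothetical helper: report which expected pieces of the mirror are missing.
#   $1 - local mirror directory (the PATH_MUST_BE_EXSITED directory)
# Returns 0 when every entry exists, 1 otherwise.
check_mirror_layout() {
    local root=$1 missing=0 item
    for item in \
        "cloudera-cdh5.repo" \
        "RPM-GPG-KEY-cloudera" \
        "5/repodata/repomd.xml" \
        "5/RPMS/noarch" \
        "5/RPMS/x86_64"
    do
        # Each entry is an assumption about the upstream layout.
        if [ ! -e "$root/$item" ]; then
            echo "missing: $item"
            missing=1
        fi
    done
    return $missing
}
```

A typical call would be: check_mirror_layout ./libs/cdh && echo "layout looks complete"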
Then, on each RHEL6 machine that needs the packages, add a repo file. Mine is:
# /etc/yum.repos.d/cdh5.repo
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name = Cloudera's Distribution for Hadoop, Version 5
baseurl = ftp://your-server-ip/pub/libs/cdh/5/
gpgkey = ftp://your-server-ip/pub/libs/cdh/RPM-GPG-KEY-cloudera
gpgcheck = 1
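If the same file has to be placed on many machines, it can be generated instead of copied by hand. The following is a minimal sketch under stated assumptions: write_cdh5_repo is a hypothetical function name, the FTP path /pub/libs/cdh/ is the one used above, and the key file is assumed to keep its upstream name RPM-GPG-KEY-cloudera.

```shell
#!/bin/bash

# Hypothetical helper: write a cdh5.repo file that points at the local mirror.
#   $1 - host name or IP of the FTP server (your-server-ip in the text)
#   $2 - output file, defaulting to /etc/yum.repos.d/cdh5.repo
write_cdh5_repo() {
    local host=$1
    local dest=${2:-/etc/yum.repos.d/cdh5.repo}
    # The heredoc is unquoted so that $host is expanded.
    cat > "$dest" <<EOF
[cloudera-cdh5]
name = Cloudera's Distribution for Hadoop, Version 5
baseurl = ftp://$host/pub/libs/cdh/5/
gpgkey = ftp://$host/pub/libs/cdh/RPM-GPG-KEY-cloudera
gpgcheck = 1
EOF
}
```

Run as root on the client, for example: write_cdh5_repo your-server-ip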
That's it: the local CDH repository is ready. Any server that can reach the FTP server and has /etc/yum.repos.d/cdh5.repo in place can install the CDH Hadoop packages. For example, to install a ZooKeeper server:
# install the ZooKeeper base package
$ yum install zookeeper
# install the ZooKeeper server package
$ yum install zookeeper-server
# start zookeeper-server
Ok!