Hadoop Automated O&M: Creating deb Packages


This is the first blog post of 2014; to start the new year, I'm kicking off a new series of articles.


Building deb/rpm packages for Hadoop and its surrounding ecosystem matters a great deal for automated O&M. Once rpm and deb packages exist for the whole ecosystem, you can set up a local yum or apt repository, which greatly simplifies Hadoop deployment and operations. This is in fact what both Cloudera and Hortonworks do.


I originally wanted to cover both rpm and deb, but that would probably run too long, so I'll split them up and start with deb. Deb packages are easier to create and do not require writing a spec file.


Taking Hadoop 2.2.0 as an example: Apache does not provide rpm or deb packages for the 2.x line, so we have to build our own.


1. Download the pre-built Hadoop tarball (about 100 MB) and decompress it

#wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
#tar zxf hadoop-2.2.0.tar.gz


2. Create a folder for packaging

#mkdir -p /opt/hadoop_2.2.0-1_amd64/DEBIAN
#mkdir -p /opt/hadoop_2.2.0-1_amd64/usr
#mkdir -p /opt/hadoop_2.2.0-1_amd64/etc

DEBIAN holds the packaging control files and scripts; usr and etc are the paths that get installed when the package is deployed. After installation, the package's usr directory maps to the /usr directory of the target Linux system, and its etc directory maps to /etc.
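The three mkdir calls above can also be collapsed into one command. A minimal sketch (staged under a temporary directory here so it can run anywhere; the article itself stages under /opt):

```shell
# Create the whole staging tree in one command. PKG is the staging root;
# the article uses /opt/hadoop_2.2.0-1_amd64, a temp path is used here.
PKG=$(mktemp -d)/hadoop_2.2.0-1_amd64
mkdir -p "$PKG"/DEBIAN "$PKG"/usr "$PKG"/etc
ls "$PKG"
```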


3. Copy the items in hadoop to the destination folder

After decompressing in step 1, the hadoop-2.2.0 directory should contain the following folders:

|- bin
|- etc
|  |- hadoop
|- sbin
|- share
|- lib
|- libexec
|- include

That is roughly the folder structure inside the original tarball. Now execute the copy operation.

#tar zxf hadoop-2.2.0.tar.gz
#cd hadoop-2.2.0
#cp -rf bin sbin lib libexec share include /opt/hadoop_2.2.0-1_amd64/usr/
#cp -rf etc/hadoop /opt/hadoop_2.2.0-1_amd64/etc/


After copying, the staging folder /opt/hadoop_2.2.0-1_amd64/ should have the following directory structure:

|- DEBIAN
|- etc
|  |- hadoop
|- usr
|  |- bin
|  |- sbin
|  |- include
|  |- lib
|  |- libexec
|  |- share


Next, write the control files in the DEBIAN folder. Packaging for Ubuntu and Debian is easier than for rpm: you only need to write a few small, independent files.


Go to the DEBIAN folder and start by editing the metadata file, control.

#cd /opt/hadoop_2.2.0-1_amd64/DEBIAN
#vi control

Enter the following content

Package: hadoop
Version: 2.2.0-GA
Section: misc
Priority: optional
Architecture: amd64
Provides: hadoop
Maintainer: Xianglei
Description: The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing.
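As a quick sanity check (my addition, not from the original workflow): Debian requires at least the Package, Version, Architecture, Maintainer, and Description fields in a binary package's control file, so a small grep loop can catch a missing field before you build. The control content below mirrors the file above:

```shell
# Hypothetical pre-build check: verify the mandatory control fields exist.
cd "$(mktemp -d)"
cat > control <<'EOF'
Package: hadoop
Version: 2.2.0-GA
Section: misc
Priority: optional
Architecture: amd64
Provides: hadoop
Maintainer: Xianglei
Description: The Apache Hadoop project develops open-source software
EOF
for field in Package Version Architecture Maintainer Description; do
    grep -q "^$field:" control || echo "missing field: $field"
done
echo "control check done"
```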


Save and exit, then edit conffiles in the same directory. It tells dpkg which configuration files to track, so that locally modified configuration files are preserved on upgrade and kept when the package is removed.

#vi /opt/hadoop_2.2.0-1_amd64/DEBIAN/conffiles

Enter the following content

/etc/hadoop/core-site.xml
/etc/hadoop/hdfs-site.xml
/etc/hadoop/mapred-site.xml
/etc/hadoop/yarn-site.xml
/etc/hadoop/hadoop-env.sh
/etc/hadoop/yarn-env.sh


Continue. Four maintainer scripts remain to be written: post-install (postinst), post-removal (postrm), pre-install (preinst), and pre-removal (prerm). All of them are plain shell scripts; here they are together.

#vi postinst
#------
mkdir -p /usr/etc
rm -f /usr/etc/hadoop
ln -s /etc/hadoop /usr/etc/hadoop
#------

#vi postrm
#------
/usr/sbin/userdel hdfs 2>/dev/null
/usr/sbin/userdel mapred 2>/dev/null
/usr/sbin/groupdel hadoop 2>/dev/null
exit 0
#------

#vi preinst
#------
getent group hadoop 2>/dev/null || /usr/sbin/groupadd -g 123 -r hadoop
/usr/sbin/useradd --comment "Hadoop MapReduce" -u 202 --shell /bin/bash -M -r --groups hadoop --home /var/lib/hadoop/mapred mapred 2>/dev/null || :
/usr/sbin/useradd --comment "Hadoop HDFS" -u 201 --shell /bin/bash -M -r --groups hadoop --home /var/lib/hadoop/hdfs hdfs 2>/dev/null || :
#------

#vi prerm
#------
# Leave the content empty
#------
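One step left implicit above: dpkg-deb refuses to build the package if the maintainer scripts are not executable, so set their mode (0755 is the usual choice) before building. A sketch, shown here against placeholder files in a temporary directory rather than the real DEBIAN folder:

```shell
# Maintainer scripts must be executable before dpkg -b is run.
# Placeholder files in a temp dir stand in for the real DEBIAN/ scripts.
cd "$(mktemp -d)" && mkdir DEBIAN && cd DEBIAN
touch postinst postrm preinst prerm
chmod 755 postinst postrm preinst prerm
ls -l postinst   # mode should now show rwxr-xr-x
```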


At this point we are basically done. Of course, you still need to adjust the path settings in the Hadoop scripts to match the installed layout, but that is straightforward and needs no further explanation.
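For example, environment variables like the ones below are the kind of thing that needs adjusting. The exact values are my assumptions based on the layout used in this article (Hadoop under /usr, config under /etc/hadoop), not something the original spells out; they could go in hadoop-env.sh or a profile snippet such as /etc/profile.d/hadoop.sh:

```shell
# Assumed values matching the package layout above; adjust as needed.
export HADOOP_PREFIX=/usr             # package installs bin/, sbin/, share/ under /usr
export HADOOP_CONF_DIR=/etc/hadoop    # config is shipped to /etc/hadoop
export HADOOP_LOG_DIR=/var/log/hadoop # hypothetical log location
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
```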


Then execute

#cd /opt
#dpkg -b hadoop_2.2.0-1_amd64

This produces the installation package hadoop_2.2.0-1_amd64.deb, which you can install with dpkg -i. I'm off to cook dinner; next time I'll talk about how to build an apt repository and rpm packages.


This article was originally posted on the "practice test truth" blog; reproduction is declined.
