Integrating Tez with Hadoop CDH 5.3.0: Installation and Deployment

Source: Internet
Author: User

The Master said: A gentleman does not seek to eat his fill or to live in comfort; he is quick in action and careful in speech, and he keeps company with those who follow the Way so as to correct himself. Such a person can be called fond of learning.

    That is: a gentleman does not eat to excess or demand a comfortable home, works diligently, speaks cautiously, and learns from people of high moral character in order to correct his own shortcomings. Such a person can be called studious.

We recently upgraded CDH to 5.3.0, which took Hive from 0.12 to 0.13. Simple tests after the upgrade showed a large performance improvement. Starting with 0.13, Hive supports Tez as an execution engine to improve execution speed.

  

Comparison of Tez and MR:

The diagram shows that the original MR program runs as a DAG of multiple jobs, and each job writes its results to disk for the next job to read back, wasting disk I/O and network I/O. Tez turns the multi-job DAG into a single-job DAG, which reduces the writing and reading of intermediate results.

Installing and deploying Tez:

Hadoop version: 2.5.0

Hive Version: 0.13

Tez version: 0.4.1

1.) Download the Tez source code from: http://archive.apache.org/dist/incubator/tez/tez-0.4.1-incubating/

2.) Compile:

1: Build prerequisites

A, JDK 1.7+

B, Maven 3.0+

C, Protocol Buffers 2.5.0

2: In pom.xml, change the Hadoop version to the matching version number, 2.5.0

3: Build command: mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true (the build takes a while)
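Before starting the build, it can help to confirm that the toolchain matches the prerequisites listed above. The following is a minimal sketch and not part of the original procedure: `version_ge` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (GNU coreutils). Each check is skipped when the tool is not on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: check the Maven and protoc versions against the listed prerequisites.
# version_ge is a hypothetical helper, not part of Tez or Hadoop.

# Return 0 if version $1 >= version $2 (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The JDK version is only printed; `java -version` writes to stderr.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "java not found on PATH"
fi

# Maven prints a line such as "Apache Maven 3.0.5 (...)".
if command -v mvn >/dev/null 2>&1; then
  mvn_version="$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
  if version_ge "$mvn_version" "3.0"; then
    echo "Maven $mvn_version: OK"
  else
    echo "Maven $mvn_version: need 3.0 or newer"
  fi
else
  echo "mvn not found on PATH"
fi

# protoc prints "libprotoc 2.5.0"; the prerequisites name 2.5.0 specifically.
if command -v protoc >/dev/null 2>&1; then
  protoc_version="$(protoc --version | awk '{print $2}')"
  if [ "$protoc_version" = "2.5.0" ]; then
    echo "protoc $protoc_version: OK"
  else
    echo "protoc $protoc_version: expected 2.5.0"
  fi
else
  echo "protoc not found on PATH"
fi
```

The comparison sorts the two version strings and checks which one comes first, so 3.10 is correctly treated as newer than 3.9.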

3.) Upload the compiled Tez tarball to every machine in the cluster and extract it into the directory where Tez should be installed.

4.) Upload the extracted Tez files to HDFS

- Create the directory: hadoop fs -mkdir /apps

- Upload the files: hadoop fs -put {tez_home} /apps/
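The upload step can be wrapped in a small script that prints the commands before running them. This is a sketch under assumptions: the `/opt/tez` default for `TEZ_HOME`, the `DRY_RUN` switch, and the `run` helper are all hypothetical and only serve the illustration. With a directory named `tez`, the files land under `/apps/tez`, which is the path the configuration in the next step refers to.

```shell
#!/usr/bin/env bash
# Sketch: stage the extracted Tez directory in HDFS.
# TEZ_HOME's default and the DRY_RUN switch are assumptions for illustration.
TEZ_HOME="${TEZ_HOME:-/opt/tez}"   # hypothetical install path
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run hadoop fs -mkdir -p /apps           # create the target directory
run hadoop fs -put "$TEZ_HOME" /apps/   # upload the whole Tez directory
run hadoop fs -ls -R /apps              # list what was uploaded
```

Running it with `DRY_RUN=0` executes the commands for real, which requires the `hadoop` client on the PATH.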

5.) Create a new tez-site.xml configuration file in the Hadoop configuration directory

- Add the tez.lib.uris property:

<property>
  <name>tez.lib.uris</name>
  <value>${fs.defaultFS}/apps/tez,${fs.defaultFS}/apps/tez/lib</value>
</property>
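A Hadoop configuration file needs a `<configuration>` root element around its properties, so the complete tez-site.xml could be generated as sketched below. The `CONF_DIR` variable is an assumption (it defaults to a temporary directory here so the sketch is safe to run); the `/apps/tez` paths assume the Tez directory uploaded in step 4 is named `tez`.

```shell
#!/usr/bin/env bash
# Sketch: write a complete tez-site.xml.
# CONF_DIR is an assumption; point it at the real Hadoop configuration directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# The quoted 'EOF' keeps the shell from expanding ${fs.defaultFS};
# Hadoop substitutes that variable itself when it loads the file.
cat > "$CONF_DIR/tez-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/apps/tez,${fs.defaultFS}/apps/tez/lib</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/tez-site.xml"
```

Note that the property name `fs.defaultFS` is case-sensitive.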

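6.) Modify mapred-site.xml. The specific change is not given here; the Apache Tez install guide describes setting mapreduce.framework.name to yarn-tez when existing MapReduce jobs should run on Tez.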

7.) Add the Tez jars to HADOOP_CLASSPATH in hadoop-env.sh

export TEZ_HOME={tez_home}
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${TEZ_HOME}/*:${TEZ_HOME}/lib/*
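If HADOOP_CLASSPATH is empty when hadoop-env.sh runs, the export above produces a value that starts with a colon, and Java generally treats an empty classpath entry as the current directory. One way to avoid that is sketched here; `tez_classpath` is a hypothetical helper name and `/opt/tez` is a placeholder path, neither of which comes from the original setup.

```shell
#!/usr/bin/env bash
# Sketch: build HADOOP_CLASSPATH without a stray leading colon.
# tez_classpath is a hypothetical helper; /opt/tez is a placeholder path.

# $1 = existing classpath (may be empty), $2 = Tez install directory.
tez_classpath() {
  local existing="$1" home="$2"
  # ${existing:+...} expands to "existing:" only when existing is non-empty.
  printf '%s\n' "${existing:+$existing:}${home}/*:${home}/lib/*"
}

export TEZ_HOME="${TEZ_HOME:-/opt/tez}"
export HADOOP_CLASSPATH="$(tez_classpath "${HADOOP_CLASSPATH:-}" "$TEZ_HOME")"
echo "$HADOOP_CLASSPATH"
```

The `*` entries stay literal because they are quoted; the JVM, not the shell, expands them to the jar files in each directory.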

8.) Tez deployment is now complete. Run a Tez test program to verify it: hadoop jar tez-tests.jar testorderedwordcount <input> <output>

If the job completes, the deployment was successful.

9.) Set Hive's execution engine to Tez (in Hive's configuration, typically hive-site.xml)

<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
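The property above changes the default engine for Hive. To try Tez in a single session first, the same setting can be issued with a `set` statement. The sketch below writes a small script file and runs it with `hive -f` when the Hive CLI is available; the file location is an assumption for illustration.

```shell
#!/usr/bin/env bash
# Sketch: switch a single Hive session to Tez and print the active engine.
# The file location is an assumption for illustration.
HQL_FILE="${TMPDIR:-/tmp}/use_tez.hql"

cat > "$HQL_FILE" <<'EOF'
-- Select Tez for this session only.
set hive.execution.engine=tez;
-- With no value, set prints the current setting.
set hive.execution.engine;
EOF

if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL_FILE"
else
  echo "hive not found on PATH; would run: hive -f $HQL_FILE"
fi
```

Switching back within a session is a matter of setting the same property to `mr`, which makes side-by-side timing of one query on both engines straightforward.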

  

With that, the integration of Tez with CDH 5.3.0 is complete. The next steps are stability testing and performance tuning.

  

Tip: Tez must be deployed on every machine in the cluster.

  Tez official website: http://tez.apache.org/index.html
