Configuring Hive compression based on Cloudera Manager 5

Source: Internet
Author: User

[Author]: Kwu

Configuring compression for Hive under Cloudera Manager 5 really means configuring compression for MapReduce, covering both the final job output and the intermediate results.

1. Configuration from the Hive command line

set hive.enforce.bucketing=true;
set hive.exec.compress.output=true;
set mapred.output.compress=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;

Run the statements above in the Hive command line. They apply to the current session and select gzip compression.


2. Configuration based on XML files

mapred-site.xml

<property>
  <name>mapred.output.compress</name>
  <value>true</value>
  <description>Should the job outputs be compressed?</description>
</property>
<property>
  <name>mapred.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
  <description>If the job outputs are compressed, how should they be compressed?</description>
</property>

hive-site.xml

<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.compress.output</name>
  <value>true</value>
</property>
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
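
Because these fragments are edited by hand, it can help to check that they are well-formed XML before restarting any services. The following is a minimal sketch in Python using only the standard library. The helper name `parse_properties` is hypothetical and is not part of Hive, Hadoop, or Cloudera Manager; it only parses the text and does not check whether a property name is one Hive recognizes.

```python
import xml.etree.ElementTree as ET


def parse_properties(fragment: str) -> dict:
    """Parse a run of <property> elements and return {name: value}.

    Raises xml.etree.ElementTree.ParseError if the fragment is not
    well-formed XML (for example, a closing tag with a stray space).
    """
    # A property fragment has no single root element, so wrap it in one.
    root = ET.fromstring("<configuration>" + fragment + "</configuration>")
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if not name:
            raise ValueError("found a <property> without a <name>")
        props[name] = value
    return props


# Example fragment, taken from the hive-site.xml settings in this article.
FRAGMENT = """
<property>
  <name>hive.exec.compress.output</name>
  <value>true</value>
</property>
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
"""

if __name__ == "__main__":
    for name, value in parse_properties(FRAGMENT).items():
        print(f"{name} = {value}")
```

A parse error here points to a typo in the fragment itself, which is easier to find than a service that fails to pick up the setting.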

3. Configuring Hive compression in Cloudera Manager 5

1) MapReduce (MR) configuration on YARN



2) Hive configuration


Add the following content:

<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.compress.output</name>
  <value>true</value>
</property>
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>

The configuration is now complete: MapReduce output, including the results of Hive queries, is gzip-compressed.
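
One way to confirm the result is to copy an output file from HDFS to the local filesystem (for example with `hdfs dfs -get`) and check that it begins with the gzip magic bytes. The sketch below is a minimal Python illustration under two assumptions: the output is stored as text files, and `is_gzip` is a hypothetical helper name. The demo writes its own temporary files, with illustrative file names, so it runs without a cluster.

```python
import gzip
import os
import tempfile

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path: str) -> bool:
    """Return True if the file at `path` begins with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


if __name__ == "__main__":
    # Self-contained demo: write one gzip file and one plain file, then check both.
    with tempfile.TemporaryDirectory() as tmp:
        gz_path = os.path.join(tmp, "000000_0.gz")  # illustrative file name
        plain_path = os.path.join(tmp, "000000_0")  # illustrative file name
        rows = b"1\tfoo\n2\tbar\n"
        with gzip.open(gz_path, "wb") as f:
            f.write(rows)
        with open(plain_path, "wb") as f:
            f.write(rows)
        print(gz_path, is_gzip(gz_path))
        print(plain_path, is_gzip(plain_path))
```

This check looks only at the first two bytes. It would not apply to container formats such as SequenceFile, where compression is applied inside the file rather than to the file as a whole.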

