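Hive Optimization: Automatically Merging Small Output Files
1. First, define what counts as a small file in hive-site.xml. The value is in bytes; 536870912 is 512 MB.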
<property>
<name>hive.merge.smallfiles.avgsize</name>
<value>536870912</value>
<description>When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files. This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.</description>
</property>
2. Enable merging of small output files for map-only MapReduce jobs.
<property>
<name>hive.merge.mapfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-only job</description>
</property>
3. Enable merging of small output files for MapReduce jobs that include a reduce phase.
<property>
<name>hive.merge.mapredfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-reduce job</description>
</property>
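Read together, the three properties describe a simple rule: if merging is enabled for the job type and the average output file size is below the threshold, Hive launches an extra job to merge the files. The Python sketch below only illustrates that rule as stated in the property descriptions; it is not Hive's actual implementation, and the function name `should_merge` and its parameters are hypothetical.

```python
# Illustration only: mirrors the rule described in the hive-site.xml
# property descriptions above. Not Hive's real code.

MB = 1024 * 1024

# hive.merge.smallfiles.avgsize from step 1 (536870912 bytes = 512 MB)
AVG_SIZE_THRESHOLD = 512 * MB


def should_merge(file_sizes, map_only,
                 merge_mapfiles=True,      # hive.merge.mapfiles (step 2)
                 merge_mapredfiles=True,   # hive.merge.mapredfiles (step 3)
                 threshold=AVG_SIZE_THRESHOLD):
    """Return True if an extra merge job would be started (hypothetical helper).

    file_sizes: sizes in bytes of the files a job wrote.
    map_only:   True for a map-only job, False for a job with a reduce phase.
    """
    if not file_sizes:
        return False  # nothing was written, so nothing to merge

    # Pick the switch that applies to this kind of job.
    enabled = merge_mapfiles if map_only else merge_mapredfiles
    if not enabled:
        return False

    average = sum(file_sizes) / len(file_sizes)
    return average < threshold


if __name__ == "__main__":
    # Ten 100 MB files from a map-only job: average is below 512 MB.
    print(should_merge([100 * MB] * 10, map_only=True))   # True
    # Three 600 MB files from a map-reduce job: average is above 512 MB.
    print(should_merge([600 * MB] * 3, map_only=False))   # False
```

Note that the check is on the average size, so a job that writes one very large file and many tiny ones may not trigger a merge even though most of its files are small.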