I. Job input and output optimization
Use multi-insert and UNION ALL. A UNION ALL over different tables is equivalent to multiple inputs; a UNION ALL over the same table is roughly equivalent to a single map pass emitting multiple outputs.
Example
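A minimal sketch of a multi-insert, assuming order_table has the hypothetical columns order_id and amount alongside sale_time, and that two hypothetical target tables orders_large and orders_small already exist. The source table is scanned once and feeds both inserts:

```sql
-- Hypothetical columns and target tables, for illustration only.
-- order_table is read once; each INSERT clause applies its own filter.
FROM order_table
INSERT OVERWRITE TABLE orders_large
  SELECT order_id, sale_time, amount
  WHERE amount >= 1000
INSERT OVERWRITE TABLE orders_small
  SELECT order_id, sale_time, amount
  WHERE amount < 1000;
```

Written as two separate INSERT ... SELECT statements, the same work would read order_table twice.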
II. Data pruning
2.1. Column pruning
When Hive reads data, it can read only the columns the query needs and ignore the rest. You can even select columns with a regular expression.
See http://www.cnblogs.com/bjlhx/p/6946202.html
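A short sketch of both forms, assuming order_table has a hypothetical order_id column in addition to sale_time. The regular-expression form relies on Hive's regex column specification, which in Hive 0.13 and later requires hive.support.quoted.identifiers to be set to none:

```sql
-- Only order_id and sale_time are read; other columns are skipped.
SELECT order_id, sale_time
FROM order_table;

-- Regex column specification: every column except sale_time.
SET hive.support.quoted.identifiers=none;
SELECT `(sale_time)?+.+`
FROM order_table;
```

Column pruning pays off most with columnar storage formats such as ORC or Parquet, where unread columns are never fetched from disk.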
2.2. Partition pruning
Avoid reading unnecessary partitions during a query.
Example:
SELECT count(*) FROM order_table WHERE to_date(sale_time)='2014-03-03' AND hour(sale_time)=10
After modification (assuming the table is partitioned by a date column, here called dt):
SELECT count(*) FROM order_table WHERE dt='2014-03-03' AND to_date(sale_time)='2014-03-03' AND hour(sale_time)=10
You can use the EXPLAIN DEPENDENCY syntax to list the input tables and input partitions of a query.
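For instance, a sketch that checks which partitions a filtered query would read, where dt is an assumed name for the partition column of order_table:

```sql
-- dt is an assumed partition column name, not taken from the original.
EXPLAIN DEPENDENCY
SELECT count(*)
FROM order_table
WHERE dt = '2014-03-03';
```

The result is a JSON document with input_tables and input_partitions entries. If pruning takes effect, input_partitions should list only the partition for 2014-03-03 rather than every partition of the table.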
III. Using Hive's optimization mechanisms to reduce the number of jobs
Whether the join is an outer join or an inner join, if every join uses the same join key, then no matter how many tables are involved, the joins are merged into a single MapReduce job.
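As an illustration, a sketch with three hypothetical tables a, b and c, each with columns key and val. Both join conditions use b.key, so Hive can evaluate them in one MapReduce job:

```sql
-- Hypothetical tables a, b, c with columns key and val.
-- Both joins use b.key, so they share one shuffle on that key.
SELECT a.val, b.val, c.val
FROM a
JOIN b ON (a.key = b.key)
LEFT OUTER JOIN c ON (c.key = b.key);
```

If the second join instead used a different column of b, Hive would need a second job to reshuffle the intermediate result on that column.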
IV. Sensible use of dynamic partitioning
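A minimal sketch of a dynamic-partition insert, assuming a hypothetical target table orders_by_day partitioned by dt and a hypothetical order_id column in order_table:

```sql
-- Hypothetical target table orders_by_day, partitioned by dt.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column goes last in the SELECT list;
-- Hive creates one partition per distinct dt value.
INSERT OVERWRITE TABLE orders_by_day PARTITION (dt)
SELECT order_id, sale_time, to_date(sale_time) AS dt
FROM order_table;
```

A single statement like this replaces one INSERT per partition. The trade-off is that a partition key with very many distinct values produces many small files and partitions.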