016-hadoop Hive SQL syntax in detail 6: job input/output optimization, data pruning, reducing the number of jobs, dynamic partitioning

Source: Internet
Author: User

I. Job input and output optimization

Use multi-insert and UNION ALL. A UNION ALL over different tables is equivalent to multiple inputs; a UNION ALL over the same table is roughly equivalent to the map output.

Example
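A minimal sketch of a multi-insert, assuming hypothetical target tables orders_morning and orders_afternoon and a hypothetical order_id column (only order_table and sale_time appear in this article). The source table is read once and each INSERT clause applies its own filter:

```sql
-- Hypothetical tables and columns, for illustration only.
-- order_table is scanned a single time for both inserts.
FROM order_table
INSERT OVERWRITE TABLE orders_morning
  SELECT order_id, sale_time
  WHERE hour(sale_time) < 12
INSERT OVERWRITE TABLE orders_afternoon
  SELECT order_id, sale_time
  WHERE hour(sale_time) >= 12;
```

Written as two separate INSERT ... SELECT statements, the same work would scan order_table twice.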

  

II. Data pruning

2.1. Column pruning

When Hive reads data, it can read only the columns the query needs and ignore the others. This also holds for columns that are referenced only inside an expression.

See: http://www.cnblogs.com/bjlhx/p/6946202.html
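As a rough illustration of column pruning, the two queries below differ only in the columns they request. The order_id column is a hypothetical name, not taken from the article:

```sql
-- Requests every column of the table.
SELECT *
FROM order_table
WHERE to_date(sale_time) = '2014-03-03';

-- Requests only order_id, plus sale_time for the filter expression;
-- the remaining columns can be skipped.
SELECT order_id
FROM order_table
WHERE to_date(sale_time) = '2014-03-03';
```

The saving is usually most visible with columnar storage formats such as ORC, where unused columns need not be read from disk.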

2.2. Partition pruning

Avoid reading unnecessary partitions during a query.

Example:

select count(*) from order_table where to_date(sale_time)='2014-03-03' and hour(sale_time)=10

After modification, with an added filter on the partition column so that only the matching partition is read:

select count(*) from order_table where <partition_column>='2014-03-03' and to_date(sale_time)='2014-03-03' and hour(sale_time)=10

You can use the EXPLAIN DEPENDENCY syntax to list the input tables and input partitions of a query.
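A short sketch of such a check, assuming order_table is partitioned by a hypothetical column named dt (the article does not give the partition column name):

```sql
-- dt is an assumed partition column name, for illustration only.
EXPLAIN DEPENDENCY
SELECT count(*)
FROM order_table
WHERE dt = '2014-03-03'
  AND hour(sale_time) = 10;
```

Hive returns a JSON document describing the input tables and input partitions. If the partition filter works, only the partition for 2014-03-03 should be listed.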

    

III. Using Hive's optimization mechanisms to reduce the number of jobs

Whether the joins are outer joins or inner joins, if they all use the same join key, Hive merges them into a single MapReduce job no matter how many tables are involved.
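The rule can be sketched with three hypothetical tables a, b and c (names and columns are assumptions for illustration). In the first query both joins use b.key1, so a single MapReduce job is expected; in the second, b takes part with two different keys, so the joins cannot be merged:

```sql
-- Same join key on b (key1) in both joins: one MapReduce job.
SELECT a.val, b.val, c.val
FROM a
JOIN b ON (a.key = b.key1)
LEFT OUTER JOIN c ON (c.key = b.key1);

-- Different join keys on b (key1, then key2): two MapReduce jobs.
SELECT a.val, b.val, c.val
FROM a
JOIN b ON (a.key = b.key1)
JOIN c ON (c.key = b.key2);
```

Running EXPLAIN on each query is one way to confirm how many stages Hive actually plans.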

  

IV. Rational use of dynamic partitioning
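A minimal sketch of a dynamic-partition insert, assuming a hypothetical target table order_table_by_day partitioned by a hypothetical column dt, and a hypothetical order_id column. Hive derives the partition value from the last column of the SELECT list:

```sql
-- Enable dynamic partitioning; nonstrict mode allows every
-- partition column to be determined dynamically.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- order_table_by_day, order_id and dt are assumed names.
-- The dynamic partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE order_table_by_day PARTITION (dt)
SELECT order_id, sale_time, to_date(sale_time) AS dt
FROM order_table;
```

One statement fills all day partitions in a single pass, instead of one INSERT per static partition. A query that produces a very large number of partitions can create many small files, so it is worth checking the partition count first.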

  
