Hive usage summary

It has been almost a year since I started working with Hive. Here is a summary of what I have learned from that experience:
1) Keep in mind that Hive is just a Hadoop-based data-warehouse tool that converts SQL into MapReduce jobs. Its strength is data statistics: development and testing are convenient and flexible, but complex logic is hard to express in a single query. For complex ETL logic, use temporary tables to process it in stages, or write MapReduce programs directly.
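As a concrete sketch of the staged approach above: each stage is one HQL statement, and later stages read the temporary table produced by earlier ones. The table and column names (`raw_events`, `stage_daily`, `report`) and the `run_hql` callback are made up for illustration; in practice the callback would submit the statement to Hive.

```python
# A minimal sketch of staging complex ETL with temporary tables.
# All table/column names here are hypothetical.

STAGES = [
    # Stage 1: shrink the raw data down to just what downstream steps need.
    """CREATE TEMPORARY TABLE stage_daily AS
       SELECT user_id, to_date(event_time) AS dt, COUNT(*) AS events
       FROM raw_events
       GROUP BY user_id, to_date(event_time)""",
    # Stage 2: the final aggregation reads the small temporary table.
    """INSERT OVERWRITE TABLE report
       SELECT dt, COUNT(DISTINCT user_id) AS users, SUM(events) AS events
       FROM stage_daily
       GROUP BY dt""",
]

def run_stages(stages, run_hql):
    """Execute each ETL stage in order with the supplied HQL runner."""
    for hql in stages:
        run_hql(hql)  # e.g. subprocess.run(["hive", "-e", hql], check=True)
```

Because each stage is a separate statement, a failure points directly at the stage that broke, and intermediate results can be inspected in the temporary table.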
2) Check whether your Hive SQL can cause data skew. Solving skew starts with understanding your data distribution: for example, whether some keys occur many times more often than others, or whether the join keys are frequently empty (NULL).
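One cheap way to check the distribution described above is to sample the join or group key and compare the hottest key's frequency to a typical key's. This is a hand-rolled heuristic, not a Hive feature; the threshold is up to you.

```python
from collections import Counter

def skew_ratio(keys):
    """Ratio of the most frequent key's count to the median key count.
    A large ratio (say, over 10) suggests the join/group key is skewed
    and one reducer will receive far more rows than the others."""
    counts = sorted(Counter(keys).values())
    median = counts[len(counts) // 2]
    return counts[-1] / median

# A skewed sample: one value dominates, as an empty join key often does.
sample = ["null"] * 90 + ["a", "b", "c", "d", "e"] * 2
```

For example, `skew_ratio(sample)` is 45.0, a clear red flag, while a uniform key distribution gives a ratio near 1.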
3) A stable scheduling system is very important. Hive and Tez jobs can fail for unexpected, transient reasons, so a scheduler that automatically re-runs a failed online step 2 or 3 times saves a great deal of manual intervention.
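The automatic re-run behavior above amounts to a retry wrapper around each scheduled step. A minimal sketch, assuming each step is a callable that raises on failure:

```python
import time

def run_with_retries(step, max_attempts=3, backoff_seconds=0):
    """Run one scheduler step, retrying on failure.

    Transient Hive/Tez errors often succeed on a re-run, so retry up to
    max_attempts times before giving up and surfacing the error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # exhausted retries: let the scheduler alert a human
            time.sleep(backoff_seconds)  # wait before the automatic re-run
```

A real scheduler would also log each attempt and use a non-zero backoff; those are omitted here to keep the sketch short.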
4) When driving Hive from Perl or Python, run one HQL statement per step as much as possible. That makes unexpected errors much easier to locate and recover from.
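The one-statement-per-step idea above can be sketched as a loop that submits statements individually and reports exactly which step failed, rather than losing the error inside one big multi-statement invocation. `run_hql` is again a placeholder for however you submit HQL (e.g. the `hive -e` CLI).

```python
def run_statements(statements, run_hql):
    """Run HQL statements one at a time.

    On failure, raise an error naming the failing step and statement so
    the pipeline can be resumed or debugged from that exact point."""
    for i, hql in enumerate(statements, 1):
        try:
            run_hql(hql)  # e.g. subprocess.run(["hive", "-e", hql], check=True)
        except Exception as exc:
            raise RuntimeError(f"step {i} failed: {hql!r}") from exc
```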
5) Try to understand how HQL is translated into MapReduce jobs; this helps with both performance tuning and troubleshooting.
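To make the translation above concrete, here is a toy simulation of how a `SELECT key, COUNT(*) ... GROUP BY key` query maps onto the three MapReduce phases. This is an illustrative model of the execution pattern, not Hive's actual implementation:

```python
from collections import defaultdict

def group_by_count(rows, key_col):
    """Simulate GROUP BY key / COUNT(*) as map -> shuffle -> reduce."""
    # Map phase: each mapper emits a (key, 1) pair per input row.
    mapped = [(row[key_col], 1) for row in rows]
    # Shuffle phase: all pairs with the same key are routed to one reducer.
    # This is also where data skew bites: a hot key overloads one reducer.
    shuffled = defaultdict(list)
    for key, one in mapped:
        shuffled[key].append(one)
    # Reduce phase: each reducer sums the values for its key.
    return {key: sum(ones) for key, ones in shuffled.items()}
```

Seeing the query this way makes it obvious, for instance, why a skewed key slows the whole job: every row for that key lands on a single reducer in the shuffle phase.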