The order in which SQL is executed in Hive

Source: Internet
Author: User

While writing Hive SQL in a test today, I noticed that when we put an ORDER BY at the end of a query and the field it sorts on is not in the SELECT list, the query fails with an error.

Hive reports an error. At first we wonder whether the statement was written incorrectly, but on a closer look there seems to be no mistake in it.

At this point we may wonder whether the problem lies in the order in which Hive executes the clauses of the statement. So I added the field used in the ORDER BY to the preceding SELECT list.

Sure enough, after that change the query runs without a problem. This is a good time to summarize the order in which Hive executes a statement.
The order in which the clauses of a Hive query are executed:

Take a SQL statement written as:

select ... from ... where ... group by ... having ... order by ...

Execution order:

from ... where ... select ... group by ... having ... order by ...
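To make the order above concrete, here is a minimal Python sketch of it. This is only a toy model, not how Hive is implemented: the `run_query` function, the `employees` rows, and the column names are all made up for illustration. It shows that when sorting runs after the SELECT step, the sort can only see the selected columns.

```python
# Toy model of the clause order described above: FROM, WHERE, SELECT, ORDER BY.
# The table contents and column names are hypothetical.

def run_query(table, where, select_cols, order_by):
    rows = list(table)                                      # FROM: load the table
    rows = [r for r in rows if where(r)]                    # WHERE: filter rows
    rows = [{c: r[c] for c in select_cols} for r in rows]   # SELECT: project columns
    # ORDER BY runs on the projected rows, so it only sees selected columns.
    return sorted(rows, key=lambda r: r[order_by])

employees = [
    {"name": "ann", "dept": "ops", "salary": 300},
    {"name": "bob", "dept": "dev", "salary": 100},
    {"name": "cy", "dept": "dev", "salary": 200},
]

def keep_all(row):
    return True

# Sort column missing from the SELECT list: the sort step cannot find it.
try:
    run_query(employees, keep_all, ["name"], "salary")
    missing_column_failed = False
except KeyError:
    missing_column_failed = True

# Sort column added to the SELECT list: the same query now works.
ordered = run_query(employees, keep_all, ["name", "salary"], "salary")
names_in_order = [r["name"] for r in ordered]
```

In this model the first call fails because `salary` was dropped by the projection, and the second call succeeds once `salary` is selected as well. Whether a real Hive installation rejects such a query may depend on its version.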

In fact, summarizing Hive's execution order amounts to summarizing the execution sequence of the MapReduce job that the query is turned into:

The execution order of the MapReduce (MR) program:

Map phase:

1. Execute FROM: locate the table and load its data.

2. Execute WHERE: filter the rows by the given conditions.

3. Execute SELECT: pick out the columns to output.

4. Execute GROUP BY: group the rows and set up the functions that need to be computed after grouping.

5. Map file merge: the spill files written locally on the map side are merged, so that each map task ends up with one temporary file. The map output is then sent, by key column, to the corresponding reduce task.
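The map-side steps can be sketched roughly like this in Python. It is a simplified model under stated assumptions, not Hive or Hadoop code: `map_task`, `partition`, `NUM_REDUCERS`, and the sample rows are hypothetical, the query is assumed to be a sum of `salary` per `dept`, and the map side is assumed to compute partial sums per key.

```python
import zlib
from collections import defaultdict

NUM_REDUCERS = 2  # hypothetical number of reduce tasks

def map_task(rows):
    """One map task for a query like: sum of salary per dept, where salary > 0."""
    partial = defaultdict(int)
    for row in rows:                                # 1. FROM: read this task's input
        if row["salary"] <= 0:                      # 2. WHERE: drop rows failing the filter
            continue
        dept, salary = row["dept"], row["salary"]   # 3. SELECT: keep the needed columns
        partial[dept] += salary                     # 4. GROUP BY: partial sum per key
    return dict(partial)                            # 5. one merged output per map task

def partition(key):
    """Pick the reduce task for a key; crc32 keeps the choice stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % NUM_REDUCERS

split = [
    {"dept": "dev", "salary": 100},
    {"dept": "ops", "salary": 300},
    {"dept": "dev", "salary": 200},
    {"dept": "ops", "salary": 0},
]
map_output = map_task(split)
```

Every row with the same key gets the same `partition` value, which is what lets one reduce task see all the data for a given group.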

Reduce phase:

1. GROUP BY: group the data sent over from the map side and compute the results for each group.

2. SELECT: make the final column selection for the output.

3. LIMIT: after sorting, the results are written to a file on HDFS.
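The reduce-side steps can be sketched in the same spirit. Again this is a toy model, not Hive code: `reduce_phase` and the sample `map_outputs` are hypothetical, a single reduce task is assumed, and each map task is assumed to have sent per-key partial sums.

```python
from collections import defaultdict

def reduce_phase(map_outputs, limit):
    totals = defaultdict(int)
    for partial in map_outputs:                    # data sent over from each map task
        for dept, subtotal in partial.items():
            totals[dept] += subtotal               # 1. GROUP BY: merge the sums per key
    rows = [{"dept": d, "total": t} for d, t in totals.items()]  # 2. SELECT: output columns
    rows.sort(key=lambda r: r["total"], reverse=True)  # sort on the selected columns
    return rows[:limit]                            # 3. LIMIT: rows that would be written out

map_outputs = [
    {"dev": 300, "ops": 300},
    {"dev": 50, "qa": 120},
]
result = reduce_phase(map_outputs, limit=2)
```

Note that the sort key `total` exists only because the SELECT step produced it, which matches the observation that sorting works on the table formed by SELECT.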

So from the example above we can see that the SELECT step forms a table, and the sorting is done on that table. This is why the field used in ORDER BY has to appear in the SELECT list.
