Hive Order by operation

Source: Internet
Author: User

Common advanced queries in Hive include GROUP BY, ORDER BY, JOIN, DISTRIBUTE BY, SORT BY, CLUSTER BY, and UNION ALL. Today we look at ORDER BY, which sorts the query result by one or more columns. The syntax is:

    SELECT col1, col2 ...
    FROM tablename
    WHERE condition
    ORDER BY col1, col2 [ASC|DESC]

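Notes:

(1) ORDER BY can sort by more than one column. The default direction is ascending, and string values are compared in dictionary (lexicographic) order.

(2) ORDER BY is a global sort: the entire result set is ordered.

(3) ORDER BY requires a reduce phase, and it runs with exactly one reducer regardless of the configured reducer count, because several independent reducers cannot produce a single globally sorted output.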
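
As a small sketch of note (1), the query below sorts on two columns and gives each its own direction. It borrows the employees table used later in this article, with its salary column and its type partition column, and it assumes hive.mapred.mode=nonstrict because it has neither a LIMIT clause nor a partition filter.

```sql
-- Sort by the partition column `type` in ascending order (the default),
-- then by salary from highest to lowest within each type.
-- Assumes nonstrict mode: there is no LIMIT and no partition predicate.
SELECT *
FROM employees
ORDER BY type ASC, salary DESC;
```

The direction keyword applies to the column it follows, so a column without ASC or DESC is sorted ascending.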

ORDER BY is affected by the following property:

    set hive.mapred.mode=nonstrict;   (default)
    set hive.mapred.mode=strict;

Note: In strict mode, an ORDER BY statement must include a LIMIT clause. ORDER BY runs on a single reducer, so sorting a very large result set can take a long time; the LIMIT requirement guards against doing that by accident.
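
When a globally ordered result is not actually needed, SORT BY and DISTRIBUTE BY (both listed among the advanced queries above) are a commonly used alternative: DISTRIBUTE BY decides which reducer each row goes to, and SORT BY orders rows within each reducer only, so more than one reducer can be used. The following is a minimal sketch rather than a drop-in replacement, assuming the employees table from this article, partitioned by date_time and type.

```sql
-- Rows with the same `type` go to the same reducer (DISTRIBUTE BY),
-- and each reducer sorts only its own rows (SORT BY).
-- The combined output is ordered per reducer, not globally.
SELECT salary, type
FROM employees
WHERE date_time = '2015-01-24'      -- partition filter
DISTRIBUTE BY type
SORT BY type, salary DESC;
```

If a single, fully ordered result is required, ORDER BY remains the right choice.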

Let's take a look at the use of order by in one example:

The database has an employees table with the following data:

    hive> select * from employees;
    OK
    lavimer 15000.0 ["Li","Lu","Wang"]    {"K1":1.0,"K2":2.0,"K3":3.0}  {"street":"Dingnan","city":"Ganzhou","num":101}  2015-01-24  love
    liao    18000.0 ["Liu",…]             {"K4":2.0,"K5":3.0,"K6":6.0}  {"street":"Dingnan","city":…,"num":102}          2015-01-24  love
    zhang   19000.0 ["Xiao","Wen","Tian"] {"K7":7.0,"K8":8.0,…}         {"street":"Dingnan","city":"Ganzhou","num":103}  2015-01-24  love

Now I want to sort by the second column (salary) in descending order:

    hive> select * from employees order by salary desc;
    ... (MapReduce job progress output omitted) ...
    Job 0: Map: 1  Reduce: 1   Cumulative CPU: 2.62 sec   HDFS Read: 415 HDFS Write: 245 SUCCESS
    Total MapReduce CPU Time Spent: 2 seconds 620 msec
    OK
    zhang   19000.0 ["Xiao","Wen","Tian"] {"K7":7.0,"K8":8.0,…}         {"street":"Dingnan","city":"Ganzhou","num":103}  2015-01-24  love
    liao    18000.0 ["Liu",…]             {"K4":2.0,"K5":3.0,"K6":6.0}  {"street":"Dingnan","city":…,"num":102}          2015-01-24  love
    lavimer 15000.0 ["Li","Lu","Wang"]    {"K1":1.0,"K2":2.0,"K3":3.0}  {"street":"Dingnan","city":"Ganzhou","num":101}  2015-01-24  love
    Time taken: 20.484 seconds
    hive>


At this point the hive.mapred.mode property is:

    hive> set hive.mapred.mode;
    hive.mapred.mode=nonstrict
    hive>


Now let's change it to strict and then use order by to query:

    hive> set hive.mapred.mode=strict;
    hive> select * from employees order by salary desc;
    FAILED: Error in semantic analysis: 1:… In strict mode, if ORDER BY is specified, LIMIT must also be specified. Error encountered near token 'salary'
    hive>

Note: In strict mode, an ORDER BY query must include a LIMIT clause. Adding one gets past this check, but the query still fails for a different reason:

    hive> select * from employees order by salary desc limit 3;
    FAILED: Error in semantic analysis: No partition predicate found for Alias "employees" Table "employees"

Note: Strict mode also restricts queries on partitioned tables: the query must filter on the partition columns. The solution is to specify the partition in the WHERE clause.

First, take a look at the table's partitions:

    hive> show partitions employees;
    OK
    date_time=2015-01-24/type=love
    Time taken: 0.096 seconds


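Now run the ORDER BY query in strict mode with the partition specified. The first attempt uses a partition (...) clause, which is not valid in a WHERE expression and fails to parse; the second attempt filters on the partition columns directly and succeeds: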

    hive> select * from employees where partition (date_time='2015-01-24', type='love') order by salary desc limit 3;
    FAILED: Parse Error: line 1:… cannot recognize input near 'partition' '(' 'date_time' in expression specification
    hive> select * from employees where date_time='2015-01-24' and type='love' order by salary desc limit 3;
    ... (MapReduce job progress output omitted) ...
    Total MapReduce CPU Time Spent: 3 seconds 510 msec
    OK
    zhang   19000.0 ["Xiao","Wen","Tian"] {"K7":7.0,"K8":8.0,…}         {"street":"Dingnan","city":"Ganzhou","num":103}  2015-01-24  love
    liao    18000.0 ["Liu",…]             {"K4":2.0,"K5":3.0,"K6":6.0}  {"street":"Dingnan","city":…,"num":102}          2015-01-24  love
    lavimer 15000.0 ["Li","Lu","Wang"]    {"K1":1.0,"K2":2.0,"K3":3.0}  {"street":"Dingnan","city":"Ganzhou","num":101}  2015-01-24  love
    Time taken: 19.861 seconds
    hive>




