Common advanced queries in Hive include GROUP BY, ORDER BY, JOIN, DISTRIBUTE BY, SORT BY, CLUSTER BY, and UNION ALL. Today we look at ORDER BY, which sorts the query result by one or more columns. The syntax is:
- SELECT col1, col2 ...
- FROM tableName
- WHERE condition
- ORDER BY col1, col2 [ASC|DESC]
Notes:
(1) ORDER BY can sort by more than one column. The default direction is ascending, and strings are compared in dictionary (lexicographic) order.
(2) ORDER BY is a global sort.
(3) ORDER BY requires a reduce phase with exactly one reducer, and that number cannot be configured, because multiple reducers cannot produce a globally sorted result.
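Point (3) can be illustrated with a small Python model. This is only a sketch and does not describe Hive's internals: the function name reduce_side_sort, the round-robin distribution of rows to reducers, and the three sample salary values are assumptions made for illustration.

```python
def reduce_side_sort(rows, num_reducers):
    """Toy model: deal rows to reducers round-robin, sort each reducer's
    share in descending order, then concatenate the reducer outputs."""
    shares = [rows[i::num_reducers] for i in range(num_reducers)]
    sorted_shares = [sorted(share, reverse=True) for share in shares]
    return [row for share in sorted_shares for row in share]


salaries = [15000.0, 18000.0, 19000.0]  # hypothetical sample values

single = reduce_side_sort(salaries, num_reducers=1)
multi = reduce_side_sort(salaries, num_reducers=2)

print(single)  # [19000.0, 18000.0, 15000.0]
print(multi)   # [19000.0, 15000.0, 18000.0]
```

With one reducer the output is fully ordered. With two reducers each share is ordered on its own, but the concatenated result is not, which is why a global sort is funneled through a single reducer.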
ORDER BY is affected by the following property:
- set hive.mapred.mode=nonstrict; (the default in the Hive version used here)
- set hive.mapred.mode=strict;
Note: In strict mode, an ORDER BY statement must include a LIMIT clause. ORDER BY runs with a single reducer, so execution can take a long time when the result set to be sorted is large.
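To see why a LIMIT keeps the single reducer's work bounded, here is a conceptual Python sketch. It is not a description of how Hive executes the query: the function top_n_with_limit, the two mapper outputs, and the idea that each mapper forwards only its largest rows are assumptions made for illustration.

```python
import heapq


def top_n_with_limit(mapper_outputs, limit):
    """Each mapper keeps only its `limit` largest values; the single reducer
    merges those short lists and keeps the overall top `limit`."""
    forwarded = [heapq.nlargest(limit, rows) for rows in mapper_outputs]
    reducer_input = [row for rows in forwarded for row in rows]
    return heapq.nlargest(limit, reducer_input), len(reducer_input)


# Hypothetical salary values produced by two mappers.
mapper_outputs = [
    [15000.0, 9000.0, 12000.0, 7000.0],
    [18000.0, 19000.0, 11000.0, 8000.0],
]

top, reducer_rows = top_n_with_limit(mapper_outputs, limit=3)
print(top)           # [19000.0, 18000.0, 15000.0]
print(reducer_rows)  # 6
```

In this model the reducer receives at most limit rows per mapper (6 here instead of all 8), so the amount of data it sorts no longer grows with the size of the table.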
Let's look at ORDER BY in an example.
The database has an employees table with the following data:
- hive> select * from employees;
- OK
- lavimer 15000.0 ["Li","Lu","Wang"] {"K1":1.0,"K2":2.0,"K3":3.0} {"street":"Dingnan","city":"Ganzhou","num":101} 2015-01-24 love
- liao 18000.0 ["Liu",...] {"K4":2.0,"K5":3.0,"K6":6.0} {"street":"Dingnan","city":...,"num":102} 2015-01-24 love
- zhang 19000.0 ["Xiao","Wen","Tian"] {"K7":7.0,"K8":8.0} {"street":"Dingnan","city":"Ganzhou","num":103} 2015-01-24 love
Now I want to sort by the second column (salary) in descending order:
- hive> select * from employees order by salary desc;
- (MapReduce job execution output)
- Job 0: Map: 1 Reduce: 1 Cumulative CPU: 2.62 sec HDFS Read: 415 HDFS Write: 245 SUCCESS
- Total MapReduce CPU Time Spent: 2 seconds 620 msec
- OK
- zhang 19000.0 ["Xiao","Wen","Tian"] {"K7":7.0,"K8":8.0} {"street":"Dingnan","city":"Ganzhou","num":103} 2015-01-24 love
- liao 18000.0 ["Liu",...] {"K4":2.0,"K5":3.0,"K6":6.0} {"street":"Dingnan","city":...,"num":102} 2015-01-24 love
- lavimer 15000.0 ["Li","Lu","Wang"] {"K1":1.0,"K2":2.0,"K3":3.0} {"street":"Dingnan","city":"Ganzhou","num":101} 2015-01-24 love
- Time taken: 20.484 seconds
- hive>
At this point the hive.mapred.mode property is:
- hive> set hive.mapred.mode;
- hive.mapred.mode=nonstrict
- hive>
Now let's change it to strict and run the same ORDER BY query again:
- hive> set hive.mapred.mode=strict;
- hive> select * from employees order by salary desc;
- FAILED: Error in semantic analysis: 1: In strict mode, if ORDER BY is specified, LIMIT must also be specified. Error encountered near token 'salary'
- hive>
Note: In strict mode, an ORDER BY query must include a LIMIT clause.
- hive> select * from employees order by salary desc limit 3;
- FAILED: Error in semantic analysis: No partition predicate found for Alias "employees" Table "employees"
Note: Strict mode also restricts queries on partitioned tables: the query must filter on the partition columns. The solution is to specify the partition in the WHERE clause.
First, look at the table's partitions:
- hive> show partitions employees;
- OK
- date_time=2015-01-24/type=love
- Time taken: 0.096 seconds
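Now run the ORDER BY query in strict mode with the partition specified. The first attempt uses an invalid partition (...) syntax and fails to parse; the second filters on the partition columns in the WHERE clause and succeeds: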
- hive> select * from employees where partition (date_time='2015-01-24', type='love') order by salary desc limit 3;
- FAILED: Parse Error: line 1: cannot recognize input near 'partition' '(' 'date_time' in expression specification
- hive> select * from employees where date_time='2015-01-24' and type='love' order by salary desc limit 3;
- (MapReduce job execution output)
- Total MapReduce CPU Time Spent: 3 seconds 510 msec
- OK
- zhang 19000.0 ["Xiao","Wen","Tian"] {"K7":7.0,"K8":8.0} {"street":"Dingnan","city":"Ganzhou","num":103} 2015-01-24 love
- liao 18000.0 ["Liu",...] {"K4":2.0,"K5":3.0,"K6":6.0} {"street":"Dingnan","city":...,"num":102} 2015-01-24 love
- lavimer 15000.0 ["Li","Lu","Wang"] {"K1":1.0,"K2":2.0,"K3":3.0} {"street":"Dingnan","city":"Ganzhou","num":101} 2015-01-24 love
- Time taken: 19.861 seconds
- hive>
Hive ORDER BY operation