Hive--Index operations


When reprinting, please cite the source: https://blog.csdn.net/l1028386804/article/details/80184742

Indexing was introduced in Hive 0.7. Before creating an index, evaluate whether it is justified: an index takes up disk space of its own and adds a maintenance cost.
Create an index

hive> CREATE INDEX index_studentid ON TABLE student (studentid)
> AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
> WITH DEFERRED REBUILD
> IN TABLE index_table_student;
OK
Time taken: 15.219 seconds
hive>
org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler: the handler class that implements the index
index_studentid: index name
student: name of the table being indexed
index_table_student: name of the table that stores the index data

View the index table

Because the index was created WITH DEFERRED REBUILD, the index table (index_table_student) contains no data yet.

hive> SELECT * FROM index_table_student;
OK
Time taken: 0.295 seconds
Loading index data
hive> ALTER INDEX index_studentid ON student REBUILD;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
Query ID = root_20161226235345_5b3fcc2b-7f90-4b10-861f-31cbaed8eb73
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1482824475750_0001, Tracking URL = http://liuyazhuang121:8088/proxy/application_1482824475750_0001/
Kill Command = /usr/local/development/hadoop-2.6.4/bin/hadoop job -kill job_1482824475750_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-05-02 23:55:40,317 Stage-1 map = 0%, reduce = 0%
2018-05-02 23:56:40,757 Stage-1 map = 0%, reduce = 0%
2018-05-02 23:56:48,768 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.08 sec
2018-05-02 23:57:34,981 Stage-1 map = 100%, reduce = 67%, Cumulative CPU 3.66 sec
2018-05-02 23:57:40,716 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.68 sec
MapReduce Total cumulative CPU time: 4 seconds 680 msec
Ended Job = job_1482824475750_0001
Loading data to table default.index_table_student
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.68 sec HDFS Read: 10282 HDFS Write: 537 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 680 msec
OK
Time taken: 280.693 seconds
Querying data in the index table
hive> SELECT * FROM index_table_student;
OK
1 hdfs://liuyazhuang121:8020/opt/hive/warehouse/student/sutdent.txt [0]
2 hdfs://liuyazhuang121:8020/opt/hive/warehouse/student/sutdent.txt [3]
3 hdfs://liuyazhuang121:8020/opt/hive/warehouse/student/sutdent.txt [...]
4 hdfs://liuyazhuang121:8020/opt/hive/warehouse/student/sutdent.txt [...]
5 hdfs://liuyazhuang121:8020/opt/hive/warehouse/student/sutdent.txt [113]
6 hdfs://liuyazhuang121:8020/opt/hive/warehouse/student/sutdent.txt [143]
Time taken: 2.055 seconds, Fetched: 6 row(s)
hive>
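
Each row of the index table pairs an indexed value with the data file that contains it and the byte offsets where matching rows begin. The Python sketch below is a simplified model of that mapping for a newline-delimited text file, not Hive's actual implementation. The function name build_compact_index, the file name student.txt, and the three sample rows are all hypothetical and are not the article's data.

```python
from collections import defaultdict


def build_compact_index(path, data, key_column=0):
    """Return {key: {path: [byte offsets]}} for newline-delimited rows in `data` (bytes)."""
    index = defaultdict(lambda: defaultdict(list))
    offset = 0
    for line in data.splitlines(keepends=True):
        fields = line.split()
        if fields:
            # The key is the integer value of the indexed column, e.g. b"001" -> 1.
            key = int(fields[key_column].decode("utf-8"))
            index[key][path].append(offset)
        offset += len(line)  # advance by the full line, newline included
    return index


# Hypothetical three-row file, not the article's data.
SAMPLE = b"001 0 beijing\n002 1 shanghai\n003 0 shenzhen\n"
index = build_compact_index("student.txt", SAMPLE)
for key in sorted(index):
    for file_path, offsets in index[key].items():
        print(key, file_path, offsets)
```

Every printed line has the same three parts as a row of index_table_student: the key, the file, and a list of offsets.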
View hdfs://liuyazhuang121:8020/opt/hive/warehouse/student/sutdent.txt
[root@liuyazhuang121 ~]# hdfs dfs -text /opt/hive/warehouse/student/sutdent.txt
001 0 Beijing xinlang@.com
002 1 shanghai xinlang@.com
003 0 shegzhen xinlang@.com
004 1 Nanjing xinlang@.com
005 0 Guangdong xinlang@.com
006 1 Hainan xinlang@.com
[root@liuyazhuang121 ~]#
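
An index entry is useful because a reader that knows a row's byte offset can jump straight to that row in a file like this one instead of scanning from the start. The Python sketch below illustrates the idea on an in-memory file. It is an assumption-laden model, not how Hive reads data: read_rows_at, the sample rows, and the OFFSETS_BY_ID table are hypothetical.

```python
import io


def read_rows_at(stream, offsets):
    """Seek to each byte offset in a binary stream and return the line that starts there."""
    rows = []
    for offset in sorted(offsets):
        stream.seek(offset)
        rows.append(stream.readline().rstrip(b"\r\n").decode("utf-8"))
    return rows


# Hypothetical data file and the offsets an index might hold for it.
SAMPLE = b"001 0 beijing\n002 1 shanghai\n003 0 shenzhen\n"
OFFSETS_BY_ID = {1: [0], 2: [14], 3: [29]}

# Fetch only the row for student id 2, without reading the other rows.
rows = read_rows_at(io.BytesIO(SAMPLE), OFFSETS_BY_ID[2])
print(rows)
```

The offsets are sorted before seeking so the file is read in one forward pass.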
Delete the index
DROP INDEX index_studentid ON student;
View indexes
hive> SHOW INDEX ON student;
OK
index_studentid         student               studentid               index_table_student    compact
Time taken: 0.487 seconds, Fetched: 1 row(s)
