Basic Hive concepts
Hive is a Hadoop-based data warehouse tool. It currently supports simple SQL-style queries and modification operations similar to those of traditional relational databases, and it translates SQL statements directly into MapReduce jobs, so developers do not have to write MapReduce programs themselves, which improves development efficiency.
Example: in a Hive deployment whose metastore is backed by MySQL, Hive's metadata (databases, tables, column attributes, and other catalog information) is stored in the MySQL database, while the table data itself lives in HDFS, by default under /user/hive/warehouse/&lt;database&gt;.db.
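To see where a given table's data lives, you can ask the metastore from the Hive CLI. A minimal sketch (`some_table` is a hypothetical table name):

```sql
-- DESCRIBE FORMATTED prints the table's metadata from the metastore,
-- including a Location line with the table's HDFS directory
-- (table name and path here are hypothetical).
DESCRIBE FORMATTED some_table;
```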
DDL statements
MySQL acts only as the catalog here: it stores the structure of Hive databases and tables, which the statements below define and manage.
Create a table
hive> CREATE TABLE test (id INT, name STRING);
Partitions are introduced because a plain SELECT in Hive usually scans the entire table, which wastes a lot of time; with partitions, a query can read only the partitions it actually needs.
hive> CREATE TABLE test2 (id INT, name STRING) PARTITIONED BY (ds STRING);
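Once the table is partitioned, each load can target a single `ds` value, and a query that filters on `ds` reads only that partition. A sketch for illustration (the file path and date are made up):

```sql
-- Load one day's data into its own partition.
LOAD DATA LOCAL INPATH '/home/hadoop/day1.txt'
  OVERWRITE INTO TABLE test2 PARTITION (ds = '2009-09-09');

-- Filtering on the partition column prunes the scan
-- to the single matching partition.
SELECT * FROM test2 WHERE ds = '2009-09-09';
```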
Browse tables
hive> SHOW TABLES;
SHOW TABLES also accepts a regular-expression pattern, similar to LIKE:
hive> SHOW TABLES '.*t';
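The quoted pattern is matched as a regular expression against table names, so other patterns work the same way (the table names matched here are hypothetical):

```sql
SHOW TABLES 'test.*';  -- tables whose names start with "test"
SHOW TABLES '.*2';     -- tables whose names end with "2"
```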
View a table's structure
hive> DESCRIBE test;   -- or: DESC test;
Modify or delete a table
hive> ALTER TABLE test RENAME TO test3;
hive> ALTER TABLE test3 ADD COLUMNS (new_column type COMMENT 'annotation');
hive> DROP TABLE test3;
DML statements
1. Import Data
hive> LOAD DATA LOCAL INPATH '/home/hadoop/test.txt' OVERWRITE INTO TABLE test;
LOCAL means the file is read from the local filesystem; without it, the path refers to a file on HDFS, which is moved into the table's warehouse directory. OVERWRITE replaces the data already in the table; omit it to append instead.
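These variants can be sketched side by side (the extra paths are hypothetical):

```sql
-- Local file, replacing the table's current contents:
LOAD DATA LOCAL INPATH '/home/hadoop/test.txt' OVERWRITE INTO TABLE test;

-- Local file, appended to the existing contents (no OVERWRITE):
LOAD DATA LOCAL INPATH '/home/hadoop/more.txt' INTO TABLE test;

-- File already on HDFS (no LOCAL): the file is moved into the
-- warehouse directory; here it is appended rather than overwritten.
LOAD DATA INPATH '/tmp/staged/test.txt' INTO TABLE test;
```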
2. Execute a query
hive> SELECT * FROM test2 WHERE test2.ds = '2017-08-26';
3. It is worth noting that select count(*) launches a MapReduce job, unlike the near-instant record counts of a relational database:
hive> SELECT COUNT(*) FROM test2;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=&lt;number&gt;
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=&lt;number&gt;
In order to set a constant number of reducers:
  set mapred.reduce.tasks=&lt;number&gt;
Starting Job = job_1411720827309_0004, Tracking URL = http://master:8031/proxy/application_1411720827309_0004/
Kill Command = /usr/local/cloud/hadoop/bin/hadoop job -kill job_1411720827309_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
Stage-1 map = 0%, reduce = 0%
Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.93 sec
Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.3 sec
MapReduce Total cumulative CPU time: 2 seconds 300 msec
Ended Job = job_1411720827309_0004
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1  Cumulative CPU: 2.3 sec  HDFS Read: 245 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 300 msec
OK
3
Time taken: 27.508 seconds, Fetched: 1 row(s)
Basic Hive execution statements