Learning DDL Statements
Statements for creating objects
Create/drop/alter Database
Create Database
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment]
  [LOCATION hdfs_path]
  [WITH DBPROPERTIES (property_name=property_value, ...)];
Drop Database
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
Alter Database
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;
Use Database
USE database_name;
USE DEFAULT;
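A minimal sketch putting the statements above together; the database name mydb is hypothetical:

```sql
-- Create a database if it does not already exist, with a comment.
CREATE DATABASE IF NOT EXISTS mydb COMMENT 'demo database';

-- Switch to it, so unqualified table names resolve against mydb.
USE mydb;

-- Switch back to the default database, then drop mydb.
-- CASCADE also drops any tables the database still contains.
USE DEFAULT;
DROP DATABASE IF EXISTS mydb CASCADE;
```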
When Hive is running, its metadata is stored in a relational database, because Hive needs this mapping data to be read quickly.
Hive ships with an embedded relational database (Derby), but it is unreliable for real use, so we don't use it.
Instead, we set up a relational database ourselves.
Install a relational database (MySQL)
We had MySQL installed when we installed Linux.
Start MySQL
Check to see if MySQL is listening
Port 3306, yes.
Connect to MySQL
Connections are restricted by default, and for Hive to talk to MySQL we also need the JDBC driver (mysql-connector-java) on Hive's classpath.
Set up remote login issues in MySQL
Enter:
use mysql;
select * from user;
GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY '123456';
This grants all privileges on all tables in all databases to the root user connecting from any host; the password is 123456.
Check to see if it's successful
Modifying Hive's configuration file
Configure the MySQL connection URL
Modify the user name and password
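The settings above go in Hive's configuration file (typically hive-site.xml). A sketch, assuming MySQL runs locally on port 3306 with the root/123456 credentials from above and a metastore database named hive:

```xml
<!-- Metastore connection URL: the MySQL host, port, and database are assumptions. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<!-- JDBC driver class from mysql-connector-java. -->
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<!-- The MySQL user name and password granted above. -->
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>
```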
We create a database named hive in MySQL.
Enter Hive
Launch Hive; once it is up, exit with
quit;
Then view the newly created metastore tables in MySQL
Exit
Learn the DDL statements for Hive
Create Table
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name    -- (Note: TEMPORARY available in Hive 0.14.0 and later)
  [(col_name data_type [COMMENT col_comment], ...)]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...)                 -- (Note: available in Hive 0.10.0 and later)
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]]
  [
   [ROW FORMAT row_format]
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: available in Hive 0.6.0 and later)
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]  -- (Note: available in Hive 0.6.0 and later)
  [AS select_statement];                               -- (Note: available in Hive 0.5.0 and later; not supported for external tables)

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  LIKE existing_table_or_view_name
  [LOCATION hdfs_path];

data_type
  : primitive_type
  | array_type
  | map_type
  | struct_type
  | union_type   -- (Note: available in Hive 0.7.0 and later)

primitive_type
  : TINYINT
  | SMALLINT
  | INT
  | BIGINT
  | BOOLEAN
  | FLOAT
  | DOUBLE
  | STRING
  | BINARY       -- (Note: available in Hive 0.8.0 and later)
  | TIMESTAMP    -- (Note: available in Hive 0.8.0 and later)
  | DECIMAL      -- (Note: available in Hive 0.11.0 and later)
  | DECIMAL(precision, scale)  -- (Note: available in Hive 0.13.0 and later)
  | DATE         -- (Note: available in Hive 0.12.0 and later)
  | VARCHAR      -- (Note: available in Hive 0.12.0 and later)
  | CHAR         -- (Note: available in Hive 0.13.0 and later)

array_type
  : ARRAY < data_type >

map_type
  : MAP < primitive_type, data_type >

struct_type
  : STRUCT < col_name : data_type [COMMENT col_comment], ... >

union_type
  : UNIONTYPE < data_type, data_type, ... >  -- (Note: available in Hive 0.7.0 and later)

row_format
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]]
        [COLLECTION ITEMS TERMINATED BY char]
        [MAP KEYS TERMINATED BY char]
        [LINES TERMINATED BY char]
        [NULL DEFINED AS char]   -- (Note: available in Hive 0.13 and later)
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]

file_format
  : SEQUENCEFILE
  | TEXTFILE     -- (Default, depending on hive.default.fileformat configuration)
  | RCFILE       -- (Note: available in Hive 0.6.0 and later)
  | ORC          -- (Note: available in Hive 0.11.0 and later)
  | PARQUET      -- (Note: available in Hive 0.13.0 and later)
  | AVRO         -- (Note: available in Hive 0.14.0 and later)
  | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
Example:
create table table_name (
  id                int,
  dtDontQuery       string,
  name              string
)
partitioned by (date string);
An example
CREATE TABLE page_view(
  viewTime INT,
  userid BIGINT,
  page_url STRING,
  referrer_url STRING,
  ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'   -- this is the delimiter: what separates each column within a row
STORED AS SEQUENCEFILE;
We create a table
In Hive
create table t_emp(
  id int,
  name string,
  age int,
  dept_name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
We build a text data file in Linux
Emp.txt
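The file's contents are not shown in the original; a hypothetical emp.txt matching the t_emp schema (id, name, age, dept_name, comma-separated) and the statement to load it might look like:

```sql
-- Hypothetical contents of emp.txt (one row per line, fields comma-separated):
--   1,zhangsan,25,sales
--   2,lisi,30,engineering
--   3,wangwu,28,sales

-- Load the local file into the table; the path is an assumption.
LOAD DATA LOCAL INPATH '/root/emp.txt' INTO TABLE t_emp;
```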
Import data
Loading files into tables
Hive does not do any transformation while loading data into tables. Load operations are currently pure copy/move operations that move datafiles into locations corresponding to Hive tables.
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];
Hive finds out where Hadoop is through our environment variables, then connects to Hadoop and creates a working directory for Hive on HDFS, under /user/hive.
We can look it up on HDFS, under the hive directory.
select count(*) from t_emp;
Hive can also use various collection types
int
string
array<string>
map<string, string>
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':';
Data format
1,zhangsan,sports_books_tv,sex:male_color:red
Loading files
load data local inpath 'root/data.exe' into table t_person;
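Putting the pieces together, a sketch of the collection-type table and queries on it; the column names (id, name, likes, address) are assumptions, since the original only lists the types and delimiters:

```sql
-- Hypothetical column names for the int / string / array<string> / map<string,string> schema.
create table t_person (
  id int,
  name string,
  likes array<string>,               -- e.g. sports_books_tv
  address map<string, string>)       -- e.g. sex:male_color:red
row format delimited
fields terminated by ','             -- separates top-level columns
collection items terminated by '_'   -- separates array elements and map entries
map keys terminated by ':';          -- separates a map key from its value

-- Indexing into the collections after loading:
select name, likes[0], address['sex'] from t_person;
```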
Hive has some metadata to save when it is running. The default is persisted to the DBMS.
Learning DML Statements
Import data
Loading files into tables
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];
Create a partitioned table
A partition is actually a folder: the table name is the folder name, and each partition is a different subfolder under it. Data can be partitioned by time, location, and so on. For example, one partition per day keeps each day's data separate, and one partition per city stores each city's data separately. When querying, you can then target a specific partition's data by writing a condition such as where pt=2010_08_23.
create table sxtstu (id int, sname string, city string)
partitioned by (ds string)
row format delimited fields terminated by ','
stored as textfile;
When we save the data,
load data local inpath 'sxtstu.txt' overwrite into table sxtstu partition (ds='2013-07-09');
Copying data from file:/home/hadoop/sxtstu.txt
Copying file: file:/home/hadoop/sxtstu.txt
Loading data to table default.sxtstu partition (ds=2013-07-09)
OK
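Once loaded, the partition can be inspected and queried; a sketch:

```sql
-- List the table's partitions; after the load above this should include ds=2013-07-09.
show partitions sxtstu;

-- Partition pruning: only the ds=2013-07-09 folder is read for this query.
select * from sxtstu where ds = '2013-07-09';
```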
We try to create a table
create table dept_count(
  dname string,
  num int);

insert into table dept_count
select dept_name, count(1) from t_emp group by dept_name;
About partitioning:
create table dept_count(
  num int)
partitioned by (dname string);

insert into table dept_count partition (dname='销售部')
select count(1) from t_emp where dept_name='销售部' group by dept_name;
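Writing one insert per department is tedious; Hive's dynamic partitioning can instead fill the dname partition column from the query itself. A sketch (the two set commands are standard Hive settings):

```sql
-- Allow dynamic partitions without requiring at least one static partition column.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The last select column (dept_name) feeds the dname partition column,
-- creating one partition per department in a single statement.
insert into table dept_count partition (dname)
select count(1), dept_name from t_emp group by dept_name;
```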
Some cases:
CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2))
  CLUSTERED BY (age) INTO 2 BUCKETS STORED AS ORC;

INSERT INTO TABLE students
  VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);

CREATE TABLE pageviews (userid VARCHAR(64), link STRING, came_from STRING)
  PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC;

INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')
  VALUES ('jsmith', 'mail.com', 'sports.com'), ('jdoe', 'mail.com', null);

INSERT INTO TABLE pageviews PARTITION (datestamp)
  VALUES ('tjohnson', 'sports.com', 'finance.com', '2014-09-23'), ('tlee', 'finance.com', null, '2014-09-21');
About Import and export
EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])]
  TO 'export_target_path';

IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]]
  FROM 'source_path'
  [LOCATION 'import_target_path'];
Export statements
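A sketch of exporting a table and importing it back under a new name; the HDFS path and the new table name are assumptions:

```sql
-- Export the table's data and metadata to an HDFS directory.
export table t_emp to '/tmp/t_emp_export';

-- Re-import it (here, or on another cluster) under a new name.
import table t_emp_copy from '/tmp/t_emp_export';
```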
Learning Data Query Statements
Similar SQL statements
create table t_stu (userid int, name string, age int, sex int, classid int)
row format delimited fields terminated by ','
stored as textfile;

create table t_class (cid int, name string, teacher string)
row format delimited fields terminated by ','
stored as textfile;

load data inpath '/pub/student.txt' into table t_stu;
1,zs,32,2,2
2,lis,23,1,2
3,ww,21,1,1
select s.*, c.name from t_stu s join t_class c on s.classid=c.cid;
Copyright notice: this is the blogger's original article; please do not reproduce it without the blogger's permission.
"Hadoop" 16, learning hive operation statements