Learning DDL Statements
Statements for creating objects
Create/drop/alter Database
Create Database
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment]
  [LOCATION hdfs_path]
  [WITH DBPROPERTIES (property_name=property_value, ...)];
Drop Database
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
Alter Database
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;
Use Database
USE database_name;
USE DEFAULT;
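A minimal sketch putting the statements above together; the database name mydb is hypothetical:

```sql
-- Create a database if it does not already exist, with a comment.
CREATE DATABASE IF NOT EXISTS mydb COMMENT 'demo database';

-- Switch to it, so unqualified table names resolve against mydb.
USE mydb;

-- Switch back to the default database, then drop mydb.
-- CASCADE also drops any tables the database still contains.
USE DEFAULT;
DROP DATABASE IF EXISTS mydb CASCADE;
```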
When Hive is running, its metadata is stored in a relational database, because Hive needs this mapping data to be read quickly.
Hive ships with an embedded relational database (Derby), but it is unreliable for real use, so we don't use it.
Instead, we set up a relational database ourselves.
Install a relational database (MySQL)
We had MySQL installed when we installed Linux.
Start MySQL
Check to see if MySQL is listening
Port 3306, yes.
Connect to MySQL
Connections are restricted by default, and for Hive to talk to MySQL we also need the JDBC driver (mysql-connector-java) on Hive's classpath.
Set up remote login issues in MySQL
Enter:
use mysql;
select * from user;
GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY '123456';
This grants all privileges on all tables in all databases to the root user connecting from any host; the password is 123456.
Check to see if it's successful
Modifying Hive's configuration file
Configure the MySQL connection URL
Modify the user name and password
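The settings above go in Hive's configuration file (typically hive-site.xml). A sketch, assuming MySQL runs locally on port 3306 with the root/123456 credentials from above and a metastore database named hive:

```xml
<!-- Metastore connection URL: the MySQL host, port, and database are assumptions. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<!-- JDBC driver class from mysql-connector-java. -->
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<!-- The MySQL user name and password granted above. -->
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>
```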
We create a database named hive in MySQL.
Enter Hive
Launch Hive; once it is up, exit with
quit;
Then view the newly created metastore tables in MySQL
Exit
Learn the DDL statements for Hive
Create Table
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name    -- (Note: TEMPORARY available in Hive 0.14.0 and later)
  [(col_name data_type [COMMENT col_comment], ...)]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...)                 -- (Note: available in Hive 0.10.0 and later)
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]]
  [
   [ROW FORMAT row_format]
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: available in Hive 0.6.0 and later)
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]  -- (Note: available in Hive 0.6.0 and later)
  [AS select_statement];                               -- (Note: available in Hive 0.5.0 and later; not supported for external tables)

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  LIKE existing_table_or_view_name
  [LOCATION hdfs_path];

data_type
  : primitive_type
  | array_type
  | map_type
  | struct_type
  | union_type   -- (Note: available in Hive 0.7.0 and later)

primitive_type
  : TINYINT
  | SMALLINT
  | INT
  | BIGINT
  | BOOLEAN
  | FLOAT
  | DOUBLE
  | STRING
  | BINARY       -- (Note: available in Hive 0.8.0 and later)
  | TIMESTAMP    -- (Note: available in Hive 0.8.0 and later)
  | DECIMAL      -- (Note: available in Hive 0.11.0 and later)
  | DECIMAL(precision, scale)  -- (Note: available in Hive 0.13.0 and later)
  | DATE         -- (Note: available in Hive 0.12.0 and later)
  | VARCHAR      -- (Note: available in Hive 0.12.0 and later)
  | CHAR         -- (Note: available in Hive 0.13.0 and later)

array_type
  : ARRAY < data_type >

map_type
  : MAP < primitive_type, data_type >

struct_type
  : STRUCT < col_name : data_type [COMMENT col_comment], ... >

union_type
  : UNIONTYPE < data_type, data_type, ... >  -- (Note: available in Hive 0.7.0 and later)

row_format
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]]
        [COLLECTION ITEMS TERMINATED BY char]
        [MAP KEYS TERMINATED BY char]
        [LINES TERMINATED BY char]
        [NULL DEFINED AS char]   -- (Note: available in Hive 0.13 and later)
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]

file_format
  : SEQUENCEFILE
  | TEXTFILE     -- (Default, depending on hive.default.fileformat configuration)
  | RCFILE       -- (Note: available in Hive 0.6.0 and later)
  | ORC          -- (Note: available in Hive 0.11.0 and later)
  | PARQUET      -- (Note: available in Hive 0.13.0 and later)
  | AVRO         -- (Note: available in Hive 0.14.0 and later)
  | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
Example:
create table table_name (
  id                int,
  dtDontQuery       string,
  name              string
)
partitioned by (date string);
An example
CREATE TABLE page_view(
  viewTime INT,
  userid BIGINT,
  page_url STRING,
  referrer_url STRING,
  ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'   -- this is the delimiter: what separates each column within a row
STORED AS SEQUENCEFILE;
We create a table
In Hive
create table t_emp(
  id int,
  name string,
  age int,
  dept_name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
We build a text data file in Linux
Emp.txt
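The file's contents are not shown in the original; a hypothetical emp.txt matching the t_emp schema (id, name, age, dept_name, comma-separated) and the statement to load it might look like:

```sql
-- Hypothetical contents of emp.txt (one row per line, fields comma-separated):
--   1,zhangsan,25,sales
--   2,lisi,30,engineering
--   3,wangwu,28,sales

-- Load the local file into the table; the path is an assumption.
LOAD DATA LOCAL INPATH '/root/emp.txt' INTO TABLE t_emp;
```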
Import data
Loading files into tables
Hive does not do any transformation while loading data into tables. Load operations are currently pure copy/move operations that move datafiles into locations corresponding to Hive tables.
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];
Hive finds out where Hadoop is through our environment variables, then connects to Hadoop and creates a working directory for Hive on HDFS, under /user/hive.
We can look it up on HDFS, under the hive directory.
select count(*) from t_emp;
Hive can also use various collection types
int
string
array<string>
map<string, string>
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':';
Data format
1,zhangsan,sports_books_tv,sex:male_color:red
Loading files
load data local inpath 'root/data.exe' into table t_person;
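Putting the pieces together, a sketch of the collection-type table and queries on it; the column names (id, name, likes, address) are assumptions, since the original only lists the types and delimiters:

```sql
-- Hypothetical column names for the int / string / array<string> / map<string,string> schema.
create table t_person (
  id int,
  name string,
  likes array<string>,               -- e.g. sports_books_tv
  address map<string, string>)       -- e.g. sex:male_color:red
row format delimited
fields terminated by ','             -- separates top-level columns
collection items terminated by '_'   -- separates array elements and map entries
map keys terminated by ':';          -- separates a map key from its value

-- Indexing into the collections after loading:
select name, likes[0], address['sex'] from t_person;
```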
Hive has some metadata to save when it is running. The default is persisted to the DBMS.
Learning DML Statements
Import data
Loading files into tables
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)];
Create a partitioned table
A partition is actually a folder: the table name is the folder name, and each partition is a different subfolder under it. Data can be partitioned by time, location, and so on. For example, one partition per day keeps each day's data separate, and one partition per city stores each city's data separately. When querying, you can then target a specific partition's data by writing a condition such as where pt=2010_08_23.
create table sxtstu (id int, sname string, city string)
partitioned by (ds string)
row format delimited fields terminated by ','
stored as textfile;
When we save the data,
load data local inpath 'sxtstu.txt' overwrite into table sxtstu partition (ds='2013-07-09');
Copying data from file:/home/hadoop/sxtstu.txt
Copying file: file:/home/hadoop/sxtstu.txt
Loading data to table default.sxtstu partition (ds=2013-07-09)
OK
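Once loaded, the partition can be inspected and queried; a sketch:

```sql
-- List the table's partitions; after the load above this should include ds=2013-07-09.
show partitions sxtstu;

-- Partition pruning: only the ds=2013-07-09 folder is read for this query.
select * from sxtstu where ds = '2013-07-09';
```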
We try to create a table
create table dept_count(
  dname string,
  num int);

insert into table dept_count
select dept_name, count(1) from t_emp group by dept_name;
About partitioning:
create table dept_count(
  num int)
partitioned by (dname string);

insert into table dept_count partition (dname='销售部')
select count(1) from t_emp where dept_name='销售部' group by dept_name;
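Writing one insert per department is tedious; Hive's dynamic partitioning can instead fill the dname partition column from the query itself. A sketch (the two set commands are standard Hive settings):

```sql
-- Allow dynamic partitions without requiring at least one static partition column.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The last select column (dept_name) feeds the dname partition column,
-- creating one partition per department in a single statement.
insert into table dept_count partition (dname)
select count(1), dept_name from t_emp group by dept_name;
```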
Some cases:
CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2))
  CLUSTERED BY (age) INTO 2 BUCKETS STORED AS ORC;

INSERT INTO TABLE students
  VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);

CREATE TABLE pageviews (userid VARCHAR(64), link STRING, came_from STRING)
  PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC;

INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')
  VALUES ('jsmith', 'mail.com', 'sports.com'), ('jdoe', 'mail.com', null);

INSERT INTO TABLE pageviews PARTITION (datestamp)
  VALUES ('tjohnson', 'sports.com', 'finance.com', '2014-09-23'), ('tlee', 'finance.com', null, '2014-09-21');
About Import and export
EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])]
  TO 'export_target_path';

IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]]
  FROM 'source_path'
  [LOCATION 'import_target_path'];
Export statements
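A sketch of exporting a table and importing it back under a new name; the HDFS path and the new table name are assumptions:

```sql
-- Export the table's data and metadata to an HDFS directory.
export table t_emp to '/tmp/t_emp_export';

-- Re-import it (here, or on another cluster) under a new name.
import table t_emp_copy from '/tmp/t_emp_export';
```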
Learning Data Query Statements
Similar SQL statements
create table t_stu (userid int, name string, age int, sex int, classid int)
row format delimited fields terminated by ','
stored as textfile;

create table t_class (cid int, name string, teacher string)
row format delimited fields terminated by ','
stored as textfile;

load data inpath '/pub/student.txt' into table t_stu;
1,zs,32,2,2
2,lis,23,1,2
3,ww,21,1,1
select s.*, c.name from t_stu s join t_class c on s.classid=c.cid;
Copyright notice: this is the blogger's original article; please do not reproduce it without the blogger's permission.
"Hadoop" 16, learning hive operation statements