Data types of Hive
1. Basic types
Integer type: Tinyint,smallint,int,bigint, the above types correspond to Byte,short,int,long in Java respectively.
Decimal type: float,double
Boolean Type: Boolean
String type: String,varchar,char
2. Complex types
Array,map,struct,timestamp (DATE)
Commonly used is int, string,float.
II. Database DDL Operations
1. Create a database
use database_name;
III. DDL operations for data tables
1. Create a datasheet
CREATE [temporary] [EXTERNAL] TABLE [IF not EXISTS] [db_name.] TABLE_NAME-(Note:temporary available in Hive 0.14.0 and later) [Col_name data_type [COMMENT col_comment], ... [Constraint_specification]] [COMMENT Table_comment] [Partitioned by (Col_name data_type [COMMENT col_comment], ...)] [CLUSTERED by (Col_name, col_name, ...) [SORTED by (Col_name [asc| DESC], ...) into num_buckets buckets] [skewed by (Col_name, Col_name, ...)-(note:available in Hive 0.10.0 and later)] on (Col_value, Col_value, ...), (Col_value, Col_value, ...), ...) [STORED as directories] [[ROW FORMAT Row_format] [STORED as File_format] | STORED by ' Storage.handler.class.name ' [with serdeproperties (...)]--(note:available in Hive 0.6.0 and later)] [LOCATION Hdfs_path] [Tblproperties (Property_name=property_value, ...)]--(note:available in Hive 0.6.0 and later) [as Select_ Statement]; --(Note:available in Hive 0.5.0 and later; not keyword for external tables)
This is the format for creating the data table given by the official document. No more explanations.
Explain a few key words:
Create an internal table: Hive will move the data file to its managed Data Warehouse directory
Create an external table: you need to add external keywords. Does not move the file, the main difference with the internal datasheet is reflected in loading data and deleting the datasheet.
Create partition: Need to add partitioned by (keyword)
Create bucket: Need to add clustered by (keyword)
Specify the split format for the column: row format delimited [FIELDS terminated by Char [escaped by Char]]
Specify the split format for the row: COLLECTION ITEMS terminated by ' \ n '
Specifies the stored file format: STORED as File_format
Specify where the table is stored: LOCATION hdfs_path
Three ways to create a table:
1. Create a new table normally.
2. Create a Table child table (AS)
Select several of these as the structure of the new table create table IF not EXISTS Default.bf_log_20170727_sa as select Ip,user from default.bf_log_20170727;
3. Copy a table format without table data (like)
CREATE TABLE IF not EXISTS default.bf_log_20170728 like default.bf_log_20170727;
The difference between an internal table and an external table:
① The internal table creation process and the data loading process (which can be done in the same statement), the actual data is moved to the Data Warehouse directory during the loading of the data, and the access to the data will be done directly in the Data Warehouse directory. When you delete a table, the data and metadata in the table are deleted at the same time;
② external tables have only one procedure, loading data and creating tables are done at the same time, not moving into the Data Warehouse directory, but creating a link with external data. Deletes only the link when an external table is deleted;
2. Delete Datasheet
DROP TABLE [IF EXISTS] table_name;
3. Empty datasheet
Do not delete the table's structure, only empty the table's data
TRUNCATE TABLE table_name [PARTITION PARTITION_SPEC]; Partition_spec: (Partition_column = partition_col_value, Partition_column = Partition_col_value, ...)
4. Modify Datasheet
(1). Datasheet rename (rename to)
ALTER TABLE table_name RENAME to new_table_name;
(2). Modify column name (change)
ALTER TABLE table_name [PARTITION partition_spec] Change [COLUMN] col_old_name col_new_name column_type [COMMENT Col_ Comment] [first| After column_name] [cascade| RESTRICT];
(3). Add or Remove Columns (add/replace)
ALTER TABLE table_name [PARTITION Partition_spec]--(Note:hive 0.14.0 and later) add| REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...) [cascade| RESTRICT]--(Note:hive 1.1.0 and later)