Hadoop Learning Notes--11.HIVEDDL operations

Last Update:2017-08-10 Source: Internet

Author: User

Keywords According to the database DDL operation hive data type number HIVEDDL operation Hadoop learning notes

Tags array available in basic comment create creating data data type

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Data types of Hive

1. Basic types
Integer type: Tinyint,smallint,int,bigint, the above types correspond to Byte,short,int,long in Java respectively.
Decimal type: float,double
Boolean Type: Boolean
String type: String,varchar,char
2. Complex types
Array,map,struct,timestamp (DATE)
Commonly used is int, string,float.
　

II. Database DDL Operations

1. Create a database

use database_name;

III. DDL operations for data tables

1. Create a datasheet

CREATE [temporary] [EXTERNAL] TABLE [IF not EXISTS] [db_name.] TABLE_NAME-(Note:temporary available in Hive 0.14.0 and later) [Col_name data_type [COMMENT col_comment], ... [Constraint_specification]] [COMMENT Table_comment] [Partitioned by (Col_name data_type [COMMENT col_comment], ...)] [CLUSTERED by (Col_name, col_name, ...) [SORTED by (Col_name [asc| DESC], ...) into num_buckets buckets] [skewed by (Col_name, Col_name, ...)-(note:available in Hive 0.10.0 and later)] on (Col_value, Col_value, ...), (Col_value, Col_value, ...), ...) [STORED as directories] [[ROW FORMAT Row_format] [STORED as File_format] | STORED by ' Storage.handler.class.name ' [with serdeproperties (...)]--(note:available in Hive 0.6.0 and later)] [LOCATION Hdfs_path] [Tblproperties (Property_name=property_value, ...)]--(note:available in Hive 0.6.0 and later) [as Select_ Statement]; --(Note:available in Hive 0.5.0 and later; not keyword for external tables)

This is the format for creating the data table given by the official document. No more explanations.
Explain a few key words:

Create an internal table: Hive will move the data file to its managed Data Warehouse directory
Create an external table: you need to add external keywords. Does not move the file, the main difference with the internal datasheet is reflected in loading data and deleting the datasheet.
Create partition: Need to add partitioned by (keyword)
Create bucket: Need to add clustered by (keyword)
Specify the split format for the column: row format delimited [FIELDS terminated by Char [escaped by Char]]
Specify the split format for the row: COLLECTION ITEMS terminated by ' \ n '
Specifies the stored file format: STORED as File_format
Specify where the table is stored: LOCATION hdfs_path

Three ways to create a table:
1. Create a new table normally.
2. Create a Table child table (AS)

Select several of these as the structure of the new table create table IF not EXISTS Default.bf_log_20170727_sa as select Ip,user from default.bf_log_20170727;

3. Copy a table format without table data (like)

CREATE TABLE IF not EXISTS default.bf_log_20170728 like default.bf_log_20170727;

The difference between an internal table and an external table:
① The internal table creation process and the data loading process (which can be done in the same statement), the actual data is moved to the Data Warehouse directory during the loading of the data, and the access to the data will be done directly in the Data Warehouse directory. When you delete a table, the data and metadata in the table are deleted at the same time;
② external tables have only one procedure, loading data and creating tables are done at the same time, not moving into the Data Warehouse directory, but creating a link with external data. Deletes only the link when an external table is deleted;

2. Delete Datasheet

DROP TABLE [IF EXISTS] table_name;

3. Empty datasheet
Do not delete the table's structure, only empty the table's data

TRUNCATE TABLE table_name [PARTITION PARTITION_SPEC]; Partition_spec: (Partition_column = partition_col_value, Partition_column = Partition_col_value, ...)

4. Modify Datasheet
(1). Datasheet rename (rename to)

ALTER TABLE table_name RENAME to new_table_name;

(2). Modify column name (change)
　

ALTER TABLE table_name [PARTITION partition_spec] Change [COLUMN] col_old_name col_new_name column_type [COMMENT Col_ Comment] [first| After column_name] [cascade| RESTRICT];

(3). Add or Remove Columns (add/replace)

ALTER TABLE table_name [PARTITION Partition_spec]--(Note:hive 0.14.0 and later) add| REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...) [cascade| RESTRICT]--(Note:hive 1.1.0 and later)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More