Here list common operations, more refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create%2FDrop%2FTruncateTable
Simple Table Creation
Create Table table_name (ID int, dtdontquery string, name string)
Create a partitioned table
Create Table table_name (ID int, dtdontquery string, name string) partitioned by (date string)
A table can have one or more partitions. Each partition exists in a folder directory.
Partitions exist in the table structure in the form of fields. You can view the existence of fields through the describe table command, but this field does not store the actual data content, but only indicates the partition.
In Hive select queries, the entire table content is usually scanned, which consumes a lot of time to do unnecessary work. Sometimes you only need to scan a part of the data in the table, so the partition concept is introduced during table creation. A partition in a table corresponds to a directory in the Table. Partition is an auxiliary query that reduces the query scope, speeds up data retrieval, and manages data according to certain specifications and conditions.
Typical default table Creation
Create Table page_view (viewtime int, userid bigint, page_url string, referrer_url string, IP string comment 'IP address of the user ') comment 'this is the page view table' partitioned by (DT string, country string) row format delimited fields terminated by '\ 001' collection items terminated by' \ 002' map keys terminated by '\ 003' stored as textfile;
The page_view table is created here, with Table comments and a field IP comments. The partition has two columns: DT and country.
The [row format delimited] keyword is used to set the column delimiter supported by the created table when loading data. Different columns are separated by one '\ 001'. Elements of a set (such as array and map) are separated by' \ 002'. Keys and values in map are separated by '\ 002.
The [stored as file_format] keyword is used to set the data type for loading data. The default value is textfile. If the file data is plain text, [stored as textfile] is used. then, the data is directly copied from the local machine to HDFS, and hive can directly identify the data.
Common table Creation
Create TableLogin(Userid bigint, IP string,TimeBigint) partitioned by (DT string) Row format delimited fields terminated'\ T'Stored as textfile;
Create external table
If the data already exists in '/ user/hadoop/warehouse/page_view' of HDFS, if you want to create a table and point to this path, you need to create an external table:
Create external table page_view (viewtime int, userid bigint, page_url string, referrer_url string, IP string comment 'IP address of the user', country string comment 'country of origination ') comment 'this is the staging page view table' row format delimited fields terminated by '\ 054' stored as textfile location'/user/hadoop/warehouse/page_view ';
When you create a table, the specified external is an external table, but not an internal table. When you drop an internal table, the data is deleted from HDFS, while the external table is not deleted.
The External table can have partitions like the internal table. If a partition is specified, you must modify the table to add partitions after the External table is created.
If a partition exists in an external table, data can be loaded to overwrite the partition data. However, if a partition is deleted from an external table, the data in the corresponding partition is not deleted from HDFS, but the partition data is deleted from the internal table.
Create a table for the specified database
If no database is specified, hive creates the table under the default database. Assume that there is a hive database mydb, and you need to create the table to mydb, as shown below:
Create Table mydb. Pokes (FOO int, bar string); or use mydb; -- direct the current database to mydbcreate table pokes (FOO int, bar string );
Copy table structure
Create Table empty_table_name like table_name;
Create an empty table empty_table_name Based on table_name. empty_table_name has no data.
Create-table-as-selectt (CTAs)
The table created by CTAs is atomic. This means that other users can see the complete query result table until all the query results are complete.
The unique limitation of CTAs is the target table. It cannot be a partitioned table or an external table.
Simple Method
Create Table new_key_value_store
Select (Key % 1024) new_key, Concat (Key, value) key_value_pair from key_value_store;
Complex methods
Create Table new_key_value_store row format serde "org. apache. hadoop. hive. serde2.columnar. columnarserde "stored as rcfile asselect (Key % 1024) new_key, Concat (Key, value) key_value_pairfrom key_value_storesort by new_key, key_value_pair;
Delete table
Drop table table_name; drop table if exists table_name;
Deleting a table removes the metadata and data of the table, and the data on HDFS is moved to the. Trash/current directory if trash is configured.
When an External table is deleted, the data in the table is not deleted.
Truncation table
Truncate table table_name; truncate table table_name partition (Dt = '20140901 ');
Deleting all rows from a table or table partition. If no partition is specified, all partitions in the table are truncated. You can also specify multiple partitions at a time to cut off multiple partitions.