Hive Learning Modify tables, partitions, columns

Source: Internet
Author: User
Tags hadoop fs
The statement that modifies the table allows you to change the structure of an existing table by adding columns/partitions, modifying Serde, adding properties to tables and Serde, or renaming tables. Similarly, modifying a partition's statement can change the properties of the specified partition. Renaming a table

The statements for renaming a table are as follows:

ALTER TABLE table_name RENAME to New_table_name

Modify Table Properties

The statements that modify table properties are as follows:

ALTER TABLE table_name SET tblproperties (property_name = property_value, property_name = Property_value,...)

Use this statement to increase the metadata of the table, Last_modified_by, Last_modified_time properties are automatically added and managed, and you can query the new table properties using describe EXTENDED table_name. Modify a table comment

To modify a table's comments, you only need to specify Property_name as the ' comment ' property using the Modified Table Property statement described above:

ALTER TABLE table_name SET tblproperties (' comment ' = new_comment);

Add Serde Property

The statement to increase the Serde property is as follows:

ALTER TABLE TABLE_NAME SET SERDE serde_class_name [with Serdeproperties serde_properties]
ALTER TABLE table_name SET S Erdeproperties
(property_name = Property_value,property_name = Property_value, ...)

This statement allows the addition of custom metadata to the Serde object. The Serde property is passed to the table Serde when Serde is initialized by hive. Modifying the storage properties of a table

ALTER TABLE table_name Clusteredby (col_name, col_name, ...) [SORTED by (Col_name, ...)] into Num_buckets buckets

This statement alters the physical storage properties of the table.

Note that when the statement above modifies the table's hive metadata only, does not reorganize or reformat the existing data, the user needs to determine the actual data layout that conforms to the metadata definition. New Partition

The statements for the new partition are:

ALTER tabletable_name ADD [IF not EXISTS] PARTITION partition_spec[location ' location1 '] partition_spec [location ' Locati On2 ']
 
... Partition_spec:
  :(partition_col = partition_col_value, Partition_col = Partiton_col_value, ...)

In particular, the following example will fail with no error, regardless of which partition is specified, all queries will be executed on partition dt= ' 2008-08-08 ':

ALTER TABLE page_view ADD PARTITION (dt= ' 2008-08-08 ', country= ' us ') location
'/path/to/us/part080808 ' PARTITION (dt = ' 2008-08-09 ', country= ' us ') location '/path/to/us/part080809 ';

Assuming that the table cannot have a partition and that a new partition is being performed, the error message is indicated:

hive> alter tabletable_properties Add partition (address= ' China ');
Failed:semanticexception table isn't partitioned but partition spec Exists:{address=china}

This is because when you create a new table, the partition column address is not created, so you will only succeed if you perform an additional partition on a table that has a partitioned column. Renaming Partitions

The statements for renaming a partition are as follows:

ALTER TABLE table_name PARTITION partition_spec RENAME to PARTITION Partition_spec;
Partition_spec:
  :(partition_col = partition_col_value, Partition_col = Partiton_col_value, ...)

The sample code is as follows:

Hive> showpartitions people;
OK
department=1/sex=0/howold=23 time
taken:0.349 seconds, fetched:1 row (s)
hive> altertable People Partition (department= ' 1 ', sex= ' 0 ', howold=23) Rename to partition (department= ' 2 ', sex= ' 1 ', howold=24);         
OK time
taken:2.005 seconds
hive> showpartitions people;
OK
department=2/sex=1/howold=24 time
taken:0.271 seconds, fetched:1 row (s)

Swap partition

The statements for the swap partition are as follows:

ALTER tabletable_name_1 EXCHANGE PARTITION (partition_spec) with TABLE table_name_2;

This statement allows the data in one partition to be moved to another table that has the same schema but does not have that partition. recovery partition (Msckrepair TABLE)

Hive stores a partitioned list for each table in the meta-store, but if the new partition is added directly to HDFs (using Hadoop fs–put), Hive will not know these partitions unless alter TABLETABLE_NAME add is executed on each newly added partition Partition command. To avoid repeating the above command, you can use the following command:

MSCK REPAIR TABLE table_name;

The statement adds the partitions that exist on HDFs but not on the meta-store to the meta-store, as shown in the following example:

Hive> dfs-mkdir/user/hive/warehouse/learning.db/people/department=1/sex=0/howold=23;
Hive> show partitions people;
OK
department=2/sex=1/howold=24 time
taken:0.192 seconds, fetched:1 row (s)
hive> msck Repair Table People;
OK
partitions notin metastore:     people:department=1/sex=0/howold=23
repair:addedpartition to Metastore people:department=1/sex=0/howold=23 time
taken:0.943 seconds, Fetched:2 row (s)
hive> show Partitions People;
OK
department=1/sex=0/howold=23
department=2/sex=1/howold=24 time
taken:0.397 seconds, Fetched:2 Row (s)

Delete Partition

The statement to delete the partition is:

ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec,partition partition_spec,...

You can use the preceding statement to delete a partition of a table that will delete the data and metadata for the specified partition. For tables subject to No DROP cascade, you can use ignore protection to delete the specified partition or a component area, as follows:

ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec IGNORE PROTECTION;

An example of deleting a partition is as follows, as you can see from this example that the error message is not prompted when deleting a partition that does not exist.

Hive> show partitions people;
OK
department=1/sex=0/howold=23
department=2/sex=1/howold=24 time
taken:0.397 seconds, Fetched:2 Row (s)
hive> ALTER TABLE people drop partition (department= ' 2 ', sex= ' 2 ', howold=24);
OK time
taken:1.596 seconds
hive> Show partitions people;
OK
department=1/sex=0/howold=23
department=2/sex=1/howold=24 time
taken:0.227 seconds, Fetched:2 Row (s)
hive> ALTER TABLE people drop partition (department= ' 2 ', sex= ' 1 ', howold=24);
Dropped the partition department=2/sex=1/howold=24
OK time
taken:2.267 seconds
hive> Show partitions people;                                              
OK
department=1/sex=0/howold=23 time
taken:0.191 seconds, fetched:1 row (s)

solution/Archive partition

ALTER TABLE table_name ARCHIVE PARTITION partition_spec;
ALTER TABLE table_name unarchive PARTITION Partition_spec;

Archive in hive moves files in the partition into Hadoop archive (HAR), which only reduces the number of files but does not provide compression. Modify the file format of a table/partition

ALTER TABLE table_name [PARTITION Partitionspec] SET fileformat File_format

You can use the above statement to modify the file format of a table or partition, and the file formats supported by Hive are: Sequencefile, Textfile, Rcfile, orc, and InputFormat Input_format_ Classnameoutputformat Output_format_classname, the default file format is textfile, specified by the configuration parameters Hive.default.fileformat. Textfile refers to storing data in a plain text file, using Sequencefile when the data needs to be compressed, using InputFormat and OutputFormat to specify the class name of the input format and output format, such as ' org '. Apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat '. Modify the file location of a table/partition

ALTER TABLE table_name [PARTITIONPARTITIONSPEC] SET location "NewLocation"

Modify the touch of a table/partition

Touch read meta data and write back. This can trigger the execution of the predecessor or the latter hook, assuming there is a record table or a partition modified hook and an external script that directly modifies the file on HDFs. Because the external script modifies the file outside of hive, the modification is not recorded by the hook, which is an external script that can invoke touch to trigger the hook and then mark the table or partition as modified. The touch statement that modifies the table or partition is as follows:

ALTER TABLE table_name TOUCH [PARTITION Partitionspec];

Modify table/partition protection

Data protection can be set at the table level or at the partition level. Enabling No_drop will protect the table or partition from being deleted, and enabling offline will prevent the data in the table or partition from being queried, but metadata can still be accessed. If any partition in the table has No_drop enabled, the table cannot be deleted. The statements that modify the table or partition protection are as follows:

ALTER TABLE table_name [PARTITION Partition_spec] enable| DISABLE No_drop;
ALTER TABLE table_name [PARTITION Partition_spec] enable| DISABLE OFFLINE;

The demo code and the results are as follows:

Hive> ALTER TABLE IIS enable No_drop;
OK time
taken:0.792seconds
hive> drop Tableiis;
Failed:executionerror, return code 1 from Org.apache.hadoop.hive.ql.exec.DDLTask. Table IIS isprotected from being dropped
hive> alter TABLEIIS enable offline;
OK time
taken:0.439seconds
hive> Select *from IIS;
failed:semanticexception [Error 10113]: Query against an offline table or partitiontable IIS
hive> alter TABLEPEOPL E partition (department= ' 1 ', sex= ' 0 ', howold=23) enable No_drop;
OK time
taken:1.23 seconds
hive> drop table people;
Failed:executionerror, return code 1 from Org.apache.hadoop.hive.ql.exec.DDLTask. Table peoplepartitiondepartment=1/sex=0/howold=23 is protected from being dropped

Modify Column name/type/location/comment

The following statement allows you to modify the column name, column type, column comment, and column position. The statement modifies the hive metadata only, does not touch the data in the table, and the user needs to determine the actual data layout that conforms to the metadata definition.

ALTER TABLE table_name Change [COLUMN] col_old_name col_new_name column_type [commentcol_comment] [First| ( After column_name)]

The sample code and results are as follows:

Hive> ALTER TABLE people change telephone mobile string comment ' Change column name ' first;                                                                    
OK time Taken:0.66seconds hive>describe people;                                       
OK Mobile String Changecolumn name name string                                       
Age int Birthday Date Address String Department Stri              ng Sex String howold int # partitioninformation # col_name Data_typ                   E Comment Department string sex                             String          
Howold int  

Add/Replace columns

Add or replace a column with the following statement, where add columns adds a new column after the existing column but before the partition column, REPLACE columns first deletes the existing column and then adds the new column. The replacement column can only be used when the table is using Serde (Dynamicserde,metadatatypedcolumnsetserde, Lazysimpleserde, and Columnarserde).

ALTER TABLE table_name add| REPLACE COLUMNS (col_name data_type[comment col_comment], ...)

REPLACE columns can also be used to delete columns, for example:

Hive> ALTER TABLE Test add columns (e int comment ' Add new column E ');
OK time Taken:0.395seconds hive> describetest;                                         
OK a int b int c int e int Ad          
d new column e D int # partitioninformation                                         
# col_name data_type comment D int Time Taken:0.209seconds, Fetched:10 row (s) hive> ALTER TABLE test Replace column
S (A1 string, B1 string, c1 string);
OK time taken:0.994seconds hive> describe test;                                       
OK A1 String B1 string               C1 string                        
d int # Partitioninforma tion # col_name data_type comment D i NT Time Taken:0.232seconds, Fetched:9 row (s)

It is important to note that the deletion modifies only the schema of the table, but does not delete the data.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.