Create an external table with partitions
The advantage of an external table is that data already stored in HDFS can be mounted to the table at any time.
Partitioning narrows the range of data that a query has to scan.
The following example shows how to create an external table.
CREATE EXTERNAL TABLE my_daily_report (last_update STRING, col_a STRING, col_b STRING, col_c STRING, col_d STRING, col_e STRING, col_f STRING, col_g STRING, col_h STRING, col_i STRING, col_j STRING) PARTITIONED BY (par_dt STRING) LOCATION '/user/chenshu/data/daily';
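To confirm that the table was registered as external and points at the intended directory, one option is to inspect its metadata. This is a minimal sketch, assuming the table above was created in the current database:

```sql
-- Show the table's metadata, including its type and HDFS location.
DESCRIBE FORMATTED my_daily_report;
```

In the output, the Table Type field should read EXTERNAL_TABLE, and the Location field should end with the path given in the create statement.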
Mount a partition folder
ALTER TABLE my_daily_report ADD PARTITION (par_dt = '000000') LOCATION '/user/chenshu/data/daily/my_daily_report/100';
The preceding example uses only one partition column, but a table can have several. For example, if daily reports are partitioned by date, each date partition corresponds to a folder, and hour partitions can sit under that folder, with a separate subfolder storing the reports for each hour. In this case, the relationship between partitions mirrors the folder tree.
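The date-then-hour layout described above could be sketched as follows. The table name my_hourly_report, the par_hour column, the shortened column list, the partition values, and the paths under /user/chenshu/data/hourly are all hypothetical and chosen only for illustration:

```sql
-- Hypothetical table with two partition levels: date, then hour.
CREATE EXTERNAL TABLE my_hourly_report (last_update STRING, col_a STRING)
PARTITIONED BY (par_dt STRING, par_hour STRING)
LOCATION '/user/chenshu/data/hourly';

-- Each (date, hour) pair is mounted from its own subfolder.
ALTER TABLE my_hourly_report ADD PARTITION (par_dt = '20130101', par_hour = '00')
LOCATION '/user/chenshu/data/hourly/20130101/00';
ALTER TABLE my_hourly_report ADD PARTITION (par_dt = '20130101', par_hour = '01')
LOCATION '/user/chenshu/data/hourly/20130101/01';
```

A query that filters on both par_dt and par_hour then only needs to read a single hour's subfolder.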
Delete partition
Of course, you also need a way to delete the par_dt = '000000' partition:
ALTER TABLE my_daily_report DROP PARTITION (par_dt = '000000');
For a managed (internal) table, DROP PARTITION deletes both the partition metadata and the partition data. For an external table such as this one, it deletes only the partition metadata and does not delete the data in HDFS.
Note: Hive has no general-purpose DELETE FROM statement (later versions support DELETE only on transactional tables). If you want to remove all the rows in a partition from the table, you can use DROP PARTITION here.
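A slightly more defensive variant of the drop, followed by a check of what is still registered, might look like the sketch below. It assumes the my_daily_report table and the par_dt = '000000' partition from the earlier examples:

```sql
-- IF EXISTS states explicitly that a missing partition is acceptable.
ALTER TABLE my_daily_report DROP IF EXISTS PARTITION (par_dt = '000000');

-- List the partitions that remain registered for the table.
SHOW PARTITIONS my_daily_report;
```

Because the table is external, the files under the partition's location are expected to remain in HDFS after the drop, so the same folder can be mounted again later with ADD PARTITION.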
Query by partition
Now that the table is partitioned, queries run faster when the where clause specifies the partition as a condition, because Hive reads only the matching partition folder.
SELECT COUNT(*) FROM my_daily_report WHERE par_dt = '000000';
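One way to check that a query really is limited to the intended partition is to ask Hive which inputs it would read. The following is a sketch, assuming a Hive version that supports EXPLAIN DEPENDENCY:

```sql
-- Report the tables and partitions the query would read, without running it.
EXPLAIN DEPENDENCY
SELECT COUNT(*) FROM my_daily_report WHERE par_dt = '000000';
```

If partition pruning applies, the input_partitions list in the JSON output should contain only the par_dt=000000 partition.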
Recommended articles:
http://my.oschina.net/leejun2005/blog/82065
Hive creates external tables and partitions