DML in Hive mainly operates on the data in Hive tables. Because of the characteristics of Hadoop (HDFS files are written once and appended, not edited in place), modifying or deleting a single row performs very poorly, so Hive does not support row-level operations.
This section describes the most common ways to bulk-insert data:
1. Loading data from a file
Syntax: LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
Cases:
LOAD DATA LOCAL INPATH '/opt/data.txt' INTO TABLE table1; -- if the file is stored in HDFS, omit the LOCAL keyword
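As a sketch of how the optional clauses combine (the file path and partition value here are assumed for illustration), a load can also replace the existing contents of one partition of a partitioned table:

```sql
-- Hypothetical example: overwrite a single partition of page_view
-- with a file taken from the local filesystem.
LOAD DATA LOCAL INPATH '/opt/logs-2008-06-08.txt'
OVERWRITE INTO TABLE page_view
PARTITION (dt='2008-06-08');
```

With OVERWRITE, any data previously in that partition is removed first; without it, the file is simply added alongside the existing data.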
2. Inserting data from other tables
Syntax:
Standard syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
Hive extension (multiple inserts):
FROM from_statement
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1
[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2]
[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2] ...;
FROM from_statement
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1
[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2]
[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2] ...;
Hive extension (dynamic partition inserts):
INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
Cases:
FROM page_view_stg pvs
INSERT OVERWRITE TABLE page_view PARTITION (dt='2008-06-08', country)
SELECT null, null, pvs.ip, pvs.cnt; -- country is a dynamic partition column, filled from the last SELECT column
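The multi-insert extension above can be sketched as follows; the second target table, page_view_backup, is a hypothetical name used only for illustration. The staging table is scanned once and both inserts are fed from that single scan:

```sql
-- Hypothetical multi-insert: one scan of page_view_stg feeds two tables.
FROM page_view_stg pvs
INSERT OVERWRITE TABLE page_view PARTITION (dt='2008-06-08', country)
  SELECT null, null, pvs.ip, pvs.cnt
INSERT INTO TABLE page_view_backup   -- hypothetical target table
  SELECT pvs.ip, pvs.cnt;
```

Because the source is read only once, this form avoids running a separate MapReduce scan per target table.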