The EXTERNAL keyword lets the user create an external table and, at the same time, specify a path (LOCATION) that points to the actual data. When Hive creates an internal (managed) table, it moves the data into the path that the data warehouse points to. When it creates an external table, it only records the path where the data is located and does not change the location of the data. When you drop a table, the metadata and the data of an internal table are deleted together. For an external table, only the metadata is removed and the data is not deleted.
1. LIKE allows the user to copy the structure of an existing table, but it does not copy the data
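As a minimal sketch of LIKE, assuming an existing table called src_table (a hypothetical name, as is src_table_copy), the statement below creates an empty table with the same column definitions:

```sql
-- src_table and src_table_copy are hypothetical names used for illustration.
-- Only the table definition is copied; no rows are copied.
CREATE TABLE src_table_copy LIKE src_table;
```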
2. Using RegexSerDe in Hive
RegexSerDe is a serializer/deserializer (SerDe) that ships with Hive. It is mainly used to parse rows with a regular expression.
CREATE TABLE test_serde (
c0 STRING,
c1 STRING,
c2 STRING)
ROW FORMAT
SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES
('input.regex' = '([^ ]*) ([^ ]*) ([^ ]*)',
'output.format.string' = '%1$s %2$s %3$s')
STORED AS TEXTFILE;
3. Table and column names are case-insensitive
4. Create an external table and specify the data storage path
CREATE EXTERNAL TABLE exter_trl (
id INT,
name STRING,
age INT,
tel STRING)
LOCATION '/user/data/trl/external';
1. When data is loaded into an external table, it is not moved into Hive's own data warehouse directory. In other words, the data in an external table is not managed by Hive itself.
A managed (internal) table is different: its data is moved into the warehouse directory.
2. When a managed table is dropped, Hive deletes all metadata and data belonging to the table. When an external table is dropped, Hive deletes only the metadata for the external table, and the data is not deleted.
So how do you choose which kind of table to use? In most cases there is not much difference, so the choice is largely a matter of personal preference.
As a rule of thumb, if all processing is done by Hive, create a managed table; otherwise, use an external table.
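The difference in drop behavior can be checked with a short sketch like the one below. It assumes the exter_trl table from the example above already exists:

```sql
-- Show table details; the "Table Type" field reads
-- MANAGED_TABLE or EXTERNAL_TABLE.
DESCRIBE FORMATTED exter_trl;

-- Dropping an external table removes only its metadata;
-- the files under its LOCATION are left in place.
DROP TABLE IF EXISTS exter_trl;
```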
5. Loading data into a specified partition
LOAD DATA INPATH
'/user/data/clickstat_gp_fatdt0/0' OVERWRITE INTO TABLE c02_clickstat_fatdt1
PARTITION (dt='20140820');
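One way to confirm the load, sketched here on the assumption that c02_clickstat_fatdt1 is partitioned by dt as the statement above implies, is to list the partitions and read a few rows from the new one:

```sql
-- List the table's partitions; dt=20140820 should be among them after the load.
SHOW PARTITIONS c02_clickstat_fatdt1;

-- Read a few rows from that partition only.
SELECT * FROM c02_clickstat_fatdt1 WHERE dt = '20140820' LIMIT 10;
```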
5. Specifying the HDFS path when creating the table
CREATE EXTERNAL TABLE page_view (viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User',
country STRING COMMENT 'country of origination')
COMMENT 'This is the staging page view table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
STORED AS TEXTFILE
LOCATION '
6. Hive views (a view simplifies complex queries)
CREATE VIEW test_trl AS
SELECT * FROM t1 JOIN t2
ON (t1.id = t2.id) WHERE t1.name = 'TRL';
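Once the view exists, it can be queried like a table. The sketch below assumes the test_trl view above was created successfully:

```sql
-- Query the view; the join and the filter come from the view definition.
SELECT * FROM test_trl;

-- Dropping the view removes only its definition, not the data in t1 or t2.
DROP VIEW IF EXISTS test_trl;
```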
Copyright notice: This is an original article by the blog author and may not be reproduced without permission.