Read Catalogue
- Order
- Write the results of the query to the file system
- Cluster Data Migration One
- Cluster Data Migration II
- Series Index
This article is copyright Mephisto and Blog Park is shared, welcome reprint, but must retain this paragraph statement, and give the original link, thank you for your cooperation.
The article is written by elder brother (Mephisto), Sourcelink
Order
In the previous article, we introduced a variety of ways to import hive data, so that our hive has a data source, but sometimes we may need to pure export, or cluster hive data migration (different clusters, different versions), we can use these two chapters of knowledge to achieve.
Here we begin to describe the data export of hive and the migration of the cluster hive data.
Write the results of the query to the file system
One: Description
The results of the query can be written to the file system by simply modifying the import syntax from other tables in the previous article.
Two: Syntax:
Standard Syntax:INSERTOVERWRITE[LOCAL]DIRECTORY Directory1[ROW FORMAT Row_format] [STORED as File_format](Note: onlyAvailable starting withHive0.11.0) SELECT... from ... Hive extension (multiple inserts): fromfrom_statementINSERTOVERWRITE[LOCAL]DIRECTORY directory1 select_statement1[INSERT OVERWRITE [LOCAL]DIRECTORY directory2 select_statement2] ... row_format:delimited[Fields TERMINATED by Char [escaped by Char]][COLLECTION ITEMS TERMINATED by Char] [MAP KEYS TERMINATED by Char] [LINES TERMINATED by Char] [NULL DEFINED as Char](Note: onlyAvailable starting withHive0.13)
Three: Write to local
If local is used, the data is written to the local
Four: Writing to the cluster
If you do not use local, the data is written to the specified HDFs, and if the full path is not written, Hadoop's configuration Entry fs.default.name (the Namenode URI) is used.
Five: Actual combat
Modify the TMP folder permissions (this is only a test, so use the maximum permissions)
chmod 777 TMP
Into Hive
sudo -u hdfs Hive
Export the score table data from the previous article to a local
Insert Overwrite local directory '/data/tmp/score'select* from score;
We can see that there are files in the/data/tmp/score/directory.
Cd/data/tmp/score
ll
This allows us to export the hive data locally.
Below we use the command without the local parameter to lead the Hive table data into HDFs
Insert Overwrite directory '/data/tmp/score'Select * from score;
We use the LS command in HDFs to view
Hadoop FS-LS /data/tmp/score
There is only one file here, and the above is not the same, but the total content is the same, the same data above the export, and sometimes only one file. It's not a fancy place to be.
Cluster Data Migration One
One: Introduction
In the official website, we can see the export and import, the function from the Hive0.8 began to join.
Two: Export/import
The
Export command exports tables or partitions based on metadata, and the output location can be another Hadoop cluster or hive instance. Tables with partitions are supported. export meta data storage in target directory file stored in subdirectory < Span id= "nohighlight_0.14125266089104116".
The metadata store DBMS that imports the exported source and destination can be a different relational database.
Three: Export syntax
TABLE [PARTITION (part_column= "value" [, ...] ])] to'export_target_path'
IV: Import Syntax
[[EXTERNAL]TABLE[PARTITION (part_column= "value" [, ... ] ]]] [from'source_path'[ Location ' Import_target_path ']
V: official Example
Simple Import Export
Table to ' hdfs_exports_location/department ' from'hdfs_exports_location/department';
Rename Import Export
Table to ' hdfs_exports_location/department ' tablefrom'hdfs_exports_location/department';
Partition Export
table Employee partition (emp_country="in", emp_state=to' Hdfs_exports_location/employee ' from'hdfs_exports_location/employee';
Partition Import
Table to ' Hdfs_exports_location/employee ' Table Employee partition (emp_country="Us", emp_state=from' hdfs_exports_location/employee';
Specify the import location
Table to ' hdfs_exports_location/department ' tablefrom'hdfs_exports_location/department' 'import_target_location/department';
Import as external table
Table to ' hdfs_exports_location/department ' tablefrom'hdfs_exports_location/department';
Cluster Data Migration II
One: Introduction
Although the official Export/import command is very powerful, but in actual use, it may be the version of the different, there will be unable to import the situation, oneself in this piece also pondering, summed up their own set of partitioned Hive table Data Migration scheme, The scheme was successfully migrated in Cloudera and Hontorworks clusters, and the hive version was inconsistent.
Two: Export data
This is used as a data source because the hive version of the Cloudera release version CDH-5.3.3 is less than 0.8.
Create a partitioned table score
Create Table score ( ID int, studentid int, score Double by (Openingtime string);
Import 7, August data based on the way the data was imported in the previous article
Load ' /data/tmp/score_7.txt ' into Table score PARTITION (openingtime=201507);
Refer to our above export to local or put under/data/tmp/score
Insert Overwrite local directory '/data/tmp/score'Select * from score;
Three: Migrating Data
Create a new/data/tmp directory in another cluster
mkdir -p/data/tmp/score
Copy data
SCP /data/tmp/score/* [email protected]:/data/tmp/score/
View
cd/data/tmp/Scorell
Four: Create a partitioned table and a temporary table without partitions
The imported cluster is the HDP-2.7.1 release version of Hortonworks.
The partition table is our final target table, which is overused when there are no partitioned temporary tables.
Into Hive
sudo -u hdfs Hive
Create a table with partitions
CREATE TABLE score ( ID int, studentid int, score Double string);
Create a temporary table with no partitions
CREATE TABLE Score1 ( IDint, int, Double, string
);
V: Import data into a staging table
Load ' /data/tmp/score ' into table Score1;
We'll check the incoming data.
Select * from Score1;
VI: Import from temp table to partition table
set hive. exec . Dynamic.partitiontrue; set hive. exec . Dynamic.partition.mode= nonstrict; set hive. exec . max . Dynamic.partitions.pernode= 10000 insert overwrite Span style= "color: #0000ff;" >table score partition (openingtime ) * from score1;
Inquire
Select * from score;
We'll look at the files in the hive in HDFs.
Hadoop FS-LS -r/apps/hive/warehouse/score
Can be clearly seen according to the Openingtime partition.
Seven: Delete temporary tables
Drop Table Score1
Eight: Delete temporary data
RM -rf/data/tmp/score
This way our hive cluster data migration is over.
--------------------------------------------------------------------
Here, the content of this chapter is complete.
Series Index
"Source" Self-Learning Hadoop series index from zero
This article is copyright Mephisto and Blog Park is shared, welcome reprint, but must retain this paragraph statement, and give the original link, thank you for your cooperation.
The article is written by elder brother (Mephisto), Sourcelink
"Source" self-learning Hadoop from zero: Hive data import and export, cluster data migration