1. Description
In real-world scenarios, structured data files often need to be imported into HBase. Phoenix provides two ways to load CSV-formatted files into a Phoenix table: one uses the single-threaded psql tool for small batches of data, and the other uses MapReduce jobs to handle large volumes of data. The first way is relatively simple and is not covered here; see the official documentation:
http://phoenix.apache.org/bulk_dataload.html
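For reference, the psql way looks roughly like the following (a minimal sketch: the table and file match the example below, and the ZooKeeper host is an assumption for illustration):
$ bin/psql.py -t USER 192.168.187.128 data_import.txt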
2. Create a table
Create a user table from the Phoenix CLI (sqlline):
create table user (id varchar primary key, account varchar, passwd varchar);
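The table can be checked from the same sqlline session (!tables is a built-in sqlline command; the prompt shown is illustrative):
0: jdbc:phoenix:localhost> !tables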
3. Add test data
Create data_import.txt in the PHOENIX_HOME directory with the following content:
001,google,am
002,baidu,bj
003,alibaba,hz
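One way to create the file from the shell (a minimal sketch, assuming $PHOENIX_HOME points at the Phoenix install directory; any editor works just as well):
$ cd $PHOENIX_HOME
$ cat > data_import.txt <<'EOF'
001,google,am
002,baidu,bj
003,alibaba,hz
EOF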
4. Run the MapReduce job
Run the MR job from the PHOENIX_HOME directory (the exact command depends on the Phoenix version):
$ HADOOP_CLASSPATH=/usr/local/cdh-5.2.0/hbase-0.98.6/lib/hbase-protocol-0.98.6-cdh5.2.0.jar:/usr/local/cdh-5.2.0/hbase-0.98.6/conf hadoop jar phoenix-4.2.2-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -t user -i file:///usr/local/cdh-5.2.0/phoenix-4.2.2/data_import.txt -z 192.168.187.128,192.168.187.129,192.168.187.130:2181
The parameters mean the following:
-t    the target table name
-i    the input CSV file path
-z    the ZooKeeper quorum (comma-separated hosts and port)
Warning: hbase-protocol-0.98.6-cdh5.2.0.jar is tied to the HBase version; if you run a different version of HBase, substitute the matching jar yourself.
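To locate the matching jar in your own installation, something like this helps (the path follows the example above):
$ ls /usr/local/cdh-5.2.0/hbase-0.98.6/lib/hbase-protocol-*.jar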
The data file path is read from HDFS by default. Adding the file:/// prefix forces it to be treated as a local file path, but the MR job still reports an error saying the file path could not be found:
Error: java.io.FileNotFoundException: File file:/usr/local/cdh-5.2.0/phoenix-4.2.2/data_import.txt does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:524)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:737)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:514)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
Nevertheless, the data was ultimately loaded into Phoenix successfully, presumably because the task that read the file happened to run on the node where the local copy exists; local paths are not visible to tasks scheduled on other nodes.
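A more reliable approach is to upload the file to HDFS first; a minimal sketch using the paths from this example:
$ hdfs dfs -mkdir -p /phoenix/test
$ hdfs dfs -put /usr/local/cdh-5.2.0/phoenix-4.2.2/data_import.txt /phoenix/test/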
Finally, with data_import.txt placed in the /phoenix/test/ directory on HDFS, the following command runs without any error:
$ HADOOP_CLASSPATH=/usr/local/cdh-5.2.0/hbase-0.98.6/lib/hbase-protocol-0.98.6-cdh5.2.0.jar:/usr/local/cdh-5.2.0/hbase-0.98.6/conf hadoop jar phoenix-4.2.2-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -t user -i /phoenix/test/data_import.txt -z 192.168.187.128,192.168.187.129,192.168.187.130:2181
The job is submitted to YARN, where the ResourceManager allocates resources for it.
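Once the job finishes, the loaded rows can be verified back in sqlline (the quorum host is the one from the example above):
$ sqlline.py 192.168.187.128
0: jdbc:phoenix:192.168.187.128> select * from user;
The query should return the three rows from data_import.txt.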