1. Official Document: https://docs.mongodb.com/ecosystem/tools/hadoop/
2. Hive Introduction:
Hive Features:
1. Hive is a data warehouse. Unlike databases such as Oracle or MySQL, it stores its data on HDFS.
2. Hive is a SQL parsing engine: it translates SQL statements into MapReduce jobs and runs them on Hadoop.
3. A Hive table is actually a directory on HDFS, and the table's data lives in the files inside that directory.
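To make the "a table is a directory" point concrete, here is a small sketch of how a table name maps to an HDFS path. It assumes Hive's default warehouse root (/user/hive/warehouse, the default value of hive.metastore.warehouse.dir) and the default layout for managed tables; the function name and the sample table names are made up for illustration.

```python
from posixpath import join

# Assumption: the default value of hive.metastore.warehouse.dir.
DEFAULT_WAREHOUSE = "/user/hive/warehouse"


def table_dir(table, database="default", warehouse=DEFAULT_WAREHOUSE, partition=None):
    """Return the HDFS directory where a managed table's files live by default."""
    # Tables in the "default" database sit directly under the warehouse root;
    # any other database gets its own "<name>.db" subdirectory.
    base = warehouse if database == "default" else join(warehouse, database + ".db")
    path = join(base, table)
    # Each partition column adds one "column=value" directory level.
    for column, value in (partition or {}).items():
        path = join(path, f"{column}={value}")
    return path


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    print(table_dir("users"))
    print(table_dir("events", database="logs", partition={"dt": "2018-06-01"}))
```

External tables can point anywhere, so this only describes where Hive puts data when you do not specify a location yourself.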
Hive Metadata:
Hive stores its metadata in a database (the metastore). MySQL, Derby (the default), and Oracle are supported.
Metadata includes table names, columns, partitions and their attributes, table properties (such as whether the table is external), and each table's storage directory. It does not include any table data.
3. Hive Installation:
Hive run modes:
1. Embedded mode: Hive's metadata is stored in its built-in Derby database, which allows only one connection at a time. Mostly used for demos.
2. Local mode: the metastore database is in MySQL, and MySQL runs on the same physical machine as Hive.
3. Remote mode: same as 2, except that MySQL runs on another machine.
Local mode installation: storing metadata with MySQL
1. Install MySQL (see the official website)
2. Put the MySQL Java driver (the JDBC connector jar) into $HIVE_HOME/lib
3. Create hive-site.xml in $HIVE_HOME/conf with the contents below. Note that the file must have exactly this name.
Also note that & must be escaped in XML, that is, written as &amp;
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;connectTimeout=10000</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hivemeta</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivemeta</value>
  </property>
</configuration>
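The escaping rule is easy to get wrong when editing the file by hand. As a minimal sketch (not part of the original setup), the same properties could be written out with Python's standard xml.etree.ElementTree, which escapes & automatically when serializing. The function name is made up, and the credentials are placeholders that mirror the values above.

```python
import xml.etree.ElementTree as ET

# Property values mirror the hive-site.xml above; the credentials are placeholders.
SETTINGS = {
    "javax.jdo.option.ConnectionURL":
        "jdbc:mysql://localhost:3306/hive"
        "?createDatabaseIfNotExist=true&connectTimeout=10000",
    "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    "javax.jdo.option.ConnectionUserName": "hivemeta",
    "javax.jdo.option.ConnectionPassword": "hivemeta",
}


def build_hive_site(properties):
    """Serialize a dict of property name -> value as a <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        # The raw '&' in the JDBC URL is turned into '&amp;' on serialization.
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_hive_site(SETTINGS))
```

Parsing the generated text back with ET.fromstring returns the original URL with a plain &, which is a quick way to confirm that the file is well-formed.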
Note that the user configured in the XML must exist in MySQL and be allowed to log in remotely.
Create a MySQL user and grant it all privileges: log in with the root account, then run
grant all privileges on *.* to 'testhive'@'%' identified by 'testhive'; -- *.* means all tables in all databases, '%' means any IP other than this machine, testhive is both the user name and the password
grant all privileges on *.* to 'testhive'@'localhost' identified by 'testhive';
flush privileges;
4. Start MySQL.
5. Run bin/hive and enter show tables; if the output is OK, the installation succeeded.
If you see FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient, or every command entered in hive gets no response, the metastore failed to start, which means the connection to the metadata database is failing. In that case, run
hive --service metastore --hiveconf hive.root.logger=DEBUG,console
to start the metastore with debug-level logging on the console, then troubleshoot from the error messages.
You can also run netstat -an | grep 3306 to check the connections on the MySQL port. If Hive connected to the metastore database successfully, the connection state should be ESTABLISHED.
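When netstat shows no connection at all, it helps to rule out basic reachability first. The sketch below is a different, simpler check than netstat: it only tests whether a TCP connection to the MySQL port can be opened from the Hive machine. The function name is made up; localhost and 3306 match the JDBC URL used in this setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("MySQL port reachable:", can_connect("localhost", 3306))
```

A True result only means the port is open. Wrong credentials or missing privileges still show up only in the metastore debug log.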
4. Start importing Data
1. Make sure hive is working properly.
2. Reference: https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage
3. If your Hive version is 2.3.x like mine, then congratulations, because you will run into a lot of ClassNotFoundException errors
4. Specific steps:
1. Download three jars: mongo-hadoop-core, mongo-hadoop-hive, and the MongoDB Java driver, and put them into Hive's lib directory
2. Write the SQL according to your own business rules.
3. Run hive --hiveconf hive.root.logger=DEBUG,console -f xxx.sql
4. If it returns OK, congratulations, the whole process succeeded. If it fails, don't worry, you are just missing some jars.
5. In my Hive 2.3.3 + MongoDB 4.x environment, I ran into quite a few problems:
1. SQL parsing failed: every statement in the SQL file needs ';' as its terminator
2. A serde2 class was not found. I looked at the MongoStorageHandler code and found that the class only exists in Hive 1.x, so I had to download that version and take the jar from it.
3. In short, supply the jar for whatever class is reported missing: use jar -tvf <jar file> | grep <class name> to check whether a jar contains the class, then make sure that jar is placed in lib.
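Running jar -tvf against every jar by hand gets tedious when lib holds many files. Since a jar is a zip archive, the same lookup can be sketched with Python's standard zipfile module. The function name and the class name in the example are hypothetical; substitute the class named in your own ClassNotFoundException.

```python
import glob
import os
import zipfile


def jars_containing(class_name, lib_dir):
    """Return the jars in lib_dir that contain the given fully qualified class."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar_path in sorted(glob.glob(os.path.join(lib_dir, "*.jar"))):
        try:
            with zipfile.ZipFile(jar_path) as jar:
                if entry in jar.namelist():
                    matches.append(jar_path)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives (e.g. a broken download).
            continue
    return matches


if __name__ == "__main__":
    lib = os.path.join(os.environ.get("HIVE_HOME", "."), "lib")
    # Hypothetical class name, for illustration only.
    print(jars_containing("com.example.MissingClass", lib))
```

An empty result means no jar in lib provides the class, so the jar still has to be found elsewhere and copied in.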
5. Complete the simple Hello World process
Importing MongoDB tables into Hive
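As a rough sketch of what the SQL file for this step might contain, the helper below builds a CREATE EXTERNAL TABLE statement in the form described on the mongo-hadoop Hive-Usage wiki page (MongoStorageHandler, mongo.columns.mapping, mongo.uri). The helper itself, the table name, the columns, and the MongoDB URI are all made up for illustration; adapt them to your own collection.

```python
import json


def mongo_table_ddl(table, columns, mongo_uri, mapping=None):
    """Build the DDL for a Hive table backed by a MongoDB collection.

    columns: list of (hive_column, hive_type) pairs
    mapping: optional dict of hive_column -> MongoDB field name
    """
    cols = ",\n".join(f"  {name} {hive_type}" for name, hive_type in columns)
    lines = [
        f"CREATE EXTERNAL TABLE {table} (",
        cols,
        ")",
        "STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'",
    ]
    if mapping:
        # The column mapping is passed to the SerDe as a JSON string.
        lines.append(
            "WITH SERDEPROPERTIES('mongo.columns.mapping'='"
            + json.dumps(mapping) + "')"
        )
    # Statements in a file run with "hive -f" must end with ';'.
    lines.append(f"TBLPROPERTIES('mongo.uri'='{mongo_uri}');")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical table, columns, and URI.
    print(mongo_table_ddl(
        "individuals",
        [("id", "STRING"), ("name", "STRING"), ("age", "INT")],
        "mongodb://localhost:27017/test.persons",
        mapping={"id": "_id"},
    ))
```

A table defined this way reads from MongoDB at query time. To copy the data into a native Hive table, follow it with an ordinary INSERT ... SELECT into a regular table.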