Configure MySQL as Metastore in Hive Learning

Source: Internet
Author: User
Hive uses the Derby database as the metastore in Built-in mode by default. The biggest drawback of this mode is that multiple clients cannot be connected to the metastore at the same time. Therefore, it is only suitable for the purpose of learning and testing, to use Hive in actual production, you need to configure metastore to local mode or remote mode. Now we will introduce how to configure metas in local mode.

Hive uses the Derby database as the metastore in Built-in mode by default. The biggest drawback of this mode is that multiple clients cannot be connected to the metastore at the same time. Therefore, it is only suitable for the purpose of learning and testing, to use Hive in actual production, you need to configure metastore to local mode or remote mode. Now we will introduce how to configure metas in local mode.

Hive uses the Derby database as the metastore in Built-in mode by default. The biggest drawback of this mode is that multiple clients cannot be connected to the metastore at the same time. Therefore, it is only suitable for the purpose of learning and testing, to use Hive in actual production, you need to configure metastore to local mode or remote mode. Now we will introduce how to configure metastore in local mode and use MySQL5.6.19 as the database. Install the MySQL database before configuration. In general, MySQL has been installed in Linux (at least in my system), but the database version is usually relatively low, to use the latest MySQL version, you must first uninstall the previous version. You can use the following statement to query installed MySQL:

rpm –qa | grep –i mysql

Then, use the following statement to delete MySQL from the query result:

Rpm-e xxxxx or rpm-e -- nodeps xxxxx

After the installation is complete, use the following statement to install the MySQL server:

[Root @ hadoophadoop] # rpm-ivh MySQL-server-5.6.19-1.el6.x86_64.rpmPreparing... ################################## [100%] 1: mySQL-server ################################### [100%] a randompassword has been set for the MySQL root USER! You will find that password in '/root /. mysql_secret '. also, the account for the anonymous user has beenremoved. # The MySQL-5.6.19 version will generate a random password for the root user and save it in/root /. in mysql_secret, this password is used when the root user logs on for the first time. In addition, you can run:/usr/bin/mysql_secure_installation, which will also give you the option of removing the test database. this is stronugly recommended for production servers. # It is recommended to run the/usr/bin/mysql_secure_installation script in production. This script can remove the test database. This method is recommended in the production environment. New default config file was created as/usr/my. cnf and will be used by default by the server when you start it. you may edit this file to change server settings # Save the new configuration file in/usr/my. cnf

The statements and results for installing the MySQL client are as follows:

[root@hadoophadoop]# rpm -ivh MySQL-client-5.6.19-1.el6.x86_64.rpmPreparing...           ###############################[100%]1:MySQL-client       ################################ [100%]

After installing MySQL, run the following statement to log on to the database. The password of the root user is saved in/root/. mysql_secret:

mysql –u root –p

Enter the password in the above file to log on to the database and execute the following statement. After modifying the password of the root user, you can use the new password in future logon.

mysql> use mysqlERROR 1820(HY000): You must SET PASSWORD before executing this statementmysql> set PASSWORD=PASSWORD('123456');Query OK, 0 rowsaffected (0.12 sec)

After installing the MySQL database, you need to create a database that stores metastore or create a database during connection. Create a database hive_db first:

create database hive_db;

Then create the user of the metastore database and assign the maximum permissions to the user:

grant all privileges on hive_db.* to Hadoop identified by ‘hadoop’

After configuring the database, you need to modify the Hive configuration file. Copy the hive-default.xml.template under the $ HIVE_HOME/conf directory to a hive-site.xml:

cp hive-default.xml.template hive-site.xml

Modify the following configuration information and specify MySQL as the database for storing metastore:

   
  
   javax.jdo.option.ConnectionURL
    
  
   jdbc:mysql://Hadoop:3306/hive_db?createDatabaseIfNotExist=true
    
  
   JDBC connect string for aJDBC metastore
  
 
  
  
   javax.jdo.option.ConnectionDriverName
    
  
   com.mysql.jdbc.Driver
    
  
   Driver class name for aJDBC metastore
  
 
  
  
   javax.jdo.option.ConnectionUserName
    
  
   hadoop
    
  
   username to use againstmetastore database
  
 
  
  
   javax.jdo.option.ConnectionPassword
    
  
   hadoop
    
  
   password to use againstmetastore database
  
 

After modifying the configuration file, place the MySQL driver package to CLASSPATH. You can place it under $ {HIVE_HOME}/lib, and then start Hive.

Log on to MySQL as a hadoop user, go to the hive_db database, and run the show tables statement. You can find many tables, such as dbs and Tsung, as shown in:


You can query which databases exist in Hive as follows:


So far, the configuration of MySQL to store metastore in local mode has been completed, as shown in the figure above. Of course, some problems also occur during the actual execution. For example, if the MySQL driver is not added to the CLASSPATH, the installation of MySQL is incorrect. The biggest problem is not from Hive, install MySQL, which means that the database configured as metastore is still very simple under normal conditions of MySQL.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.