Hive integrates Mysql as metadata

Source: Internet
Author: User
By default, Hive metadata is stored in the embedded Derby database, and only one session connection is allowed, which is only suitable for simple testing. To support multi-user multi-session, an independent metadatabase is required. We use MySQL as the metadatabase. Hive provides excellent support for MySQL internally. to configure an independent metadatabase, we need to add the following

By default, Hive metadata is stored in the embedded Derby database, and only one session connection is allowed, which is only suitable for simple testing. To support multi-user multi-session, an independent metadatabase is required. We use MySQL as the metadatabase. Hive provides excellent support for MySQL internally. to configure an independent metadatabase, we need to add the following

By default, Hive metadata is stored in the embedded Derby database, and only one session connection is allowed, which is only suitable for simple testing. To support multi-user multi-session, an independent metadatabase is required. We use MySQL as the metadatabase. Hive provides good support for MySQL internally, to configure an independent metadatabase, follow these steps:

Step 1: Install the MySQL server and MySQL client and start the MySQL service.
This step is omitted, please refer to http://www.cnblogs.com/linjiqin/archive/2013/03/04/2942497.html

Step 2: Install Hive
This step is omitted, please refer to http://www.cnblogs.com/linjiqin/archive/2013/03/04/2942402.html

Step 3: Create a MySQL account for Hive and grant sufficient permissions. Run the following command:

hadoop@ubuntu:~$ mysql -uroot -pmysqlmysql> CREATE USER 'hive' IDENTIFIED BY 'mysql';mysql> GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%' WITH GRANT OPTION;mysql> flush privileges;

By default, Mysql only allows local login, so you need to modify the configuration file to bind the address to comment out:

root@ubuntu:~# sudo gedit /etc/mysql/my.cnf

Find the following content:

# Instead of skip-networking the default is now to listen only on # localhost which is more compatible and is not less secure. # bind-address = 127.0.0.1 <--- comment out this line and you can log on remotely.

Restart the mysql service:

sudo service mysql restart

Step 4: Create a Hive-specific metabaseRemember to log in with the "hive" account you just created.

mysql> exit;hadoop@ubuntu:~$ mysql -uhive -pmysqlmysql> create database hive;

Step 5: Add the following configuration to the file "hive-site.xml" under the conf directory of Hive:

 
 
     
          
   
    hive.metastore.local
           
   
    true
       
      
          
   
    javax.jdo.option.ConnectionURL
           
   
    jdbc:mysql://192.168.11.157:3306/hive?characterEncoding=UTF-8
       
      
          
   
    javax.jdo.option.ConnectionDriverName
           
   
    com.mysql.jdbc.Driver
       
      
          
   
    javax.jdo.option.ConnectionUserName
           
   
    hive
       
      
          
   
    javax.jdo.option.ConnectionPassword
           
   
    mysql
       
  
 

From the front we know that our "hive-site.xml" is a copy of a "hive-default.xml.template", there are a lot of configuration parameters in it, but not all we need, we know, the Hive system loads two configuration files, one default configuration file "hive-default.xml", and the other is the user-defined file "hive-site.xml ". User-Defined prevails when the value of configuration parameters in the hive-site.xml is inconsistent with that in the hive-default.xml file. Therefore, we will delete all the parameters we don't need, leaving only the content shown above.
Note: in fact, it is very troublesome to modify the configuration file here. We have all set up FTP. Why not use it. the configuration file on Hadoop is downloaded and modified as required. It is very convenient to operate such a file on Windows. After completing the configuration file, upload the file to overwrite the original one.

Step 6: copy the JDBC driver package of MySQL to the lib directory of Hive.
JDBC driver package version: mysql-connector-java-5.1.18-bin.jar

Step 7: START Hive Shell,Run the "show tables;" command. If no error is reported, the Hive based on the independent metabase has been installed successfully.

hadoop@ubuntu:~$ hiveLogging initialized using configuration in file:/home/hadoop/hive-0.9.0/conf/hive-log4j.propertiesHive history file=/tmp/hadoop/hive_job_log_hadoop_201303041631_450140463.txthive> show tables;OKTime taken: 1.988 secondshive> 

Step 8: Verify that the Hive configuration is incorrectGo to the Hive shell to create a table, and you can see the corresponding metadatabase information in the Hive database of MySQL.
1) create a data table on Hive
Hive> create table xp (id INT, name string) row format delimited fields terminated by '\ T ';
2) view metadata information from the MySQL database
SQL statement used:
Use hive; // use the hive Database
Show tables; // display data tables in the hive Database
Select * from Tsung; // view hive metadata
By now, Hive has integrated Mysql as metadata.

Remote Installation
In remote mode, you can configure metastore to a remote machine. The configuration items to be added to the hive-site.xml file on the basis of the standalone mode are as follows:

     
  
   
Hive. metastore. local
      
  
   
Local
  
 
     
  
   
Hive. metastore. uris
      
  
   
Uri1, uri2 ,...
  // You can configure multiple Uris.
  
   
JDBC connect string for a JDBC metastore
  
 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.