Hive is a Hadoop-based data warehouse tool that maps structured data files to database tables and provides simple SQL query functions, converting SQL statements into MapReduce tasks.
Introduction: Hive is a powerful data warehouse tool with a SQL-like query language. This article describes how to build a Hive development and testing environment.
1. What is Hive?
Hive is a Hadoop-based data warehouse tool. It maps structured data files to database tables, provides simple SQL query functions, and converts SQL statements into MapReduce tasks for execution. Its main advantage is a low learning cost: you can use SQL-like statements to implement simple MapReduce statistics quickly, without developing dedicated MapReduce applications. This makes it well suited to statistical analysis in a data warehouse.
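As a rough illustration of the kind of SQL-like statement that Hive turns into MapReduce tasks, the sketch below builds a simple aggregation query. The table name page_views and the column name category are hypothetical and do not come from this article.

```shell
#!/bin/sh
# Build a HiveQL aggregation query for a given table and column.
# The table and column names used below are illustrative assumptions.
count_by_query() {
    table="$1"
    column="$2"
    printf 'SELECT %s, COUNT(*) AS cnt FROM %s GROUP BY %s;\n' "$column" "$table" "$column"
}

# Print the query. On a machine with Hive installed it could be run with:
#   hive -e "$(count_by_query page_views category)"
count_by_query page_views category
```

A GROUP BY with COUNT(*) like this one is the sort of statistic that would otherwise need a hand-written MapReduce job.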
2. Prerequisites
2.1 A Hadoop cluster environment is already installed.
2.2 This article uses Ubuntu 14.04 as the development environment.
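The prerequisites can be checked quickly before installing anything. The following is a minimal sketch, assuming the hadoop command is expected on the PATH and that the Ubuntu release is recorded in /etc/lsb-release; the helper name have_cmd is made up for this example.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Hive runs on top of Hadoop, so the hadoop command should be available.
have_cmd hadoop || echo "Install and configure Hadoop before continuing."

# Show the Ubuntu release if the file is present (this article uses 14.04).
[ -r /etc/lsb-release ] && grep DISTRIB_RELEASE /etc/lsb-release
```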
3. Installation Steps
3.1 Download the Hive package: apache-hive-0.13.1-bin.tar.gz
3.2 Decompress it to the /opt directory
tar xzvf apache-hive-0.13.1-bin.tar.gz
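The tar command above unpacks into the current directory. A small sketch of extracting straight into a target directory with tar's -C option follows; the function name extract_hive is an assumption made for this example.

```shell
#!/bin/sh
# Extract a Hive tarball into a target directory (for example /opt).
# Writing to /opt normally requires root, so run the script with
# sufficient privileges.
extract_hive() {
    archive="$1"
    target="$2"
    mkdir -p "$target" && tar xzf "$archive" -C "$target"
}
```

It would be called as extract_hive apache-hive-0.13.1-bin.tar.gz /opt. The archive name suggests that it unpacks into a directory called apache-hive-0.13.1-bin, while the later steps refer to /opt/apache-hive-0.13; if that is the case on your machine, rename the directory or adjust the paths so that they match.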
3.3 Set environment variables
export HIVE_HOME=/opt/apache-hive-0.13
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:$HIVE_HOME/bin
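Variables exported in a terminal are lost when the session ends. One way to keep them is to append the lines to a shell profile, taking care not to add them twice. This is a sketch only: the choice of profile file (for example ~/.bashrc) and the helper names are assumptions.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Persist the Hive variables from this step in the given profile file.
persist_hive_env() {
    profile="$1"
    append_once 'export HIVE_HOME=/opt/apache-hive-0.13' "$profile"
    append_once 'export PATH=$PATH:$HIVE_HOME/bin' "$profile"
    append_once 'export CLASSPATH=$CLASSPATH:$HIVE_HOME/bin' "$profile"
}
```

After calling persist_hive_env "$HOME/.bashrc", reload the profile with . ~/.bashrc so that the current shell picks up the variables.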
3.4 Copy hive-env.sh.template to hive-env.sh, then modify hive-env.sh:
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/hadoop-1.2.1
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/apache-hive-0.13/conf
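The copy-and-edit step can also be scripted. Hive ships the environment file as conf/hive-env.sh.template. The sketch below copies it to hive-env.sh and appends the two settings; the function name setup_hive_env is hypothetical, and appending to the file (rather than editing the commented template lines) is a simplification.

```shell
#!/bin/sh
# Create hive-env.sh from its template and append the two settings.
# $1 is the Hive conf directory, $2 is the Hadoop install directory.
setup_hive_env() {
    conf_dir="$1"
    hadoop_home="$2"
    cp "$conf_dir/hive-env.sh.template" "$conf_dir/hive-env.sh" || return 1
    {
        echo "HADOOP_HOME=$hadoop_home"
        echo "export HIVE_CONF_DIR=$conf_dir"
    } >> "$conf_dir/hive-env.sh"
}
```

With the paths used in this article, the call would be setup_hive_env /opt/apache-hive-0.13/conf /opt/hadoop-1.2.1.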
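3.5 Modify hive-site.xml. The main change is the metastore database connection information:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://127.0.0.1:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://BladeStone-Laptop:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>Username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>Password to use against metastore database</description>
</property>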
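Because hive.metastore.uris points at a remote metastore, a metastore service has to be listening on that address before Hive clients can connect. The sketch below splits the URI into host and port so that it can be checked; parse_thrift_uri is a made-up helper, and the commented commands assume that Hive and netcat are installed.

```shell
#!/bin/sh
# Split a thrift://host:port URI into "host port".
parse_thrift_uri() {
    rest="${1#thrift://}"
    echo "${rest%%:*} ${rest##*:}"
}

uri=thrift://127.0.0.1:9083
echo "Metastore endpoint: $(parse_thrift_uri "$uri")"

# With Hive installed, the metastore service can be started in the background:
#   hive --service metastore &
# and the port can then be probed (assuming netcat is available):
#   nc -z 127.0.0.1 9083 && echo "metastore is listening"
```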
3.6 Install the MySQL database
sudo apt-get install mysql-server
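The connection settings in hive-site.xml refer to a MySQL account named hive with the password 123456 and a database named hive. As a hedged sketch, the statements below would create such an account; the host pattern '%' and the function name metastore_user_sql are assumptions, not something this article specifies.

```shell
#!/bin/sh
# Print MySQL statements that create an account and grant it access to one database.
# The host pattern '%' (any host) is an assumption; tighten it for real deployments.
metastore_user_sql() {
    user="$1"
    password="$2"
    database="$3"
    cat <<EOF
CREATE USER '$user'@'%' IDENTIFIED BY '$password';
GRANT ALL PRIVILEGES ON $database.* TO '$user'@'%';
FLUSH PRIVILEGES;
EOF
}

# Values taken from the hive-site.xml settings in step 3.5.
metastore_user_sql hive 123456 hive
```

The printed statements can be reviewed and then piped to the MySQL client as an administrator, for example with mysql -u root -p.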
3.7 create
3.8 In