1. Hive Introduction
1.1 Hive plays the role of the data warehouse in the Hadoop ecosystem. It can manage and query data stored in Hadoop.
At its core, Hive is a SQL parsing engine: it converts SQL queries into MapReduce jobs and runs them.
Hive has a set of mapping tools that convert SQL into MapReduce jobs, and that map SQL tables and fields to files (folders) in HDFS and to the columns within those files.
This mapping information is called the metastore and is generally stored in Derby or MySQL.
1.2 The default location of Hive data in HDFS is /user/hive/warehouse, which is determined by the property hive.metastore.warehouse.dir in the configuration file hive-site.xml.
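To make the table-to-folder mapping concrete, here is a minimal sketch of how a managed table's HDFS directory follows from the warehouse directory. The function name table_location and the sample database and table names are made up for illustration; the layout assumed is the usual one, where tables in the default database sit directly under the warehouse directory and tables in any other database sit under a "<db>.db" subdirectory.

```shell
#!/usr/bin/env bash
# Sketch only: derive the HDFS directory of a Hive managed table.
# WAREHOUSE_DIR mirrors the default value of hive.metastore.warehouse.dir.
WAREHOUSE_DIR="/user/hive/warehouse"

# table_location DB TABLE -> prints the directory that holds the table's files.
# (table_location is a hypothetical helper, not part of Hive.)
table_location() {
  local db="$1" table="$2"
  if [ "$db" = "default" ]; then
    # Tables of the default database live directly under the warehouse dir.
    printf '%s/%s\n' "$WAREHOUSE_DIR" "$table"
  else
    # Other databases get their own "<db>.db" folder.
    printf '%s/%s.db/%s\n' "$WAREHOUSE_DIR" "$db" "$table"
  fi
}

table_location default t1      # prints /user/hive/warehouse/t1
table_location sales orders    # prints /user/hive/warehouse/sales.db/orders
```

On a running cluster, hadoop fs -ls /user/hive/warehouse should list these folders once the tables exist.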
2. Install Hive
(1) Decompress the package, rename the directory, and set the environment variables.
(2) In the $HIVE_HOME/conf/ directory, rename the two configuration templates by running:
mv hive-default.xml.template hive-site.xml
mv hive-env.sh.template hive-env.sh
(3) Modify the Hadoop configuration file hadoop-env.sh as follows:
export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin
(4) In the $HIVE_HOME/bin directory, modify the file hive-config.sh and add the following:
export JAVA_HOME=/usr/local/jdk
export HIVE_HOME=/usr/local/hive
export HADOOP_HOME=/usr/local/hadoop
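The template renaming in step (2) can be wrapped in a small helper so that it is safe to run more than once. This is only a sketch: the function name rename_hive_templates is made up here, and the guard that skips a rename when the target already exists is an added precaution, not something the steps above require.

```shell
#!/usr/bin/env bash
# Sketch only: rename the Hive configuration templates from step (2).

# rename_hive_templates CONF_DIR
# Renames each template only if it exists and the target does not, so that
# re-running the function never overwrites an already edited config file.
# (rename_hive_templates is a hypothetical helper, not shipped with Hive.)
rename_hive_templates() {
  local conf_dir="$1"
  local pair src dst
  for pair in "hive-default.xml.template:hive-site.xml" \
              "hive-env.sh.template:hive-env.sh"; do
    src="$conf_dir/${pair%%:*}"   # part before the colon: template name
    dst="$conf_dir/${pair##*:}"   # part after the colon: final file name
    if [ -f "$src" ] && [ ! -e "$dst" ]; then
      mv "$src" "$dst"
    fi
  done
}

# Example call, using the install path from step (4):
# rename_hive_templates /usr/local/hive/conf
```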
3. Install MySQL (server and client)
(1) Remove the MySQL-related libraries already installed on Linux: rpm -e xxxxxxx --nodeps
Run the command rpm -qa | grep mysql to check whether they have been removed.
(2) Run the command rpm -i mysql-server-******** to install the MySQL server.
(3) Start the MySQL server by running the command mysqld_safe &
(4) Run the command rpm -i mysql-client-******** to install the MySQL client.
(5) Run the command mysql_secure_installation to set the root user's password.
(6) By default, MySQL does not allow Hive to connect. To authorize the connection, run the following command in the MySQL client: grant all on hive.* to 'root'@'%' identified by 'your password'; then reload the privileges: flush privileges;
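The authorization in step (6) can also be scripted. The sketch below only prints the two SQL statements so they can be reviewed before being piped into the client; the function name grant_sql is made up for illustration, and the password is assumed not to contain a single quote. Note that MySQL 8.0 no longer accepts IDENTIFIED BY inside GRANT, so on such versions the user would have to be created separately first.

```shell
#!/usr/bin/env bash
# Sketch only: build the authorization statements from step (6).

# grant_sql PASSWORD -> prints the GRANT and FLUSH statements.
# (grant_sql is a hypothetical helper; PASSWORD must not contain a single quote.)
grant_sql() {
  local password="$1"
  # %% yields a literal % (the "any host" wildcard); %s is the password.
  printf "GRANT ALL ON hive.* TO 'root'@'%%' IDENTIFIED BY '%s';\n" "$password"
  printf "FLUSH PRIVILEGES;\n"
}

# Review the output first, then apply it with the MySQL client, for example:
# grant_sql 'your password' | mysql -u root -p
```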
4. Use MySQL as the Hive metastore
(1) Place the MySQL JDBC driver in Hive's lib directory (because Hive needs to connect to MySQL).
(2) Modify the hive-site.xml file as follows:
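<property>
	<name>javax.jdo.option.ConnectionURL</name>
	<!-- "hive" is the name of the corresponding database in MySQL -->
	<value>jdbc:mysql://hadoop0:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
	<name>javax.jdo.option.ConnectionDriverName</name>
	<value>com.mysql.jdbc.Driver</value>
</property>
<property>
	<name>javax.jdo.option.ConnectionUserName</name>
	<value>root</value>
</property>
<property>
	<name>javax.jdo.option.ConnectionPassword</name>
	<value>admin</value>
</property>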
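Each of the four settings goes into hive-site.xml as one property element with a name and a value. As a rough sketch, the fragment can be generated from name/value pairs; the function name hive_property is made up for illustration, and the values are assumed to contain no characters that need XML escaping (such as & or <).

```shell
#!/usr/bin/env bash
# Sketch only: print hive-site.xml <property> elements for the metastore settings.

# hive_property NAME VALUE -> prints one <property> element.
# (hive_property is a hypothetical helper; it does not escape XML characters.)
hive_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

hive_property javax.jdo.option.ConnectionURL \
  'jdbc:mysql://hadoop0:3306/hive?createDatabaseIfNotExist=true'
hive_property javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver
hive_property javax.jdo.option.ConnectionUserName root
hive_property javax.jdo.option.ConnectionPassword admin
```

The printed elements belong inside the configuration element of hive-site.xml, replacing any existing entries with the same names.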