First, an introduction to the installation modes:
The official Hive documentation describes three installation modes, each suited to a different scenario.
1. Embedded mode (metadata is stored in the embedded Derby database; only one session may connect, and attempting multiple sessions raises an error)
2. Local mode (a locally installed MySQL replaces Derby for storing metadata)
3. Remote mode (a remotely installed MySQL replaces Derby for storing metadata)
Second, the installation environment and prerequisites:
Hive depends on Hadoop, so make sure the Hadoop cluster is set up before running Hive.
This article uses Hadoop 2.5.1 and Hive 1.2.1.
OS: Linux CentOS 6.5, 64-bit
JDK: java version "1.7.0_79"
The rest of this article assumes the Hive package has been downloaded and installed to /home/install/hive-1.2.1.
Set the HIVE_HOME environment variable in ~/.bash_profile:
export HIVE_HOME=/home/install/hive-1.2.1
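The later steps run hive directly from the command line, which also requires the launcher directory to be on the PATH. A minimal sketch of the ~/.bash_profile lines, assuming the install path given above and that the launcher scripts live under $HIVE_HOME/bin (adjust both if your layout differs):

```shell
# Assumed install location, taken from the text above.
export HIVE_HOME=/home/install/hive-1.2.1
# Put the Hive launcher scripts on PATH so "hive" can be run from any directory.
export PATH="$HIVE_HOME/bin:$PATH"

# Sanity check: report whether the launcher exists where we expect it.
if [ -x "$HIVE_HOME/bin/hive" ]; then
  echo "hive launcher found at $HIVE_HOME/bin/hive"
else
  echo "hive launcher not found under $HIVE_HOME/bin; check the install path" >&2
fi
```

After editing ~/.bash_profile, run source ~/.bash_profile (or log in again) so the current shell picks up the change.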
Third, embedded mode installation:
In this mode the metadata is kept in the embedded Derby database, which allows only one session to connect at a time; the table data itself is stored on HDFS.
1. Switch to the $HIVE_HOME/conf directory and run the following commands:
cp hive-env.sh.template hive-env.sh
vim hive-env.sh
In hive-env.sh, add the following:
HADOOP_HOME=/home/install/hadoop-2.5.1
2. Start Hive. With $HIVE_HOME/bin on the PATH, simply type hive at the command line:
After startup, the corresponding directories have been created on Hadoop's HDFS.
Note: these two steps are all that is needed to install and start embedded mode. Do not add superfluous steps like the ones below.
================================ The section below is obsolete; skip it ================================
(void) 2. Provide a base configuration file for Hive. The following command turns the template in the conf directory into a configuration file:
cp hive-default.xml.template hive-site.xml
(void) 3. Start Hive. With $HIVE_HOME/bin on the PATH, simply type hive at the command line:
(void) This results in an error. The error log mentions system:java.io.tmpdir, a configuration item that appears in hive-site.xml.
(void) Create a temp directory /opt/tem/hive-1.2.1/iotemp and change the value of system:java.io.tmpdir in hive-site.xml:
mkdir -p /opt/tem/hive-1.2.1/iotemp
vim hive-site.xml
(void) Enter the following command in vim to perform the replacement:
:%s@\${system:java.io.tmpdir}@/opt/tem/hive-1.2.1/iotemp@g
(void) 4. Restart Hive:
(void) This error is reported: java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected.
(obsolete) A search showed that the Hadoop directory contains an old version of jline, which has to be replaced with the one shipped with Hive. After copying, be careful to delete the original version of the jar.
cp /home/install/hive-1.2.1/lib/jline-2.12.jar /home/install/hadoop-2.5.1/share/hadoop/yarn/lib/
rm -rf /home/install/hadoop-2.5.1/share/hadoop/yarn/lib/jline-0.9.94.jar
(void) Restart again; it works.
Fourth, local mode installation:
The difference from embedded mode is that the embedded Derby database is no longer used to store metadata; another database such as MySQL is used instead.
This is a multi-user mode that allows multiple client users to connect to one database. It is typically how Hive is used internally within a company.
The prerequisite is that every user has access to MySQL; that is, every client user needs to know the MySQL username and password before using Hive.
The setup steps follow. They require that Hadoop is running normally and that MySQL is correctly installed.
1. Log in to MySQL and create a database, here named hive (the name can be chosen freely).
Create a hive user and grant it all privileges:
CREATE USER 'hive'@'localhost' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON *.* TO hive IDENTIFIED BY '123456' WITH GRANT OPTION;
2. Copy the MySQL JDBC driver jar into Hive's lib directory. You need to find and download the driver yourself.
cp mysql-connector-java-5.1.32-bin.jar /home/install/hive-1.2.1/lib/
3. Copy hive-default.xml.template under $HIVE_HOME/conf:
cp hive-default.xml.template hive-site.xml
4. Modify the hive-site.xml file:
This configuration file has more than 3,300 lines; only a few options need to be changed.
A. Modify the javax.jdo.option.ConnectionURL property.
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
B. Modify the javax.jdo.option.ConnectionDriverName property.
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
C. Modify the javax.jdo.option.ConnectionUserName property, that is, the database user name.
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>Username to use against metastore database</description>
</property>
D. Modify the javax.jdo.option.ConnectionPassword property, that is, the database password.
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>Password to use against metastore database</description>
</property>
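E. Add the following hive.metastore.local property. (Since Hive 0.10 this property no longer has any effect and Hive logs a deprecation warning when it is set, so on Hive 1.2.1 it can be left out.)
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
  <description>Controls whether to connect to a remote metastore server or open a new metastore server in the Hive client JVM</description>
</property>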
F. Modify the hive.server2.logging.operation.log.location property, because the default configuration does not specify a concrete path.
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>/tmp/hive/operation_logs</value>
  <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
G. Modify the hive.exec.local.scratchdir property.
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive</value>
  <description>Local scratch space for Hive jobs</description>
</property>
H. Modify the hive.downloaded.resources.dir property.
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/hive/resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
I. Modify the hive.querylog.location property.
<property>
  <name>hive.querylog.location</name>
  <value>/tmp/hive/querylog</value>
  <description>Location of Hive run time structured log file</description>
</property>
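As an alternative to editing the 3,300-line copy, the MySQL connection settings can be placed in a short hive-site.xml, since Hive falls back to its built-in defaults for any property the file omits. The sketch below writes such a file into a scratch directory for review. The scratch directory and the OUT_DIR variable are illustrative assumptions, and the property values are the ones used in this article; the remaining path properties from step 4 can be added in the same way if needed.

```shell
# Scratch directory for the generated file (hypothetical; not part of Hive).
OUT_DIR=$(mktemp -d)

# Quoted heredoc delimiter: the shell copies the XML through verbatim,
# without expanding anything inside it.
cat > "$OUT_DIR/hive-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
</configuration>
EOF

echo "Wrote $OUT_DIR/hive-site.xml; review it, then copy it into \$HIVE_HOME/conf"
```

Keeping only the overridden properties makes it easier to see at a glance what differs from a stock installation.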
5. Set up the log4j configuration file for Hive:
cp hive-log4j.properties.template hive-log4j.properties
6. Replace the old jline jar under Hadoop with jline-2.12.jar from Hive, or Hive will report an error on startup.
cp /home/install/hive-1.2.1/lib/jline-2.12.jar /home/install/hadoop-2.5.1/share/hadoop/yarn/lib/
rm -rf /home/install/hadoop-2.5.1/share/hadoop/yarn/lib/jline-0.9.94.jar
7. Start Hive; the Hive command-line interface appears.
Fifth, remote mode installation (server mode):
This mode is used together with Beeline and HiveServer2, both of which are provided in the Hive installation directory.
The principle is to run metadata access as a separate service. Clients connect via Beeline and do not need to know the database password beforehand.
1. First run the hiveserver2 command:
./hiveserver2 start
After startup the command stays in the foreground listening and does not exit; it listens on port 10000.
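To confirm from a script that the server is accepting connections before starting Beeline, one option is to poll the port. This is a sketch only: the wait_for_port function name is made up for illustration, and it relies on bash's /dev/tcp redirection, which is not available in plain sh.

```shell
# wait_for_port HOST PORT TRIES
# Returns 0 as soon as a TCP connection to HOST:PORT succeeds,
# or 1 after TRIES failed attempts spaced one second apart.
wait_for_port() {
  host=$1
  port=$2
  tries=$3
  i=0
  while [ "$i" -lt "$tries" ]; do
    # The subshell opens a TCP connection via bash's /dev/tcp
    # and closes it again when the subshell exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_port node5 10000 30 would give HiveServer2 on host node5 roughly 30 seconds to come up before the script gives up.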
2. Open a new terminal window and run the beeline command:
# beeline
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://node5:10000
Connecting to jdbc:hive2://node5:10000
Enter username for jdbc:hive2://node5:10000: hive
Enter password for jdbc:hive2://node5:10000: ******
The error log is as follows:
Error: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=EXECUTE, inode="/tmp":root:supergroup:drwx------
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:208)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:171)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5904)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3691)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:803)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:779)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) (state=,code=0)
0: jdbc:hive2://node5:10000 (closed)>
That is, the hive user does not have sufficient permissions on /tmp in HDFS.
Here we simply open the permissions on that HDFS directory all the way up:
hadoop fs -chmod 777 /tmp
Reconnect: this time the connection succeeds.
Three ways to install Hive (embedded mode, local mode, remote mode)