Environment:
- Hadoop 2.2.0
- Hive 0.13.1
- Ubuntu 14.04 LTS
- Java version "1.7.0_60"
- Oracle 10g
*** Reprints welcome; please credit the source ***
http://blog.csdn.net/u010967382/article/details/38709751
Download the installation package from:
http://mirrors.cnnic.cn/apache/hive/stable/apache-hive-0.13.1-bin.tar.gz
Extract it on the server to /home/fulong/hive/apache-hive-0.13.1-bin.
Edit the environment variables, adding the following:
export HIVE_HOME=/home/fulong/hive/apache-hive-0.13.1-bin
export PATH=$HIVE_HOME/bin:$PATH
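To make the two variables survive new logins, they can be appended to the shell profile. A minimal sketch (appending to ~/.bashrc is an assumption; the two export lines come from the step above):

```shell
# Make HIVE_HOME and PATH available in the current shell; append the same two
# lines to ~/.bashrc so they persist across logins.
export HIVE_HOME=/home/fulong/hive/apache-hive-0.13.1-bin
export PATH="$HIVE_HOME/bin:$PATH"
# sanity check: PATH should now start with Hive's bin directory
echo "$PATH" | cut -d: -f1
# -> /home/fulong/hive/apache-hive-0.13.1-bin/bin
```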
Go into the conf folder and copy the template configuration files under their working names:
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ ls
hive-default.xml.template  hive-exec-log4j.properties.template
hive-env.sh.template       hive-log4j.properties.template
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ cp hive-env.sh.template hive-env.sh
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ cp hive-default.xml.template hive-site.xml
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ ls
hive-default.xml.template  hive-env.sh.template                 hive-log4j.properties.template
hive-env.sh                hive-exec-log4j.properties.template  hive-site.xml
Edit hive-env.sh, setting the Hadoop root directory and Hive's conf and lib directories:

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/fulong/hadoop/hadoop-2.2.0

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/fulong/hive/apache-hive-0.13.1-bin/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/fulong/hive/apache-hive-0.13.1-bin/lib
In hive-site.xml, change the following connection parameters to the Oracle-related values:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:oracle:thin:@192.168.0.138:1521:orcl</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>oracle.jdbc.driver.OracleDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivefbi</value>
  <description>password to use against metastore database</description>
</property>
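A mis-nested <property> block in hive-site.xml fails silently until Hive starts, so it is worth pulling the effective values back out after editing. A sketch using grep on a scratch copy (the /tmp file is an assumption; point the same command at conf/hive-site.xml in practice):

```shell
# Write a two-property fragment like the one above, then extract the JDBC URL
# to confirm the value Hive will actually see.
cat > /tmp/hive-site-demo.xml <<'EOF'
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:oracle:thin:@192.168.0.138:1521:orcl</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>oracle.jdbc.driver.OracleDriver</value>
  </property>
</configuration>
EOF
grep -A 1 'ConnectionURL' /tmp/hive-site-demo.xml | grep -o 'jdbc:oracle:thin:@[^<]*'
# -> jdbc:oracle:thin:@192.168.0.138:1521:orcl
```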
Configure log4j. Create a log4j folder under $HIVE_HOME to store the log files, then copy and rename the template:
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ cp hive-log4j.properties.template hive-log4j.properties
In hive-log4j.properties, change the directory where logs are stored:
hive.log.dir=/home/fulong/hive/apache-hive-0.13.1-bin/log4j
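The one-line edit to hive-log4j.properties can also be scripted with sed. A sketch run against a scratch copy (the template's default value shown below is from memory, so treat it as an assumption):

```shell
# Start from the template's default hive.log.dir line, then rewrite it to the
# new log4j folder, exactly as done by hand above.
printf 'hive.log.dir=${java.io.tmpdir}/${user.name}\n' > /tmp/hive-log4j.properties
sed -i 's|^hive.log.dir=.*|hive.log.dir=/home/fulong/hive/apache-hive-0.13.1-bin/log4j|' /tmp/hive-log4j.properties
grep '^hive.log.dir=' /tmp/hive-log4j.properties
# -> hive.log.dir=/home/fulong/hive/apache-hive-0.13.1-bin/log4j
```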
Copy the Oracle JDBC jar: place the appropriate Oracle JDBC driver jar for your database into $HIVE_HOME/lib.
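For Oracle 10g the driver jar is typically ojdbc14.jar (an assumption; check the jar that ships with your Oracle client). The copy step, sketched against a scratch directory so it runs anywhere:

```shell
# Stand-in for the real copy: with a real install, replace /tmp/hive-demo/lib
# with $HIVE_HOME/lib and /tmp/ojdbc14.jar with the downloaded driver jar.
HIVE_LIB=/tmp/hive-demo/lib
mkdir -p "$HIVE_LIB"
touch /tmp/ojdbc14.jar            # placeholder for the Oracle driver download
cp /tmp/ojdbc14.jar "$HIVE_LIB/"
ls "$HIVE_LIB" | grep ojdbc
# -> ojdbc14.jar
```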
Start Hive:
fulong@FBI003:~/hive/apache-hive-0.13.1-bin$ hive
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
14/08/20 17:14:05 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead

Logging initialized using configuration in file:/home/fulong/hive/apache-hive-0.13.1-bin/conf/hive-log4j.properties
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/fulong/hadoop/hadoop-2.2.0/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
hive>
Validation. To validate the deployment, we'll create a table to store the user search behavior logs downloadable from Sogou Labs.
Data: http://www.sogou.com/labs/dl/q.html
First, create the table:
hive> CREATE TABLE searchlog (time string, id string, sword string, rank int, clickrank int, url string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
This produces an error:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDODataStoreException: An exception was thrown while adding/validating class(es) : ORA-01754: a table may contain only one column of type LONG
Workaround: open hive-metastore-0.13.1.jar in ${HIVE_HOME}/lib with an unzip tool; inside it is a file named package.jdo. Open that file and locate the following content:

<field name="viewOriginalText" default-fetch-group="false">
  <column name="VIEW_ORIGINAL_TEXT" jdbc-type="LONGVARCHAR"/>
</field>
<field name="viewExpandedText" default-fetch-group="false">
  <column name="VIEW_EXPANDED_TEXT" jdbc-type="LONGVARCHAR"/>
</field>

The columns VIEW_ORIGINAL_TEXT and VIEW_EXPANDED_TEXT are both of type LONGVARCHAR, which maps to LONG in Oracle. Two such columns in one table contradicts Oracle's restriction that a table may contain only one column of type LONG, hence the error.
Following the recommendation on the Hive website, change the jdbc-type of those two columns to CLOB. After the change, the content reads:

<field name="viewOriginalText" default-fetch-group="false">
  <column name="VIEW_ORIGINAL_TEXT" jdbc-type="CLOB"/>
</field>
<field name="viewExpandedText" default-fetch-group="false">
  <column name="VIEW_EXPANDED_TEXT" jdbc-type="CLOB"/>
</field>
After saving the modified package.jdo back into the jar, restart Hive.
Run the CREATE TABLE command again; this time it succeeds:
hive> CREATE TABLE searchlog (time string, id string, sword string, rank int, clickrank int, url string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
OK
Time taken: 0.986 seconds
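The table expects six tab-separated fields per line. Before loading the real SogouQ file, the format can be spot-checked locally with awk; the two sample rows below are fabricated, only their shape matches the schema:

```shell
# Two made-up rows in the schema's shape: time, id, sword, rank, clickrank, url
printf '20111230000005\tu001\thive\t1\t1\thttp://example.com/a\n'   > /tmp/searchlog-sample.txt
printf '20111230000006\tu002\thadoop\t2\t1\thttp://example.com/b\n' >> /tmp/searchlog-sample.txt
# count lines that do NOT split into exactly 6 tab-separated fields
awk -F'\t' 'NF != 6 { bad++ } END { print bad + 0 }' /tmp/searchlog-sample.txt
# -> 0
```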
Load the local data into the table:
hive> LOAD DATA LOCAL INPATH '/home/fulong/Downloads/SogouQ.reduced' OVERWRITE INTO TABLE searchlog;
Copying data from file:/home/fulong/Downloads/SogouQ.reduced
Copying file: file:/home/fulong/Downloads/SogouQ.reduced
Loading data to table default.searchlog
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://fulonghadoop/user/hive/warehouse/searchlog
Table default.searchlog stats: [numFiles=1, numRows=0, totalSize=152006060, rawDataSize=0]
OK
Time taken: 25.705 seconds
View all tables:
hive> show tables;
OK
searchlog
Time taken: 0.139 seconds, Fetched: 1 row(s)
Count the rows:
hive> select count(*) from searchlog;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1407233914535_0001, Tracking URL = http://FBI003:8088/proxy/application_1407233914535_0001/
Kill Command = /home/fulong/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1407233914535_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-08-20 18:03:17,667 Stage-1 map = 0%, reduce = 0%
2014-08-20 18:04:05,426 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.46 sec
2014-08-20 18:04:27,317 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.74 sec
MapReduce Total cumulative CPU time: 4 seconds 740 msec
Ended Job = job_1407233914535_0001
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 4.74 sec   HDFS Read: 152010455 HDFS Write: 8 SUCCESS
Total MapReduce CPU time spent: 4 seconds 740 msec
OK
1724264
Time taken: 103.154 seconds, Fetched: 1 row(s)
"Gandalf" Hive 0.13.1 on Hadoop2.2.0 + oracle10g deployment Detailed explanation