"Gandalf" Hive 0.13.1 on Hadoop2.2.0 + oracle10g deployment Detailed explanation


Environment: Hadoop 2.2.0, Hive 0.13.1, Ubuntu 14.04 LTS, Java 1.7.0_60, Oracle 10g
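Before starting, it may be worth confirming your environment matches; a minimal sketch (the Hadoop path follows the install directory used throughout this post):

java -version                                          # expect "1.7.0_60"
/home/fulong/hadoop/hadoop-2.2.0/bin/hadoop version    # expect 2.2.0
lsb_release -d                                         # expect Ubuntu 14.04 LTS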
*** Reprinting is welcome; please credit the source: http://blog.csdn.net/u010967382/article/details/38709751 ***

Download the installation package from:
http://mirrors.cnnic.cn/apache/hive/stable/apache-hive-0.13.1-bin.tar.gz
Extract the installation package on the server to /home/fulong/hive/apache-hive-0.13.1-bin.
Edit your environment variables and add the following:

export HIVE_HOME=/home/fulong/hive/apache-hive-0.13.1-bin
export PATH=$HIVE_HOME/bin:$PATH

Go to the conf folder and copy the template configuration files under their working names:

fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ ls
hive-default.xml.template    hive-exec-log4j.properties.template
hive-env.sh.template         hive-log4j.properties.template
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ cp hive-env.sh.template hive-env.sh
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ cp hive-default.xml.template hive-site.xml
fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ ls
hive-default.xml.template    hive-env.sh.template                 hive-log4j.properties.template
hive-env.sh                  hive-exec-log4j.properties.template  hive-site.xml

In hive-env.sh, set the Hadoop root directory and Hive's conf and lib folders:

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/fulong/hadoop/hadoop-2.2.0
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/fulong/hive/apache-hive-0.13.1-bin/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/fulong/hive/apache-hive-0.13.1-bin/lib

In the configuration file hive-site.xml, change the following connection properties to the Oracle-specific values:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:oracle:thin:@192.168.0.138:1521:ORCL</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>oracle.jdbc.driver.OracleDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>HIVEFBI</value>
  <description>password to use against metastore database</description>
</property>
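The properties above assume a metastore user already exists in Oracle. If it does not, a DBA could create it roughly as follows; this is only a sketch: the EZConnect address reuses the host and SID from the ConnectionURL, the SYS password is a placeholder, and the CONNECT/RESOURCE grants are one common minimal choice, not something the original post specifies.

# run against the Oracle 10g server as a DBA (password is hypothetical)
sqlplus sys/your_sys_password@//192.168.0.138:1521/ORCL as sysdba <<'EOF'
CREATE USER hive IDENTIFIED BY HIVEFBI;
GRANT CONNECT, RESOURCE TO hive;
EOF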
Configure log4j. Create a log4j folder under $HIVE_HOME to store the log files, then copy and rename the template:

fulong@FBI003:~/hive/apache-hive-0.13.1-bin/conf$ cp hive-log4j.properties.template hive-log4j.properties
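The post does not show the command for creating that folder; presumably something like:

# create the directory that hive.log.dir (below) will point at
mkdir -p /home/fulong/hive/apache-hive-0.13.1-bin/log4j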
In hive-log4j.properties, change the folder where the logs are stored:

hive.log.dir=/home/fulong/hive/apache-hive-0.13.1-bin/log4j

Copy the Oracle JDBC jar package: copy the appropriate Oracle JDBC driver to $HIVE_HOME/lib.

Start Hive:

fulong@FBI003:~/hive/apache-hive-0.13.1-bin$ hive
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/08/20 17:14:05 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
14/08/20 17:14:05 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
Logging initialized using configuration in file:/home/fulong/hive/apache-hive-0.13.1-bin/conf/hive-log4j.properties
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/fulong/hadoop/hadoop-2.2.0/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
hive>

Validation: we plan to create a table to store the user search behavior logs downloadable from the Sogou lab.

Data: http://www.sogou.com/labs/dl/q.html
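After downloading, it may be worth a quick look at the raw file before loading it. A sketch only: the path and file name match the LOAD DATA command used later in this post.

# peek at the first lines and check the tab-separated field count,
# which should match the column count in the CREATE TABLE below
head -3 /home/fulong/downloads/SogouQ.reduced
awk -F'\t' '{print NF}' /home/fulong/downloads/SogouQ.reduced | sort -u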

First create the table:

hive> CREATE TABLE searchlog (time STRING, id STRING, sword STRING, rank INT, clickrank INT, url STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
This will cause an error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDODataStoreException: An exception was thrown while adding/validating class(es) : ORA-01754: a table may contain only one column of type LONG
Workaround: open hive-metastore-0.13.1.jar under ${HIVE_HOME}/lib with an unzip tool and find a file named package.jdo. Open that file and locate the following content:

<field name="viewOriginalText" default-fetch-group="false">
  <column name="VIEW_ORIGINAL_TEXT" jdbc-type="LONGVARCHAR"/>
</field>
<field name="viewExpandedText" default-fetch-group="false">
  <column name="VIEW_EXPANDED_TEXT" jdbc-type="LONGVARCHAR"/>
</field>

You can see that the columns VIEW_ORIGINAL_TEXT and VIEW_EXPANDED_TEXT are both typed LONGVARCHAR, which maps to LONG in Oracle. That collides with Oracle's restriction that a table may contain only one column of type LONG, hence the error.
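The extraction can also be done from the command line; a sketch, assuming the JDK's jar tool is on the PATH:

cd ${HIVE_HOME}/lib
# pull package.jdo out of the jar into the current directory for editing
jar xf hive-metastore-0.13.1.jar package.jdo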


Following the recommendation on the Hive website, change the jdbc-type of those two columns to CLOB. After the change, the content looks like this:

<field name="viewOriginalText" default-fetch-group="false">
  <column name="VIEW_ORIGINAL_TEXT" jdbc-type="CLOB"/>
</field>
<field name="viewExpandedText" default-fetch-group="false">
  <column name="VIEW_EXPANDED_TEXT" jdbc-type="CLOB"/>
</field>
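The edited package.jdo then has to go back into the jar. One way is the jar tool's update flag (a sketch, continuing from the extraction step above):

cd ${HIVE_HOME}/lib
# replace the package.jdo entry inside the metastore jar with the edited copy
jar uf hive-metastore-0.13.1.jar package.jdo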
After the change, restart Hive.


Run the CREATE TABLE command again. This time it succeeds:
hive> CREATE TABLE searchlog (time STRING, id STRING, sword STRING, rank INT, clickrank INT, url STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
OK
Time taken: 0.986 seconds
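Since the whole point of this setup is the Oracle-backed metastore, it is easy to double-check that the table's metadata actually landed in Oracle. A sketch only: TBLS is one of the standard metastore tables DataNucleus creates in the hive schema, and the connection reuses the credentials from hive-site.xml.

sqlplus hive/HIVEFBI@//192.168.0.138:1521/ORCL <<'EOF'
SELECT TBL_NAME FROM TBLS;
EOF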
Load the local data into the table:

hive> LOAD DATA LOCAL INPATH '/home/fulong/downloads/SogouQ.reduced' OVERWRITE INTO TABLE searchlog;
Copying data from file:/home/fulong/downloads/SogouQ.reduced
Copying file: file:/home/fulong/downloads/SogouQ.reduced
Loading data to table default.searchlog
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://fulonghadoop/user/hive/warehouse/searchlog
Table default.searchlog stats: [numFiles=1, numRows=0, totalSize=152006060, rawDataSize=0]
OK
Time taken: 25.705 seconds
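A quick way to confirm the file is now in the warehouse directory on HDFS (the path comes from the "Deleted hdfs://..." line in the log above):

/home/fulong/hadoop/hadoop-2.2.0/bin/hadoop fs -ls /user/hive/warehouse/searchlog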
View all tables:

hive> show tables;
OK
searchlog
Time taken: 0.139 seconds, Fetched: 1 row(s)
Count the rows:

hive> select count(*) from searchlog;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1407233914535_0001, Tracking URL = http://FBI003:8088/proxy/application_1407233914535_0001/
Kill Command = /home/fulong/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1407233914535_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-08-20 18:03:17,667 Stage-1 map = 0%, reduce = 0%
2014-08-20 18:04:05,426 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.46 sec
2014-08-20 18:04:27,317 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.74 sec
MapReduce Total cumulative CPU time: 4 seconds 740 msec
Ended Job = job_1407233914535_0001
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1  Cumulative CPU: 4.74 sec  HDFS Read: 152010455  HDFS Write: 8  SUCCESS
Total MapReduce CPU time spent: 4 seconds 740 msec
OK
1724264
Time taken: 103.154 seconds, Fetched: 1 row(s)
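With the table populated, ordinary analytical queries work the same way. As a hypothetical follow-up (not from the original post), this one-liner ranks the ten most frequent search keywords, using the column names from the CREATE TABLE above:

hive -e "SELECT sword, COUNT(*) AS cnt FROM searchlog GROUP BY sword ORDER BY cnt DESC LIMIT 10;"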





Copyright notice: this is an original blog article and may not be reproduced without the blogger's consent.

"Gandalf" Hive 0.13.1 on Hadoop2.2.0 + oracle10g deployment Detailed explanation

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.