Elasticsearch-hadoop Usage Records


Elasticsearch-hadoop is a project that deeply integrates Hadoop with Elasticsearch, and it is an official subproject maintained by the Elasticsearch team. By implementing input and output between Hadoop and ES, it lets you read and write data in an ES cluster from Hadoop, which takes full advantage of MapReduce parallel processing and brings real-time search to Hadoop data.
Project website: http://www.elasticsearch.org/overview/hadoop/

Operating environment:
CDH4, Elasticsearch 0.90.2

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Quick-Start/cdh4qs_topic_3_3.html

https://github.com/medcl/elasticsearch-rtf

Interoperability between Hive and ES:
# Installation: add the elasticsearch-hadoop jar path in Hive
# Download the elasticsearch-hadoop jar package: https://download.elasticsearch.org/hadoop/hadoop-latest.zip

# The jar path that Hive loads is a local path

[medcl@node-1 ~]$ ls
elasticsearch-hadoop-1.3.0.m1.jar
[medcl@node-1 ~]$ pwd
/home/medcl
[medcl@node-1 ~]$ hive -hiveconf hive.aux.jars.path=/home/medcl/elasticsearch-hadoop-1.3.0.m1.jar
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
Hive history file=/tmp/medcl/hive_job_log_94db3616-e210-4aab-b07b-6fb159e217ec_1758848920.txt
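
Because hive.aux.jars.path is given a local filesystem path here, it can help to confirm that the jar is readable locally before starting Hive. The following is only a minimal sketch for a POSIX shell; the helper name hive_aux_jar_cmd is hypothetical and not part of Hive. It prints the command line rather than running it.

```shell
# Hypothetical helper: print the hive command line that registers a local jar
# through hive.aux.jars.path. Returns non-zero if the jar is not readable locally.
hive_aux_jar_cmd() {
    jar="$1"
    if [ ! -r "$jar" ]; then
        echo "jar not readable on the local filesystem: $jar" >&2
        return 1
    fi
    # Resolve the directory to an absolute path so the setting does not
    # depend on the directory Hive is started from.
    dir=$(cd "$(dirname "$jar")" && pwd)
    printf 'hive -hiveconf hive.aux.jars.path=%s/%s\n' "$dir" "$(basename "$jar")"
}

# Usage (not executed here):
#   hive_aux_jar_cmd ./elasticsearch-hadoop-1.3.0.m1.jar
```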

# The Elasticsearch cluster is named "elasticsearch" and runs on the same machine as Hadoop
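
One way to double-check the cluster name is to ask the cluster health endpoint, which reports a cluster_name field. The sketch below assumes ES is listening on localhost:9200 (the default HTTP port, which the original text does not state); the helper name es_cluster_name is hypothetical.

```shell
# Hypothetical helper: read a _cluster/health JSON document on stdin
# and print the value of its cluster_name field.
es_cluster_name() {
    sed -n 's/.*"cluster_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (not executed here; assumes ES on localhost:9200):
#   curl -s 'http://localhost:9200/_cluster/health' | es_cluster_name
```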

# In Hive, create a table (user) and use elasticsearch-hadoop to map it to an index (/index/user) with two fields, id and name

CREATE EXTERNAL TABLE user (id INT, name STRING, site STRING)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'index/user/',
              'es.index.auto.create' = 'true');
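
When the same statement has to be retried with different es.resource values, as happens in the session that follows, it may be convenient to generate the DDL into a file and run it with hive -f. This is a rough sketch only; the helper name es_table_ddl and the file name create_user.hql are hypothetical.

```shell
# Hypothetical helper: print a CREATE EXTERNAL TABLE statement that maps a
# Hive table onto an ES resource through the storage handler used in this post.
# Arguments: TABLE, COLUMNS (Hive column list), RESOURCE (value for es.resource)
es_table_ddl() {
    table="$1"; columns="$2"; resource="$3"
    cat <<EOF
CREATE EXTERNAL TABLE ${table} (${columns})
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = '${resource}',
              'es.index.auto.create' = 'true');
EOF
}

# Usage (not executed here):
#   es_table_ddl user 'id INT, name STRING, site STRING' 'index/user/' > create_user.hql
#   hive -hiveconf hive.aux.jars.path=/home/medcl/elasticsearch-hadoop-1.3.0.m1.jar -f create_user.hql
```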
# Running as user medcl:
CREATE EXTERNAL TABLE user (id INT, name STRING)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = '/index/user/', 'es.index.auto.create' = 'true');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
hive> CREATE EXTERNAL TABLE user (id INT, name STRING)
    > STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    > TBLPROPERTIES('es.resource' = 'medcl/',
    > 'es.index.auto.create' = 'false');
FAILED: Error in metadata: MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=medcl, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

# Hmm, check the permissions
[medcl@node-1 ~]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxrwxrwt   - hdfs  supergroup          0 2013-12-16 22:19 /tmp
drwxr-xr-x   - hdfs  supergroup          0 2013-12-16 22:25 /user
drwxr-xr-x   - medcl supergroup          0 2013-12-17 00:30 /user/medcl
drwxr-xr-x   - medcl supergroup          0 2013-12-16 22:32 /user/medcl/input
-rw-r--r--   1 medcl supergroup    2801897 2013-12-16 22:32 /user/medcl/input/file1.txt
drwxr-xr-x   - medcl supergroup          0 2013-12-17 00:30 /user/medcl/lib
-rw-r--r--   1 medcl supergroup     160414 2013-12-17 00:30 /user/medcl/lib/elasticsearch-hadoop-1.3.0.m1.jar
drwxr-xr-x   - hdfs  supergroup          0 2013-12-16 22:20 /var
drwxr-xr-x   - hdfs  supergroup          0 2013-12-16 22:20 /var/lib

# So the /user directory belongs to hdfs. OK, switch to the hdfs user, and move the jar to a location the hdfs user can access; /tmp will do.
[root@node-1 medcl]# cp elasticsearch-hadoop-1.3.0.m1.jar /tmp/
[root@node-1 medcl]# ^C
[root@node-1 medcl]# sudo -u hdfs hive -hiveconf hive.aux.jars.path=/tmp/elasticsearch-hadoop-1.3.0.m1.jar
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
Hive history file=/tmp/hdfs/hive_job_log_bdad4d7a-f929-43d7-a56e-e026fdd7e3b4_1219802521.txt
hive> CREATE EXTERNAL TABLE user (id INT, name STRING)
    > STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    > TBLPROPERTIES('es.resource' = '/index/user/',
    > 'es.index.auto.create' = 'false');
2013-12-16 17:09:29.560 GMT Thread[main,5,main] java.io.FileNotFoundException: derby.log (Permission denied)
----------------------------------------------------------------
2013-12-16 17:09:29.877 GMT:
Booting Derby version The Apache Software Foundation - Apache Derby - 10.4.2.0 - (689064): instance a816c00e-0142-fc62-4b5c-000000cec758
on database directory /var/lib/hive/metastore/metastore_db in READ ONLY mode
Database Class Loader started - derby.database.classpath=''
FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

# OK, remove the lock files
[root@node-1 ~]# ls /var/lib/hive/metastore/metastore_db
dbex.lck  db.lck  log  seg0  service.properties  tmp
[root@node-1 ~]# rm /var/lib/hive/metastore/metastore_db/dbex.lck
rm: remove regular file `/var/lib/hive/metastore/metastore_db/dbex.lck'? y
[root@node-1 ~]# rm /var/lib/hive/metastore/metastore_db/db.lck
rm: remove regular file `/var/lib/hive/metastore/metastore_db/db.lck'? y

# Also, I had forgotten to close the other Hive instance; no wonder.
[root@node-1 tmp]# ps -aux | grep hive
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
root     10855  0.0  0.1 148024   2064 pts/0    S+   01:09   0:00 sudo -u hdfs hive -hiveconf hive.aux.jars.path=/tmp/elasticsearch-hadoop-1.3.0.m1.jar
hdfs     10856  1.8  5.7 858344 109892 pts/0    Sl+  01:09   0:06 /usr/lib/jvm/java-openjdk/bin/java -Xmx256m -Dhadoop.log.dir=/usr/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/lib/hive/lib/hive-cli-0.10.0-cdh4.5.0.jar org.apache.hadoop.hive.cli.CliDriver -hiveconf hive.aux.jars.path=/tmp/elasticsearch-hadoop-1.3.0.m1.jar

# Permission problem
[root@node-1 tmp]# ll /var/lib/hive/metastore/metastore_db/
total
drwxrwxr-x 2 medcl medcl 4096 Dec 00:56 log
drwxrwxr-x 2 medcl medcl 4096 Dec 00:56 seg0
-rw-rw-r-- 1 medcl medcl  860 Dec 00:56 service.properties
drwxrwxr-x 2 medcl medcl 4096 Dec 17 01:01 tmp
[root@node-1 tmp]# sudo -u hdfs hive -hiveconf hive.aux.jars.path=/tmp/elasticsearch-hadoop-1.3.0.m1.jar^C
[root@node-1 tmp]# chmod 777 /var/lib/hive/metastore/metastore_db/ -R
[root@node-1 tmp]# sudo -u hdfs hive -hiveconf hive.aux.jars.path=/tmp/elasticsearch-hadoop-1.3.0.m1.jar
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
Hive history
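
The AccessControlException in the session above follows from the owner/group/other permission model: /user is owned by hdfs with mode drwxr-xr-x, so only hdfs has write access. The sketch below applies that rule to one entry; it is a simplified illustration that ignores the HDFS superuser and ACLs, and the helper name hdfs_can_write is hypothetical.

```shell
# Hypothetical helper: decide whether USER may write to an entry, given the
# mode string, owner and group shown in a listing such as `hadoop fs -ls`.
# Usage: hdfs_can_write PERMS OWNER GROUP USER [GROUPS_OF_USER...]
hdfs_can_write() {
    perms="$1"; owner="$2"; group="$3"; user="$4"
    shift 4
    if [ "$user" = "$owner" ]; then
        # Owner class: characters 2-4 of the mode string.
        triplet=$(printf '%s' "$perms" | cut -c2-4)
    else
        # Default to the "other" class: characters 8-10.
        triplet=$(printf '%s' "$perms" | cut -c8-10)
        for g in "$@"; do
            if [ "$g" = "$group" ]; then
                # Group class: characters 5-7.
                triplet=$(printf '%s' "$perms" | cut -c5-7)
                break
            fi
        done
    fi
    case "$triplet" in
        ?w?) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage (values taken from the listing above; medcl's group list is an assumption):
#   hdfs_can_write drwxr-xr-x hdfs supergroup medcl medcl || echo "medcl cannot write to /user"
```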
