Build a secondary index with HBase Indexer (integrated with the latest HBase 1.2.6 and Solr 7.2.1)


I have recently been integrating HBase and needed to build a secondary index on it to make data queries easier. The Solr authoritative guide has a chapter on integrating HBase with Solr, and by following the book and the instructions on the web the configuration almost works. However, HBase Indexer has not been updated for more than a year, so integrating it with the latest HBase 1.2.6 and Solr 7.2.1 means adapting to quite a few interfaces that have changed.

1. Download the Hbaseindexer project:

Official website: http://ngdata.github.io/hbase-indexer/

GitHub: https://github.com/ngdata/hbase-indexer

Wiki (configuration notes): https://github.com/NGDATA/hbase-indexer/wiki

SEP tools CLI description (status monitoring): https://github.com/NGDATA/hbase-indexer/tree/master/hbase-sep/hbase-sep-tools

2. Modify the root pom.xml after the download is complete

Adjust the configured versions to the HBase and Solr versions we need. version.solr.mapreduce and version.httpclient.core are version properties you add yourself, and the corresponding places in the POM have to be changed to use them, because those two packages are not released in step with the main jar version:

  <properties>
    <version.solr>7.2.1</version.solr>
    <version.solr.mapreduce>6.5.1</version.solr.mapreduce>
    <version.guava>12.0.1</version.guava>
    <version.joda-time>1.6</version.joda-time>
    <version.slf4j>1.7.7</version.slf4j>
    <version.hbase>1.2.6</version.hbase>
    <version.hadoop>2.7.4</version.hadoop>
    <version.zookeeper>3.4.6</version.zookeeper>
    <version.jackson>1.9.13</version.jackson>
    <!-- <version.httpclient>4.3</version.httpclient> -->
    <version.httpclient>4.5.3</version.httpclient>
    <version.httpclient.core>4.4.6</version.httpclient.core>
    <!-- <version.kite>0.13.0</version.kite> -->
    <version.kite>0.15.0</version.kite>
    <version.jersey>1.17</version.jersey>
    <version.surefire.plugin>2.19.1</version.surefire.plugin>
    <version.failsafe.plugin>${version.surefire.plugin}</version.failsafe.plugin>
    <!-- Tells maven plugins what file encoding to use -->
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
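If you want to double-check that the overridden version properties actually take effect, one way (not part of the original write-up) is to inspect the resolved dependency tree for the affected group ids:

    # run from the hbase-indexer root after editing pom.xml;
    # prints which Solr and HttpClient versions Maven actually resolves
    mvn dependency:tree -Dincludes=org.apache.solr
    mvn dependency:tree -Dincludes=org.apache.httpcomponents
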
Add a plugin so the jars can be extracted easily:

	  <plugin>
	    <groupId>org.apache.maven.plugins</groupId>
	    <artifactId>maven-dependency-plugin</artifactId>
	    <configuration>
	      <outputDirectory>${project.build.directory}</outputDirectory>
	      <excludeTransitive>false</excludeTransitive>
	    </configuration>
	  </plugin>

3. Import the project into Eclipse (and deal with the compilation problems that come with the version upgrade)
Note: several plugins in pom.xml are not compatible with the m2eclipse plugin and need to be ignored (Ctrl+1 quick fix); otherwise Eclipse will report lifecycle problems for them:

        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-antrun-plugin</artifactId>
        <versionRange>1.6</versionRange>

        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <versionRange>2.8</versionRange>
If you want to run the project under Eclipse, you have to replicate what these plugins do by hand:

(1). Replace the version number in hbase-indexer-default.xml

(2). Generate the hbase-indexer-common\src\main\java\com\ngdata\hbaseindexer\package-info.java file:

@VersionAnnotation(version = "1.6-SNAPSHOT", revision = "Unknown",
                   user = "root", date = "Thu Feb  1 21:15:10 CST 2018",
                   url = "file:///E:/work/windtrend/hbase-indexer/hbase-indexer-common")
package com.ngdata.hbaseindexer;
When the server starts, it checks the version recorded in @VersionAnnotation.

(3). Fix the remaining compilation problems.
4. After the modifications are complete, upload the whole directory to Linux and run the Maven build from the shell (the pom.xml edits above were done under Windows):

(1). mvn clean install -e -DskipTests
(2). mvn dependency:copy-dependencies    # extract the jar packages
(3). mkdir lib
(4). find . -type f -iname "*.jar" -exec cp {} lib/ \;    # collect the jar packages
(5). rm -rf lib/*-sources.jar
(6). Copy bin, conf and lib from the directory into a new folder, e.g. hbase-indexer-1.6
(7). Copy hbase-indexer-1.6 to the deployment environment

Note: the ua-parser package the project depends on cannot be found in the Maven repository; it needs to be downloaded from https://github.com/ua-parser/uap-java, compiled, and uploaded to Nexus.
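A minimal sketch of building ua-parser locally; how you publish it (local repository vs. your own Nexus) is up to your environment:

    # fetch and build ua-parser locally; Git and Maven are assumed to be installed
    git clone https://github.com/ua-parser/uap-java.git
    cd uap-java
    git submodule update --init    # harmless if the project has no submodules
    mvn clean install -DskipTests
    # then either use the artifact from the local repository,
    # or run "mvn deploy" against your own Nexus after adding a <distributionManagement> section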
5. Configure hbase-indexer

(1). Configure environment variables (/etc/profile):

    export HBASE_INDEXER_HOME=xxx/hbase-indexer-1.6

(2). vi conf/hbase-indexer-env.sh

    export HBASE_INDEXER_HEAPSIZE=1024
    export HBASE_INDEXER_LOG_DIR=$HBASE_INDEXER_HOME/logs
    export HBASE_INDEXER_PID_DIR=$HBASE_INDEXER_HOME/pid
    export HBASE_INDEXER_CLI_ZK=master,slave1,slave2,slave3

Other settings can be adjusted as needed, for example remote debugging:

    export HBASE_INDEXER_OPTS="$HBASE_INDEXER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8075"
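With that option in place, a debugger can attach to port 8075, for example with jdb (the host name master is only an assumption for the node where the indexer runs):

    # attach the JDK's command-line debugger over a socket
    jdb -connect com.sun.jdi.SocketAttach:hostname=master,port=8075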

(3). Configure log4j.xml; the log4j.properties inside the project could not be loaded.

(4). Create a new order-indexer.xml:
<?xml version="1.0"?>
<indexer table="T_order" unique-key-field="id">
  <field name="number" value="f:number" type="string"/>
  <field name="source" value="f:source" type="int"/>
  <field name="tenantId" value="f:tenantid" type="int"/>
  <field name="userId" value="f:userid" type="int"/>
  <field name="storeId" value="f:storeid" type="int"/>
  <field name="storeName" value="f:storename" type="string"/>
  <field name="storeNumber" value="f:storenumber" type="string"/>
  <field name="userName" value="f:username" type="string"/>
</indexer>
(5). Create a new order-schema.xml (take a default schema from Solr and modify it):
  <field name="number" type="string" indexed="true" stored="true"/>
  <field name="source" type="int" indexed="true" stored="true"/>
  <field name="tenantId" type="int" indexed="true" stored="true"/>
  <field name="userId" type="int" indexed="true" stored="true"/>
  <field name="storeId" type="int" indexed="true" stored="true"/>
  <field name="storeName" type="string" indexed="true" stored="true"/>
  <field name="storeNumber" type="string" indexed="true" stored="true"/>
  <field name="userName" type="string" indexed="true" stored="true"/>

6. Configure Solr (Solr must run in cloud mode; judging from the code, classic mode could also work, but in classic mode the connection parameters are all of the form solr.shard.xxx)

(1). Configure SOLR Cloud

(2). Create the order collection:

bin/solr create -c order
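To confirm the collection exists, you can list all collections through the Collections API; a quick check, assuming Solr answers on localhost:8983:

    curl "http://localhost:8983/solr/admin/collections?action=LIST&wt=json"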
(3). Upload the schema:

./solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -z localhost:9983 -cmd putfile /configs/order/managed-schema $HBASE_INDEXER_HOME/index/order-schema.xml

Note: zkcli.sh is a script provided by Solr
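If you want to verify that the schema really landed in ZooKeeper, the same script can pull it back down (the /tmp path is only an example):

    ./solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -z localhost:9983 -cmd getfile /configs/order/managed-schema /tmp/managed-schema.check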


(4). Configure the soft commit in solrconfig.xml:

./solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -z localhost:9983 -cmd getfile /configs/order/solrconfig.xml $HBASE_INDEXER_HOME/index/solrconfig.xml

Modify the configuration and upload it again:

    <autoSoftCommit>
      <maxTime>5000</maxTime>
      <maxDocs>100</maxDocs>
    </autoSoftCommit>
(5). Reload the collection
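The reload can be triggered through the Collections API; a minimal sketch, again assuming Solr on localhost:8983:

    curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=order"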


7. Configure HBase:

(1). Copy the SEP jar packages into the HBase lib directory:

cp ./lib/hbase-sep* $HBASE_HOME/lib/
(2). Turn on replication; add the following configuration to hbase-site.xml:
	<property>
		<name>hbase.replication</name>
		<value>true</value>
	</property>
	<property>
		<name>replication.source.ratio</name>
		<value>1.0</value>
	</property>
	<property>
		<name>replication.source.nb.capacity</name>
		<value>1000</value>
	</property>
	<property>
		<name>replication.replicationsource.implementation</name>
		<value>com.ngdata.sep.impl.SepReplicationSource</value>
	</property>
(3). Copy the updated configuration to every node and restart HBase (a sketch follows)
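A rough sketch of distributing the file and restarting, assuming the slave host names used above and the scripts bundled with HBase:

    # push the updated config to the other nodes
    for node in slave1 slave2 slave3; do
        scp $HBASE_HOME/conf/hbase-site.xml $node:$HBASE_HOME/conf/
    done
    # restart the cluster
    $HBASE_HOME/bin/stop-hbase.sh
    $HBASE_HOME/bin/start-hbase.sh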

8. Run:

(1). Run hbase-indexer: ./bin/hbase-indexer server
(2). Add an indexer:

./bin/hbase-indexer add-indexer \
                      --name orderindexer \
                      --indexer-conf index/order-indexer.xml \
                      -cp solr.zk=master:9983 \
                      -cp solr.collection=order
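
To check that the indexer was registered and see its state, the bundled CLI can list it:

    # lists the registered indexers and their lifecycle state
    ./bin/hbase-indexer list-indexers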

(3). Create a table in HBase:

$ hbase shell
hbase> create 'T_order', {NAME => 'f', REPLICATION_SCOPE => '1'}
hbase> put 'T_order', 'row1', 'f:userid', 10000
hbase> put 'T_order', 'row2', 'f:userid', 10000

(4). Query Solr to see the results.
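A quick way to see the indexed rows from the command line (a sketch, assuming Solr on localhost:8983 and the order collection created above):

    curl "http://localhost:8983/solr/order/select?q=userId:10000&wt=json"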

Note:

The server cannot be run in a Windows environment; under Windows it has no way to receive SepEvents. Debugging therefore has to be done on Linux, or via remote debugging, which can be enabled with the parameter in hbase-indexer-env.sh;



Several core processing classes:

Event Entry:

com.ngdata.sep.impl.SepEventExecutor.scheduleEventBatch(int partition, List<SepEvent> events)

Data Processing entry:

com.ngdata.hbaseindexer.indexer.Indexer$RowBasedIndexer

Indexer creation: com.ngdata.hbaseindexer.mr.HBaseIndexerMapper.createIndexer(String indexName, Context context, IndexerConf indexerConf, String tableName, ResultToSolrMapper mapper, Map<String, String> indexConnectionParams) throws IOException, SharderException



Reference:

Official website: http://ngdata.github.io/hbase-indexer/

GitHub: https://github.com/ngdata/hbase-indexer

Wiki (configuration notes): https://github.com/NGDATA/hbase-indexer/wiki

SEP tools CLI description (status monitoring): https://github.com/NGDATA/hbase-indexer/tree/master/hbase-sep/hbase-sep-tools


http://www.niuchaoqun.com/14543825447680.html
http://blog.csdn.net/d6619309/article/details/51500368