I have recently been integrating HBase and needed to build a secondary index for it to make data queries easier. The Solr authoritative guide has a chapter on HBase and Solr integration, and following the book together with the instructions on the web gets you very close to a working configuration. However, HBase Indexer has not been updated for more than a year, and when integrating it with the latest HBase 1.2.6 and Solr 7.2.1 many of the related interfaces turn out to have changed.
1. Download the HBase Indexer project:
Official website: http://ngdata.github.io/hbase-indexer/
GitHub: https://github.com/ngdata/hbase-indexer
Wiki (configuration notes): https://github.com/NGDATA/hbase-indexer/wiki
SEP tools CLI description (status monitoring): https://github.com/NGDATA/hbase-indexer/tree/master/hbase-sep/hbase-sep-tools
2. After the download is complete, modify pom.xml in the root directory
Adjust the configured versions to the HBase and Solr versions we need. version.solr.mapreduce and version.httpclient.core are properties that I added myself; the places in the POM that use them have to be changed as well, because these two packages are not released in step with the version of the main jar:
<properties>
  <version.solr>7.2.1</version.solr>
  <version.solr.mapreduce>6.5.1</version.solr.mapreduce>
  <version.guava>12.0.1</version.guava>
  <version.joda-time>1.6</version.joda-time>
  <version.slf4j>1.7.7</version.slf4j>
  <version.hbase>1.2.6</version.hbase>
  <version.hadoop>2.7.4</version.hadoop>
  <version.zookeeper>3.4.6</version.zookeeper>
  <version.jackson>1.9.13</version.jackson>
  <!--<version.httpclient>4.3</version.httpclient>-->
  <version.httpclient>4.5.3</version.httpclient>
  <version.httpclient.core>4.4.6</version.httpclient.core>
  <!--<version.kite>0.13.0</version.kite>-->
  <version.kite>0.15.0</version.kite>
  <version.jersey>1.17</version.jersey>
  <version.surefire.plugin>2.19.1</version.surefire.plugin>
  <version.failsafe.plugin>${version.surefire.plugin}</version.failsafe.plugin>
  <!-- Tells maven plugins what file encoding to use -->
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
Add a plugin to make it easy to extract the jars:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <configuration>
    <outputDirectory>${project.build.directory}</outputDirectory>
    <excludeTransitive>false</excludeTransitive>
  </configuration>
</plugin>
3. Import the project into Eclipse (and fix the compilation problems that appear after the version upgrade)
Note: several plugins in pom.xml are not compatible with the m2eclipse plugin and have to be marked as ignored (Ctrl+I); otherwise Eclipse reports lifecycle problems:
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<versionRange>1.6</versionRange>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<versionRange>2.8</versionRange>
If you want to run the project under Eclipse, you have to do by hand what these plugins would otherwise do:
(1). Replace the version number in hbase-indexer-default.xml
(2). Generate the file hbase-indexer-common\src\main\java\com\ngdata\hbaseindexer\package-info.java:
@VersionAnnotation(version="1.6-SNAPSHOT", revision="Unknown",
user="root", date="Thu Feb 1 21:15:10 CST 2018", url="file:///E:/work/windtrend/hbase-indexer/hbase-indexer-common")
package com.ngdata.hbaseindexer;
When the server starts, it checks the version declared in @VersionAnnotation.
(3). Fix the resulting compilation problems
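Writing package-info.java by hand is easy to get wrong, so it can also be scripted. The following is only a minimal sketch in Python and is not part of the hbase-indexer build. The helper names render_package_info and write_package_info are hypothetical, the timestamp format is an assumption, and the annotation fields simply mirror the example in step (2).

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def render_package_info(version: str, user: str, url: str,
                        revision: str = "Unknown",
                        date: Optional[str] = None) -> str:
    """Build the text of package-info.java (hypothetical helper)."""
    if date is None:
        # Assumption: any human-readable timestamp is acceptable here.
        date = datetime.now().strftime("%a %b %d %H:%M:%S %Y")
    return (
        f'@VersionAnnotation(version="{version}", revision="{revision}",\n'
        f'user="{user}", date="{date}", url="{url}")\n'
        "package com.ngdata.hbaseindexer;\n"
    )


def write_package_info(project_root: Path, text: str) -> Path:
    """Write the file to the location named in step (2)."""
    target = (project_root / "hbase-indexer-common" / "src" / "main" / "java"
              / "com" / "ngdata" / "hbaseindexer" / "package-info.java")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


# The URL below is a placeholder, not a real checkout location.
print(render_package_info("1.6-SNAPSHOT", "root",
                          "file:///path/to/hbase-indexer-common"))
```

Calling write_package_info(Path("."), text) from the project root would place the file where Eclipse expects it.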
4. After the modifications are complete, upload the entire directory to Linux and build it with mvn (the Maven build runs shell scripts, so building under Windows would require adapting the pom.xml files)
(1). mvn clean install -e -DskipTests
(2). mvn dependency:copy-dependencies # extract the dependency jars
(3). mkdir lib
(4). find ./ -type f -iname "*.jar" -exec cp {} lib/ \; # collect the jars
(5). rm -rf lib/*-sources.jar
(6). Copy the bin, conf and lib directories into a new folder, for example hbase-indexer-1.6
(7). Copy hbase-indexer-1.6 to the deployment environment
Note: the ua-parser package that the project depends on cannot be found in the Maven repository. Download ua-parser, compile it and upload it to your Nexus: https://github.com/ua-parser/uap-java
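Steps (3) to (5) can be combined into one small script. The sketch below is an illustration in Python under my own assumptions, not something the project ships, and collect_jars is a hypothetical name. It copies every jar under the project directory into lib and skips the *-sources.jar files instead of deleting them afterwards.

```python
import shutil
from pathlib import Path


def collect_jars(project_root: Path, lib_dir: Path) -> list:
    """Copy all jars below project_root into lib_dir, skipping source jars."""
    lib_dir.mkdir(parents=True, exist_ok=True)
    # Build the candidate list first so files copied into lib_dir
    # are not picked up again while walking the tree.
    candidates = [
        p for p in project_root.rglob("*")
        if p.is_file()
        and p.suffix.lower() == ".jar"          # same idea as -iname "*.jar"
        and lib_dir not in p.parents            # ignore jars already in lib
        and not p.name.endswith("-sources.jar")  # replaces the rm step
    ]
    for jar in candidates:
        # Jars with the same file name overwrite each other, as with cp.
        shutil.copy2(jar, lib_dir / jar.name)
    return sorted({jar.name for jar in candidates})


if __name__ == "__main__":
    root = Path(".")
    print(collect_jars(root, root / "lib"))
```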
5. Configure hbase-indexer
(1). Configure the environment variable (/etc/profile):
export HBASE_INDEXER_HOME=xxx/hbase-indexer-1.6
(2). vi conf/hbase-indexer-env.sh
export HBASE_INDEXER_HEAPSIZE=1024
export HBASE_INDEXER_LOG_DIR=$HBASE_INDEXER_HOME/logs
export HBASE_INDEXER_PID_DIR=$HBASE_INDEXER_HOME/pid
export HBASE_INDEXER_CLI_ZK=master,slave1,slave2,slave3
Other settings can be changed as needed, for example to enable remote debugging:
export HBASE_INDEXER_OPTS="$HBASE_INDEXER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8075"
(3). Configure log4j.xml (the log4j.properties inside the project could not be loaded)
(4). Create order-indexer.xml
<?xml version="1.0"?>
<indexer table="t_order" unique-key-field="id">
  <field name="number" value="f:number" type="string"/>
  <field name="source" value="f:source" type="int"/>
  <field name="tenantId" value="f:tenantid" type="int"/>
  <field name="userId" value="f:userid" type="int"/>
  <field name="storeId" value="f:storeid" type="int"/>
  <field name="storeName" value="f:storename" type="string"/>
  <field name="storeNumber" value="f:storenumber" type="string"/>
  <field name="userName" value="f:username" type="string"/>
</indexer>
(5). Create order-schema.xml (take one of the default schemas shipped with Solr and modify it)
<field name="number" type="string" indexed="true" stored="true"/>
<field name="source" type="int" indexed="true" stored="true"/>
<field name="tenantId" type="int" indexed="true" stored="true"/>
<field name="userId" type="int" indexed="true" stored="true"/>
<field name="storeId" type="int" indexed="true" stored="true"/>
<field name="storeName" type="string" indexed="true" stored="true"/>
<field name="storeNumber" type="string" indexed="true" stored="true"/>
<field name="userName" type="string" indexed="true" stored="true"/>
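The field names in order-indexer.xml have to match the field names in order-schema.xml exactly, including case, and a mismatch is easy to miss. The following is a rough sketch of a consistency check in Python. It is my own illustration: the function names are hypothetical and the two XML strings are shortened samples rather than the full files.

```python
import xml.etree.ElementTree as ET

# Shortened sample documents; real files would be read from disk.
INDEXER_XML = """
<indexer table="t_order" unique-key-field="id">
  <field name="userId" value="f:userid" type="int"/>
  <field name="storeName" value="f:storename" type="string"/>
</indexer>
"""

SCHEMA_XML = """
<schema name="order" version="1.6">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="userId" type="int" indexed="true" stored="true"/>
</schema>
"""


def indexer_field_names(indexer_xml: str) -> list:
    """Names of the Solr fields the indexer writes to, in document order."""
    root = ET.fromstring(indexer_xml.strip())
    return [field.get("name") for field in root.findall("field")]


def schema_field_names(schema_xml: str) -> set:
    """Names of all <field> elements declared in the schema."""
    root = ET.fromstring(schema_xml.strip())
    return {field.get("name") for field in root.iter("field")}


def missing_in_schema(indexer_xml: str, schema_xml: str) -> list:
    """Indexer fields that have no same-named field in the schema."""
    known = schema_field_names(schema_xml)
    return [n for n in indexer_field_names(indexer_xml) if n not in known]


# storeName is deliberately absent from the sample schema.
print(missing_in_schema(INDEXER_XML, SCHEMA_XML))
```

The check is deliberately case-sensitive, because Solr treats userId and userid as different fields.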
6. Configure Solr (Solr has to run in cloud mode. Judging from the code, classic mode is also possible, but the parameters used in classic mode are all of the form solr.shard.xxx)
(1). Configure SolrCloud
(2). Create the collection order
bin/solr create -c order
(3). Upload the schema:
./solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -z localhost:9983 -cmd putfile /configs/order/managed-schema $HBASE_INDEXER_HOME/index/order-schema.xml
Note: zkcli.sh is a script provided by Solr
(4). Configure the soft commit in solrconfig.xml. First download the file:
./solr-7.2.1/server/scripts/cloud-scripts/zkcli.sh -z localhost:9983 -cmd getfile /configs/order/solrconfig.xml $HBASE_INDEXER_HOME/index/solrconfig.xml
Then modify the following setting and upload the file again:
<autoSoftCommit>
  <maxTime>5000</maxTime>
  <maxDocs>100</maxDocs>
</autoSoftCommit>
(5). Reload the collection
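The reload in step (5) can be triggered through the Solr Collections API with action=RELOAD. The snippet below is a minimal sketch in Python. The base URL http://localhost:8983/solr is an assumption about where Solr listens, and reload_url and reload_collection are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def reload_url(base_url: str, collection: str) -> str:
    """Build the Collections API URL that reloads one collection."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


def reload_collection(base_url: str, collection: str,
                      timeout: float = 30.0) -> int:
    """Send the reload request and return the HTTP status code."""
    with urlopen(reload_url(base_url, collection), timeout=timeout) as resp:
        return resp.status


# Assumed base URL; adjust host and port to the actual Solr node.
print(reload_url("http://localhost:8983/solr", "order"))
```

Only the URL is printed here; reload_collection has to be called explicitly against a running Solr node.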
7. Configure HBase:
(1). Copy the SEP jars into the HBase lib directory
cp ./lib/hbase-sep* $HBASE_HOME/lib/
(2). Enable replication by adding the following configuration to hbase-site.xml:
<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>
<property>
  <name>replication.source.ratio</name>
  <value>1.0</value>
</property>
<property>
  <name>replication.source.nb.capacity</name>
  <value>1000</value>
</property>
<property>
  <name>replication.replicationsource.implementation</name>
  <value>com.ngdata.sep.impl.SepReplicationSource</value>
</property>
(3). Copy to each node and restart HBase
8. Run:
(1). Start the hbase-indexer server
./bin/hbase-indexer server
(2). Add the indexer
./bin/hbase-indexer add-indexer \
--name orderindexer \
--indexer-conf index/order-indexer.xml \
--cp solr.zk=master:9983 \
--cp solr.collection=order
(3). Create the table in HBase and insert some rows
$ hbase shell
hbase> create 't_order', {NAME => 'f', REPLICATION_SCOPE => '1'}
hbase> put 't_order', 'row1', 'f:userid', 10000
hbase> put 't_order', 'row2', 'f:userid', 10000
(4). Run a query in Solr to see the results
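Besides the Solr admin page, the result can be checked from a script through the collection's /select handler. The following is a small sketch in Python. The base URL, the query userId:10000 and the sample JSON are assumptions made for illustration (the sample is not captured output), and select_url and num_found are hypothetical helper names.

```python
import json
from urllib.parse import urlencode


def select_url(base_url: str, collection: str, query: str,
               rows: int = 10) -> str:
    """Build a /select URL for the given collection and query."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/select?{params}"


def num_found(response_text: str) -> int:
    """Read the hit count from a Solr JSON response body."""
    return json.loads(response_text)["response"]["numFound"]


# Illustrative response in the shape Solr returns for wt=json.
SAMPLE_RESPONSE = (
    '{"responseHeader": {"status": 0},'
    ' "response": {"numFound": 2, "start": 0, "docs": []}}'
)

print(select_url("http://localhost:8983/solr", "order", "userId:10000"))
print(num_found(SAMPLE_RESPONSE))
```

Because of the soft commit interval configured earlier, newly written rows may take a few seconds to become visible to such a query.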
Note:
The server cannot be run under Windows, because in a Windows environment it has no way of receiving the SepEvent. Debugging has to be done on Linux or through remote debugging, which can be enabled with the parameters in hbase-indexer-env.sh.
Several core processing classes:
Event entry point:
com.ngdata.sep.impl.SepEventExecutor.scheduleEventBatch(int partition, List<SepEvent> events)
Data processing entry point:
com.ngdata.hbaseindexer.indexer.Indexer$RowBasedIndexer
Indexer com.ngdata.hbaseindexer.mr.HBaseIndexerMapper.createIndexer(String indexName, Context context, IndexerConf indexerConf, String tableName, ResultToSolrMapper mapper, Map<String, String> indexConnectionParams) throws IOException, SharderException
Reference:
http://www.niuchaoqun.com/14543825447680.html
http://blog.csdn.net/d6619309/article/details/51500368