"Go" SOLR import data from a database (DIH)

Source: Internet
Author: User
Tags solr

This article is reposted from: http://blog.csdn.net/xiaoyu714543065/article/details/11849115

I. Data import (DataImportHandler, DIH)

DIH is a toolkit provided by Solr for importing data from databases, XML/HTTP sources, and rich-text documents into the Solr index. Only database import is covered here.

A. Prepare the following JAR packages

apache-solr-dataimporthandler-4.0.0.jar

apache-solr-dataimporthandler-extras-4.0.0.jar

apache-solr-dataimportscheduler-1.1.jar (used for incremental import)

The JDBC driver JAR for your database (Oracle is used here: oracle10g.jar)

Copy these JARs into Tomcat6.0.36/webapps/solr/WEB-INF/lib

B. Configure solrconfig.xml

Add the following configuration to solrconfig.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">xx-data-config.xml</str>
  </lst>
</requestHandler>

C. Configure the data source

Create the xx-data-config.xml file named in the configuration above, in the same directory as solrconfig.xml, with the content shown below.

The query attribute is used for full imports. The deltaImportQuery and deltaQuery attributes are used for incremental imports.

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@192.168.0.129:1521:ORCL"
              user="username"
              password="password"/>
  <document>
    <entity name="business_info" pk="ID"
            query="select t.id id,business_name,bussiness_type from business t"
            deltaImportQuery="select t.id id,business_name,bussiness_type from business t where id='${dataimporter.delta.id}'"
            deltaQuery="select t.id id,business_name,bussiness_type from business t where to_char(updateTime,'yyyy-mm-dd hh24:mi:ss') > '${dataimporter.last_index_time}'">
      <field column="id" name="id"/>
    </entity>
  </document>
</dataConfig>
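To make the roles of deltaQuery and deltaImportQuery more concrete, here is a minimal Python sketch of the two-step logic an incremental import follows: first find the keys of rows changed since the last index time, then reload each of those rows. It is only an illustration and not DIH's actual implementation. The in-memory SQLite table, the sample rows, and the function names make_db and delta_import are all assumptions made for this sketch; the column names follow the configuration above.

```python
import sqlite3

# Hypothetical stand-in for the Oracle `business` table; SQLite is used only
# so the sketch runs anywhere. The sample rows are invented for illustration.
def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE business (id INTEGER PRIMARY KEY, business_name TEXT, "
        "bussiness_type TEXT, updatetime TEXT)"
    )
    conn.executemany(
        "INSERT INTO business VALUES (?, ?, ?, ?)",
        [
            (1, "Acme", "retail", "2013-09-01 10:00:00"),
            (2, "Globex", "energy", "2013-09-20 08:30:00"),
            (3, "Initech", "software", "2013-09-21 12:00:00"),
        ],
    )
    return conn

def delta_import(conn, last_index_time):
    """Mimic the two delta steps: find changed keys, then reload each row."""
    # Step 1 (role of deltaQuery): keys of rows changed since the last import.
    # Timestamps are compared as 'yyyy-mm-dd hh:mm:ss' strings, as in the config.
    changed_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM business WHERE updatetime > ? ORDER BY id",
            (last_index_time,),
        )
    ]
    # Step 2 (role of deltaImportQuery): fetch the full row for each changed key.
    docs = []
    for doc_id in changed_ids:
        row = conn.execute(
            "SELECT id, business_name, bussiness_type FROM business WHERE id = ?",
            (doc_id,),
        ).fetchone()
        docs.append(
            {"id": row[0], "business_name": row[1], "bussiness_type": row[2]}
        )
    return docs

if __name__ == "__main__":
    for doc in delta_import(make_db(), "2013-09-15 00:00:00"):
        print(doc)
```

In this sketch only the rows updated after the given timestamp are reloaded, which is the reason an incremental import is cheaper than a full one.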

With the DIH configuration complete, enter the following commands in the browser:

Full import:

http://localhost:8085/solr/core0/dataimport?command=full-import&commit=true

Incremental import:

http://localhost:8085/solr/core0/dataimport?command=delta-import&clean=false&commit=true

View import status:

http://localhost:8085/solr/core0/dataimport?command=status
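If these commands are issued from a script rather than typed into a browser, building the URLs programmatically helps avoid typos in the parameters. The following is a minimal Python sketch; the helper name dih_url is hypothetical, and the host, port, and core name are simply the ones used in the commands above.

```python
from urllib.parse import urlencode

def dih_url(base, core, command, **params):
    """Build a DataImportHandler request URL for the given core and command."""
    query = {"command": command}
    # Solr expects lowercase "true"/"false", so convert Python booleans.
    for key, value in params.items():
        query[key] = str(value).lower() if isinstance(value, bool) else str(value)
    return f"{base.rstrip('/')}/{core}/dataimport?{urlencode(query)}"

if __name__ == "__main__":
    base = "http://localhost:8085/solr"  # assumed Solr base URL from the article
    print(dih_url(base, "core0", "full-import", commit=True))
    print(dih_url(base, "core0", "delta-import", clean=False, commit=True))
    print(dih_url(base, "core0", "status"))
```

The sketch only builds the URL strings; sending the request (for example with urllib.request.urlopen) is left to the caller.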

D. Processing CLOB fields

<entity name="meta" query="select id,filename,content,bytes from documents" transformer="ClobTransformer">
  <field column="id" name="id"/>
  <field column="CONTENT" name="content" clob="true"/>
</entity>

The column attribute of a CLOB field must be written in uppercase!

E. DIH out-of-memory errors

DIH imports can easily fail with an out-of-memory error. This can be resolved by increasing the JVM heap size, as follows:

In Tomcat\bin\startup.bat, add: set JAVA_OPTS=-Xms128m -Xmx1024m

The maximum heap is set to 1024 MB here; increase it as your situation requires.

F. Automatic full import and automatic incremental import

You can implement this feature yourself, or use the apache-solr-dataimportscheduler-1.0.jar package. The configuration is as follows:

Modify WEB-INF/web.xml in solr.war and add the following before the servlet node:

<listener>
  <listener-class>
    org.apache.solr.handler.dataimport.scheduler.ApplicationListener
  </listener-class>
</listener>

Extract dataimport.properties from apache-solr-dataimportscheduler-.jar, modify it to match your environment, and place it in the solr.home/conf directory (not solr.home/core/conf).

For configuration details, see: http://code.google.com/p/solr-dataimport-scheduler/
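For the option of implementing the scheduling yourself, one possible approach is a small loop that triggers the incremental import at a fixed interval. The Python sketch below shows only that idea and is unrelated to how the scheduler JAR works internally. The names run_periodically and trigger_delta_import are hypothetical, and the trigger is a placeholder that does not contact Solr.

```python
import time

def run_periodically(trigger, interval_seconds, max_runs, sleep=time.sleep):
    """Call trigger() max_runs times, waiting interval_seconds between calls."""
    results = []
    for run in range(max_runs):
        if run > 0:
            sleep(interval_seconds)  # wait before every run except the first
        try:
            results.append(trigger())
        except Exception as exc:  # keep the schedule alive if one import fails
            results.append(exc)
    return results

def trigger_delta_import():
    """Hypothetical trigger: a real one would request the delta-import URL."""
    # For example (assumed host, port, and core, as used in the article):
    # urllib.request.urlopen(
    #     "http://localhost:8085/solr/core0/dataimport"
    #     "?command=delta-import&clean=false&commit=true")
    return "delta-import requested"

if __name__ == "__main__":
    # Three runs, one second apart, just to show the loop working.
    print(run_periodically(trigger_delta_import, interval_seconds=1, max_runs=3))
```

Passing the sleep function as a parameter keeps the loop easy to test; in a real deployment an operating-system scheduler such as cron could serve the same purpose.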

"Go" SOLR import data from a database (DIH)
