Full-Text Indexing: Full Index Updates on the Solr Server

Last Update:2016-05-12 Source: Internet

Author: User

Tags solr


After Solr indexing is set up, the index has to be kept up to date with changes in the database. There are two ways to do this: full updates and incremental updates. As the names imply, a full update deletes all index data on the Solr server and then re-imports everything, while an incremental update indexes only the data that has changed. This article describes the full index update.
I Configuring the data source
1.1 Database

We use a single table, blog, as the test data source. It has three fields: id, title, and content. To keep testing simple, the primary key uses the varchar data type.

1.2 Configuring data-config.xml

The data source configuration content is as follows:

<pre name="code" class="HTML">
<dataConfig>
    <dataSource name="jfinal_demo" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://192.168.21.20:3306/jfinal_demo" user="root" password="123456" batchSize="-1"/>
    <document name="testDoc">
        <entity name="blog" dataSource="jfinal_demo" pk="id" query="select * from blog">
            <field column="id" name="id"/>
            <field column="title" name="title"/>
            <field column="content" name="content"/>
        </entity>
    </document>
</dataConfig>
</pre>

1.3 Configuring schema.xml

The index fields are configured as follows:

<pre name="code" class="HTML">
<field name="id" type="text_general" indexed="true" stored="true"/>
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="content" type="text_general" indexed="true" stored="true"/>
</pre>

II Updating the index with the Solr Admin client

2.1 Update operation

In the Solr Admin client, open the Dataimport page, select the full-import command, and execute it.

2.2 Testing

Description: The Solr Admin client approach is simple, fast, and intuitive, which makes it well suited to testing with data.

III Updating the index with HTTP requests

3.1 Principle

Operations issued from the Solr Admin client are ultimately sent to the server as HTTP requests, so we can imitate the client and update the index directly with an HTTP GET request.

3.2 Implementation

This article uses the HttpURLConnection object to complete the HTTP request with the following code:

<pre name="code" class="java">
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// The class name is chosen for illustration only.
public class SolrIndexUpdater {

    /**
     * Accesses the URL to trigger a full index import.
     *
     * @return true if the server answered with HTTP 200
     */
    public static boolean runHttpGet() {
        boolean flag = false;
        // Path of the request
        String strUrl = "http://192.168.22.216:8983/solr/dataimport?command=full-import";
        try {
            // Create a URL object
            URL url = new URL(strUrl);
            // Open an HttpURLConnection
            HttpURLConnection urlConn = (HttpURLConnection) url.openConnection();
            // Do not use the cache
            urlConn.setUseCaches(false);
            // Send a GET request; it has no body, so no output stream is opened
            urlConn.setRequestMethod("GET");
            urlConn.setInstanceFollowRedirects(true);
            // Perform the connection
            urlConn.connect();
            if (urlConn.getResponseCode() == 200) {
                flag = true;
                // Read the response as UTF-8 text
                StringBuilder strResult = new StringBuilder();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(urlConn.getInputStream(), "UTF-8"))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        strResult.append(line).append('\n');
                    }
                }
                // System.out.println(strResult);
            }
            urlConn.disconnect();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return flag;
    }
}
</pre>
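The request URL in the code above is hard-coded. As a minimal sketch, it could instead be assembled from its parts, so that one helper covers the command together with the clean and commit options that appear in the scheduler configuration later in this article. The class name `DataImportUrl` and the method `build` are hypothetical names invented for this illustration; only the `command`, `clean`, and `commit` parameters come from the article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; it is not part of Solr.
final class DataImportUrl {

    // Builds a URL of the form <baseUrl>/dataimport?command=...&clean=...&commit=...
    // The values used here are plain ASCII, so no URL encoding is applied.
    static String build(String baseUrl, String command, boolean clean, boolean commit) {
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<>();
        params.put("command", command);
        params.put("clean", String.valueOf(clean));
        params.put("commit", String.valueOf(commit));

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return baseUrl + "/dataimport?" + query;
    }

    public static void main(String[] args) {
        // Server address taken from the example above; adjust it to your own server.
        System.out.println(build("http://192.168.22.216:8983/solr", "full-import", true, true));
    }
}
```

Keeping the command as a parameter means the same helper could also build the delta-import URL used for incremental updates.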

Of course, in actual use the index update would run on a schedule, which can be done with a task scheduler such as Quartz. The code is not demonstrated here; interested readers can implement it according to their own situation.
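The article suggests Quartz and leaves the scheduling code out. As a rough sketch of the same idea without an extra dependency, the JDK's `ScheduledExecutorService` can run a task at a fixed interval. Note that this swaps in a different technique and is not Quartz. The class name `FullImportScheduler` and the placeholder task are assumptions made for illustration; in practice the task would call the HTTP request method shown above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical scheduler wrapper for illustration; it uses the JDK instead of Quartz.
final class FullImportScheduler {

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    // Runs the task immediately and then again after every period.
    void start(Runnable task, long period, TimeUnit unit) {
        executor.scheduleAtFixedRate(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log it and continue.
                e.printStackTrace();
            }
        }, 0, period, unit);
    }

    // Stops the scheduler; no further runs are started.
    void stop() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        FullImportScheduler scheduler = new FullImportScheduler();
        // Placeholder task; replace it with the call that sends the full-import request.
        scheduler.start(() -> System.out.println("full-import triggered"), 1, TimeUnit.SECONDS);
        Thread.sleep(2500);
        scheduler.stop();
    }
}
```

For a full index update the period would normally be hours or days rather than seconds, since each run re-imports all data.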

3.3 Testing

Description: The logic of this method is simple and clear, but the implementation is somewhat more involved and needs an additional program and resources, so readers who want to use it should be prepared for that.
IV Updating the index with the official scheduler
4.1 Overview
Solr officially provides a powerful Data Import Request Handler along with an example scheduler. The example scheduler supports only incremental indexing, not periodic full indexing, so a user modified it to add a timer for full indexing. I only give an introduction here. Blog address:
The references are as follows:
4.2 Jar Package Configuration

Put apache-solr-dataimportscheduler-1.0.jar, together with the apache-solr-dataimporthandler-.jar and apache-solr-dataimporthandler-extras-.jar that ship with Solr, into the lib directory of solr.war.

4.3 Configuring web.xml

Modify web.xml in solr.war and add the following before the servlet node:

<pre name="code" class="HTML">
<listener>
    <listener-class>org.apache.solr.handler.dataimport.scheduler.ApplicationListener</listener-class>
</listener>
</pre>

4.4 Configuring the index update file

Extract dataimport.properties from apache-solr-dataimportscheduler-.jar, modify it to match your environment, and place it in the solr.home/conf directory (not solr.home/core/conf).

dataimport.properties configuration items:

<pre name="code" class="plain">
#################################################
#                                               #
#       dataimport scheduler properties         #
#                                               #
#################################################

#  to sync or not to sync
#  1 - active; anything else - inactive
syncEnabled=1

#  which cores to schedule
#  in a multi-core environment you can decide which cores you want synchronized
#  leave empty or comment it out if using single-core deployment
syncCores=core1,core2

#  solr server name or IP address
#  [defaults to localhost if empty]
server=localhost

#  solr server port
#  [defaults to 80 if empty]
port=8080

#  application name/context
#  [defaults to current ServletContextListener's context (app) name]
webapp=solr

#  URL params [mandatory]
#  remainder of URL
params=/dataimport?command=delta-import&clean=false&commit=true

#  schedule interval
#  number of minutes between two runs
#  [defaults to 30 if empty]
interval=1

#  re-indexing interval in minutes; default 7200, i.e. 5 days
#  if empty, 0, or commented out, the index is never rebuilt
reBuildIndexInterval=7200

#  parameters of the re-indexing request
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true

#  start time of the re-indexing timer;
#  first actual execution time = reBuildIndexBeginTime + reBuildIndexInterval*60*1000
#  two formats: 2012-04-11 03:10:00 or 03:10:00; the latter is completed
#  automatically with the date on which the service was started
reBuildIndexBeginTime=03:10:00
</pre>
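The rule for the first re-indexing run stated in the comments above can be worked through in code. The following is only a sketch of that arithmetic under the assumptions spelled out in the comments; it is not the scheduler's actual source code, and the class name `RebuildSchedule`, the method `firstRun`, and the service start date used in `main` are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; it mirrors the rule in the properties
// comments and is not taken from the scheduler's source.
final class RebuildSchedule {

    private static final DateTimeFormatter FULL =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // beginTime is either "yyyy-MM-dd HH:mm:ss" or "HH:mm:ss"; a bare time is
    // completed with the date on which the service was started.
    static LocalDateTime firstRun(String beginTime, long intervalMinutes, LocalDate serviceStartDate) {
        String value = beginTime.trim();
        LocalDateTime begin = value.length() > 8
                ? LocalDateTime.parse(value, FULL)
                : LocalDateTime.of(serviceStartDate, LocalTime.parse(value));
        // intervalMinutes * 60 * 1000 milliseconds, as in the properties comment
        return begin.plus(Duration.ofMillis(intervalMinutes * 60 * 1000));
    }

    public static void main(String[] args) {
        // The service start date is an arbitrary example value.
        System.out.println(firstRun("03:10:00", 7200, LocalDate.of(2016, 5, 12)));
    }
}
```

With the values from the configuration above (begin time 03:10:00, interval 7200 minutes) and a service assumed to start on 2016-05-12, the first rebuild would fall five days later, at 03:10 on 2016-05-17.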

4.5 Restarting the Solr server

4.6 Description: This third method is simpler and needs no additional program, so it is the recommended one.

V Summary

A full index update is straightforward, but when the data volume is large it consumes a lot of I/O resources, so the update interval has to be set fairly long. The index can then be out of sync with the database for a while, which hurts the user experience. For this reason, full index updates are not recommended for time-sensitive systems.
