Grouping and paging search results with Lucene in Java

Last Update:2017-01-19 Source: Internet

Author: User

Using GroupingSearch to group search results
Package org.apache.lucene.search.grouping description

This module groups Lucene search results by the value of a single specified field. For example, if you group on the "author" field, all documents with the same "author" value end up in the same group.

Grouping requires the following inputs:

1. groupField: the field to group by. For example, if you group on the "author" field, every book in a group has the same author. Documents that do not have this field are placed together in a single separate group.

2. groupSort: how the groups are sorted.

3. topNGroups: how many top groups to keep. For example, 10 means only the first 10 groups are kept.

4. groupOffset: how many of the leading groups to skip before retrieving. For example, 3 means 7 groups are returned (assuming topNGroups is 10). This is useful for pagination, such as displaying only 5 groups per page.

5. withinGroupSort: how documents are sorted within each group. Note that this is different from groupSort.

6. withinGroupOffset: how many of the leading documents to skip within each group before retrieving.
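The interaction of these parameters can be sketched in plain Java without Lucene. This is only a model under stated assumptions: the class GroupingParametersSketch, its Doc type, and the group method are hypothetical names made up for illustration, both sorts are assumed to be "by score, descending", and the slicing follows the description above (keep topNGroups groups, then skip groupOffset of them).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the grouping parameters; hypothetical, not Lucene code. */
class GroupingParametersSketch {

    /** Hypothetical stand-in for an indexed document that already has a score. */
    static final class Doc {
        final int id;
        final String author;
        final double score;

        Doc(int id, String author, double score) {
            this.id = id;
            this.author = author;
            this.score = score;
        }
    }

    static Map<String, List<Doc>> group(List<Doc> docs, int topNGroups,
                                        int groupOffset, int withinGroupOffset) {
        // groupField: bucket the documents by their author value.
        Map<String, List<Doc>> buckets = new LinkedHashMap<>();
        for (Doc d : docs) {
            buckets.computeIfAbsent(d.author, k -> new ArrayList<>()).add(d);
        }
        Comparator<Doc> byScoreDesc =
                Comparator.comparingDouble((Doc d) -> d.score).reversed();
        // withinGroupSort: order the documents inside every group.
        for (List<Doc> bucket : buckets.values()) {
            bucket.sort(byScoreDesc);
        }
        // groupSort: rank the groups by their best document.
        List<List<Doc>> ranked = new ArrayList<>(buckets.values());
        ranked.sort((a, b) -> byScoreDesc.compare(a.get(0), b.get(0)));
        // topNGroups, then groupOffset: keep the top N groups, skip the first few.
        int end = Math.min(topNGroups, ranked.size());
        Map<String, List<Doc>> result = new LinkedHashMap<>();
        for (int i = Math.min(groupOffset, end); i < end; i++) {
            List<Doc> g = ranked.get(i);
            // withinGroupOffset: skip the leading documents of the group.
            int from = Math.min(withinGroupOffset, g.size());
            result.put(g.get(0).author, new ArrayList<>(g.subList(from, g.size())));
        }
        return result;
    }

    /** 12 authors with 2 documents each; a higher author index means a higher score. */
    static List<Doc> sampleDocs() {
        List<Doc> docs = new ArrayList<>();
        for (int a = 0; a < 12; a++) {
            docs.add(new Doc(2 * a, "author" + a, a + 0.5));
            docs.add(new Doc(2 * a + 1, "author" + a, a));
        }
        return docs;
    }

    public static void main(String[] args) {
        Map<String, List<Doc>> page = group(sampleDocs(), 10, 3, 0);
        // prints "7 groups, first: author8"
        System.out.println(page.size() + " groups, first: "
                + page.keySet().iterator().next());
    }
}
```

With topNGroups = 10 and groupOffset = 3, the model returns 7 groups, which matches the example given for groupOffset.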

Grouping with the GroupingSearch class is simpler

The GroupingSearch API documentation describes the class as follows:

Convenience class to perform grouping in a non distributed environment.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Lucene version 4.3.1 is used here.

Some important methods are:

GroupingSearch.setCaching(int maxDocsToCache, boolean cacheScores): enables caching, limited by a maximum number of documents
GroupingSearch.setCachingInMB(double maxCacheRAMMB, boolean cacheScores): caches the results of the first search pass so they can be reused by the second pass, limited by RAM size in MB
GroupingSearch.setGroupDocsLimit(int groupDocsLimit): specifies the number of documents returned per group; when not specified, one document per group is returned
GroupingSearch.setGroupSort(Sort groupSort): specifies how the groups are sorted

Sample code:

1. First look at the code to build the index

import java.io.IOException;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.*;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class IndexHelper {
  private Document document;
  private Directory directory;
  private IndexWriter indexWriter;

  // Lazily create a single in-memory directory shared by writers and readers
  public Directory getDirectory() {
    directory = (directory == null) ? new RAMDirectory() : directory;
    return directory;
  }

  private IndexWriterConfig getConfig() {
    return new IndexWriterConfig(Version.LUCENE_43, new IKAnalyzer(true));
  }

  private IndexWriter getIndexWriter() {
    try {
      return new IndexWriter(getDirectory(), getConfig());
    } catch (IOException e) {
      e.printStackTrace();
      return null;
    }
  }

  public IndexSearcher getIndexSearcher() throws IOException {
    return new IndexSearcher(DirectoryReader.open(getDirectory()));
  }

  /**
   * Create index for group test
   * @param id
   * @param author
   * @param content
   */
  public void createIndexForGroup(int id, String author, String content) {
    indexWriter = getIndexWriter();
    document = new Document();
    document.add(new IntField("id", id, Field.Store.YES));
    // StringField is indexed as a single un-analyzed token, so it can be grouped on
    document.add(new StringField("author", author, Field.Store.YES));
    document.add(new TextField("content", content, Field.Store.YES));
    try {
      indexWriter.addDocument(document);
      indexWriter.commit();
      indexWriter.close();
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

2. Grouping:

import java.io.IOException;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.*;
import org.apache.lucene.search.grouping.*;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class GroupTest {
  public void group(IndexSearcher indexSearcher, String groupField, String content)
      throws IOException, ParseException {
    GroupingSearch groupingSearch = new GroupingSearch(groupField);
    groupingSearch.setGroupSort(new Sort(SortField.FIELD_SCORE));
    groupingSearch.setFillSortFields(true);
    groupingSearch.setCachingInMB(4.0, true);
    groupingSearch.setAllGroups(true);
    groupingSearch.setAllGroupHeads(true);
    // Return up to 10 documents per group (the default is 1)
    groupingSearch.setGroupDocsLimit(10);

    QueryParser parser = new QueryParser(Version.LUCENE_43, "content", new IKAnalyzer(true));
    Query query = parser.parse(content);

    // groupOffset = 0, groupLimit = 1000
    TopGroups<BytesRef> result = groupingSearch.search(indexSearcher, query, 0, 1000);
    System.out.println("Number of search hits: " + result.totalHitCount);
    System.out.println("Number of groups in the results: " + result.groups.length);

    for (GroupDocs<BytesRef> groupDocs : result.groups) {
      System.out.println("group: " + groupDocs.groupValue.utf8ToString());
      System.out.println("Documents in group: " + groupDocs.totalHits);
      System.out.println("groupDocs.scoreDocs.length: " + groupDocs.scoreDocs.length);
      for (ScoreDoc scoreDoc : groupDocs.scoreDocs) {
        System.out.println(indexSearcher.doc(scoreDoc.doc));
      }
    }
  }
}

3. Simple test:

public static void main(String[] args) throws IOException, ParseException {
  IndexHelper indexHelper = new IndexHelper();
  // Author values must match exactly (including case) to fall into the same group
  indexHelper.createIndexForGroup(1, "Sweet Potato", "open source China");
  indexHelper.createIndexForGroup(2, "Sweet Potato", "open source community");
  indexHelper.createIndexForGroup(3, "Sweet Potato", "code Design");
  indexHelper.createIndexForGroup(4, "Sweet Potato", "design");
  indexHelper.createIndexForGroup(5, "Sleep First", "Lucene development");
  indexHelper.createIndexForGroup(6, "Sleep First", "Lucene Combat");
  indexHelper.createIndexForGroup(7, "Sleep First", "Open source Lucene");
  indexHelper.createIndexForGroup(8, "Sleep First", "Open source Solr");
  indexHelper.createIndexForGroup(9, "scattered cents", "scattered xian open source Lucene");
  indexHelper.createIndexForGroup(10, "scattered cents", "scattered xian open source Solr");
  indexHelper.createIndexForGroup(11, "scattered cents", "open Source");

  GroupTest groupTest = new GroupTest();
  groupTest.group(indexHelper.getIndexSearcher(), "author", "Open Source");
}

4. Test results:

Two ways to paginate
Lucene supports two ways of paging through results:

1. Page directly over the search results. This approach is suitable when the amount of data is small. The core of the pagination code looks like this:

ScoreDoc[] sd = XXX;
// Position of the first record on the page
int begin = pageSize * (currentPage - 1);
// Position just past the last record on the page
int end = Math.min(begin + pageSize, sd.length);
for (int i = begin; i < end && i < totalHits; i++) {
  // code to process the search result data
}
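The begin/end arithmetic above can be checked in isolation. The following is a minimal sketch in plain Java, assuming a 1-based currentPage and an already fetched, ordered list of hits; the class OffsetPagingSketch and its page method are hypothetical names, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Offset-based paging over an in-memory result list; hypothetical helper. */
class OffsetPagingSketch {

    /** Returns the items on the given 1-based page, or an empty list past the end. */
    static <T> List<T> page(List<T> hits, int pageSize, int currentPage) {
        if (pageSize <= 0 || currentPage <= 0) {
            throw new IllegalArgumentException("pageSize and currentPage must be positive");
        }
        // Use long arithmetic so large page numbers cannot overflow int.
        long first = (long) pageSize * (currentPage - 1);
        int begin = (int) Math.min(first, hits.size());
        int end = (int) Math.min(first + pageSize, hits.size());
        return new ArrayList<>(hits.subList(begin, end));
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList(
                "d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10");
        // Page 3 of 4-item pages covers positions 8 and 9: prints [d9, d10]
        System.out.println(page(hits, 4, 3));
    }
}
```

Because every page is sliced from the full result array, the whole array has to be collected first, which is why this approach fits small result sets.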

2. Use searchAfter(...)

Lucene provides five overloaded searchAfter methods. Their main parameters are:

ScoreDoc after: the last ScoreDoc of the previous search's results, that is, the element at index (total number of results minus 1);

Query query: the query to run;

int n: the number of results returned by each query, that is, the number of results per page.

A simple usage example:

// A Map can be used to hold the necessary search results
Map<String, Object> resultMap = new HashMap<String, Object>();
ScoreDoc after = null;
Query query = XX;
TopDocs td = search.searchAfter(after, query, size);

// Get the number of hits
resultMap.put("num", td.totalHits);

ScoreDoc[] sd = td.scoreDocs;
for (ScoreDoc scoreDoc : sd) {
  // typical search result processing
}
if (sd.length > 0) {
  // "after" is the last ScoreDoc of the results (index length - 1)
  after = sd[sd.length - 1];
}
// Save "after" for the next search, that is, the start of the next page
// (the original call omitted the map key; "after" is used as the key here)
resultMap.put("after", after);

return resultMap;
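To make the role of the after cursor concrete, here is a plain-Java model of cursor-based paging. It is a sketch only: Hit is a hypothetical stand-in for ScoreDoc, the searchAfter method below is a local function and not Lucene's, and the ordering rule (descending score, ties broken by ascending document id) is an assumption about the default relevance sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of cursor-based ("search after") paging; not Lucene code. */
class SearchAfterSketch {

    /** Hypothetical stand-in for ScoreDoc: a document id plus its score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Returns up to n hits ranked strictly after the cursor. The list must be
     * sorted by descending score, ties by ascending doc id. A null cursor
     * means "start from the first page".
     */
    static List<Hit> searchAfter(Hit after, List<Hit> sortedHits, int n) {
        List<Hit> page = new ArrayList<>();
        for (Hit h : sortedHits) {
            boolean pastCursor = after == null
                    || h.score < after.score
                    || (h.score == after.score && h.doc > after.doc);
            if (pastCursor && page.size() < n) {
                page.add(h);
            }
        }
        return page;
    }

    /** Walks all pages, passing the last hit of each page as the next cursor. */
    static List<List<Integer>> allPages(List<Hit> sortedHits, int pageSize) {
        List<List<Integer>> pages = new ArrayList<>();
        Hit after = null;
        while (true) {
            List<Hit> page = searchAfter(after, sortedHits, pageSize);
            if (page.isEmpty()) {
                break;
            }
            List<Integer> ids = new ArrayList<>();
            for (Hit h : page) {
                ids.add(h.doc);
            }
            pages.add(ids);
            after = page.get(page.size() - 1); // last hit becomes the cursor
        }
        return pages;
    }

    static List<Hit> sampleHits() {
        return Arrays.asList(new Hit(0, 3.0f), new Hit(1, 2.0f),
                new Hit(2, 2.0f), new Hit(3, 1.0f), new Hit(4, 0.5f));
    }

    public static void main(String[] args) {
        // prints [[0, 1], [2, 3], [4]]
        System.out.println(allPages(sampleHits(), 2));
    }
}
```

Unlike offset paging, each request only needs the cursor from the previous page, so earlier pages do not have to be collected again.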

This article is an English version of an article originally written in Chinese and published on aliyun.com.
