Grouping and paging search results with Lucene in Java


Using GroupingSearch to group search results
Description of the org.apache.lucene.search.grouping package

This module groups Lucene search results by the value of a single specified field. For example, when grouping by an author field, documents with the same author value end up in the same group.

Grouping requires the following inputs:

1. groupField: the field to group by. For example, if you group by an author field, all books in a group share the same author. Documents without this field are placed together in a single separate group.

2. groupSort: how the groups are sorted.

3. topNGroups: how many top groups to keep. For example, 10 means that only the first 10 groups are kept.

4. groupOffset: how many of the top-ranked groups to skip before retrieving. For example, 3 means that 7 groups are returned (assuming topNGroups equals 10). This is useful for pagination, such as displaying only 5 groups per page.

5. withinGroupSort: how the documents within each group are sorted. Note that this can differ from groupSort.

6. withinGroupOffset: how many of the top-ranked documents to skip in each group before retrieving.
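
To make the roles of the grouping field, topNGroups, and groupOffset concrete, the following is a minimal plain-Java sketch that mimics the bookkeeping on an in-memory list. It does not use Lucene and is not how Lucene implements grouping: the class GroupSliceSketch, the Doc type, and both method names are hypothetical, and groups are ordered by first appearance rather than by a real groupSort.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupSliceSketch {

    /** Toy document (hypothetical): only a grouping value and some content. */
    public static final class Doc {
        final String author;   // null when the document has no author field
        final String content;

        public Doc(String author, String content) {
            this.author = author;
            this.content = content;
        }
    }

    /** Groups docs by author in first-seen order; docs without an author share the null key. */
    public static Map<String, List<Doc>> groupByAuthor(List<Doc> docs) {
        Map<String, List<Doc>> groups = new LinkedHashMap<>();
        for (Doc d : docs) {
            List<Doc> bucket = groups.get(d.author);
            if (bucket == null) {
                bucket = new ArrayList<>();
                groups.put(d.author, bucket);
            }
            bucket.add(d);
        }
        return groups;
    }

    /** Keeps the first topNGroups keys, then skips the first groupOffset of those. */
    public static List<String> sliceGroups(List<String> orderedKeys, int groupOffset, int topNGroups) {
        int end = Math.max(0, Math.min(topNGroups, orderedKeys.size()));
        int start = Math.max(0, Math.min(groupOffset, end));
        return new ArrayList<>(orderedKeys.subList(start, end));
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("alice", "open source"));
        docs.add(new Doc("bob", "lucene"));
        docs.add(new Doc(null, "no author field"));
        docs.add(new Doc("alice", "solr"));

        Map<String, List<Doc>> groups = groupByAuthor(docs);
        List<String> keys = new ArrayList<>(groups.keySet());
        // Keep the top two groups, then skip the first one of them.
        System.out.println(sliceGroups(keys, 1, 2));
    }
}
```

With 12 groups available, sliceGroups(keys, 3, 10) returns 7 groups, which matches the groupOffset example in the list above.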


Grouping search results is simpler with GroupingSearch

The GroupingSearch API documentation introduces the class as follows:

Convenience class to perform grouping in a non distributed environment.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Lucene 4.3.1 is used here.

Some important methods are:

    • GroupingSearch.setCaching(int maxDocsToCache, boolean cacheScores): enables caching, limited by the number of documents held in the cache
    • GroupingSearch.setCachingInMB(double maxCacheRAMMB, boolean cacheScores): caches the results of the first search for use in the second search, limited by the cache size in MB
    • GroupingSearch.setGroupDocsLimit(int groupDocsLimit): sets the number of documents returned per group; one document per group is returned when not specified
    • GroupingSearch.setGroupSort(Sort groupSort): sets how the groups are sorted

Sample code:

1. First look at the code to build the index

public class IndexHelper {
  private Document document;
  private Directory directory;
  private IndexWriter indexWriter;

  // Lazily create a single in-memory directory shared by writer and searcher
  public Directory getDirectory() {
    directory = (directory == null) ? new RAMDirectory() : directory;
    return directory;
  }

  private IndexWriterConfig getConfig() {
    return new IndexWriterConfig(Version.LUCENE_43, new IKAnalyzer(true));
  }

  private IndexWriter getIndexWriter() {
    try {
      return new IndexWriter(getDirectory(), getConfig());
    } catch (IOException e) {
      e.printStackTrace();
      return null;
    }
  }

  public IndexSearcher getIndexSearcher() throws IOException {
    return new IndexSearcher(DirectoryReader.open(getDirectory()));
  }

  /**
   * Create index for group test
   * @param author
   * @param content
   */
  public void createIndexForGroup(int id, String author, String content) {
    indexWriter = getIndexWriter();
    document = new Document();
    document.add(new IntField("id", id, Field.Store.YES));
    // StringField is indexed as a single untokenized value, so it can serve as the group key
    document.add(new StringField("Author", author, Field.Store.YES));
    document.add(new TextField("content", content, Field.Store.YES));
    try {
      indexWriter.addDocument(document);
      indexWriter.commit();
      indexWriter.close();
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

2. Grouping:

public class GroupTest {
  public void group(IndexSearcher indexSearcher, String groupField, String content)
      throws IOException, ParseException {
    GroupingSearch groupingSearch = new GroupingSearch(groupField);
    groupingSearch.setGroupSort(new Sort(SortField.FIELD_SCORE));
    groupingSearch.setFillSortFields(true);
    groupingSearch.setCachingInMB(4.0, true);
    groupingSearch.setAllGroups(true);
    groupingSearch.setAllGroupHeads(true);
    // Return up to 10 documents per group instead of the default of one
    groupingSearch.setGroupDocsLimit(10);

    QueryParser parser = new QueryParser(Version.LUCENE_43, "content", new IKAnalyzer(true));
    Query query = parser.parse(content);

    // Group offset 0, at most 1000 groups
    TopGroups<BytesRef> result = groupingSearch.search(indexSearcher, query, 0, 1000);
    System.out.println("Number of search hits: " + result.totalHitCount);
    System.out.println("Number of groups in the result: " + result.groups.length);

    for (GroupDocs<BytesRef> groupDocs : result.groups) {
      System.out.println("Group: " + groupDocs.groupValue.utf8ToString());
      System.out.println("Records in group: " + groupDocs.totalHits);
      System.out.println("groupDocs.scoreDocs.length: " + groupDocs.scoreDocs.length);
      for (ScoreDoc scoreDoc : groupDocs.scoreDocs) {
        System.out.println(indexSearcher.doc(scoreDoc.doc));
      }
    }
  }
}

3. Simple test:

public static void main(String[] args) throws IOException, ParseException {
  IndexHelper indexHelper = new IndexHelper();
  // The author value is matched exactly, so it must be spelled identically within a group
  indexHelper.createIndexForGroup(1, "Sweet Potato", "open source China");
  indexHelper.createIndexForGroup(2, "Sweet Potato", "open source community");
  indexHelper.createIndexForGroup(3, "Sweet Potato", "code Design");
  indexHelper.createIndexForGroup(4, "Sweet Potato", "design");
  indexHelper.createIndexForGroup(5, "Sleep First", "Lucene development");
  indexHelper.createIndexForGroup(6, "Sleep First", "Lucene Combat");
  indexHelper.createIndexForGroup(7, "Sleep First", "Open source Lucene");
  indexHelper.createIndexForGroup(8, "Sleep First", "Open source Solr");

  indexHelper.createIndexForGroup(9, "scattered cents", "scattered xian open source Lucene");
  indexHelper.createIndexForGroup(10, "scattered cents", "scattered xian open source Solr");
  indexHelper.createIndexForGroup(11, "scattered cents", "open Source");

  GroupTest groupTest = new GroupTest();
  groupTest.group(indexHelper.getIndexSearcher(), "Author", "Open Source");
}

4. Test results:

Two ways to paginate
Lucene offers two ways to page through search results:

1. Page directly over the search results. This approach is suitable when the result set is small. The core of the paging code looks like this:


ScoreDoc[] sd = XXX;
// Start position of the requested page
int begin = pageSize * (currentPage - 1);
// End position of the requested page
int end = Math.min(begin + pageSize, sd.length);
for (int i = begin; i < end && i < totalHits; i++) {
  // code to process the search result data
}
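
The begin/end arithmetic above can be exercised without Lucene. The following is a minimal sketch over a plain list; the class name PageWindowSketch and the method page are hypothetical, and returning an empty list for an out-of-range page is a design choice of this sketch, not something the original code specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PageWindowSketch {

    /** Returns the items on the given 1-based page, or an empty list when the page is past the end. */
    public static <T> List<T> page(List<T> hits, int currentPage, int pageSize) {
        if (currentPage < 1 || pageSize < 1) {
            throw new IllegalArgumentException("currentPage and pageSize must be >= 1");
        }
        // Use long arithmetic so large page numbers cannot overflow int.
        long begin = (long) pageSize * (currentPage - 1);
        if (begin >= hits.size()) {
            return new ArrayList<>();
        }
        long end = Math.min(begin + pageSize, (long) hits.size());
        return new ArrayList<>(hits.subList((int) begin, (int) end));
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("d0", "d1", "d2", "d3", "d4", "d5", "d6");
        System.out.println(page(hits, 1, 3)); // first page: three items
        System.out.println(page(hits, 3, 3)); // last page: one remaining item
        System.out.println(page(hits, 4, 3)); // past the end: empty
    }
}
```

Because every page is cut from the full result array, this approach still requires collecting all hits up to the requested page, which is why it fits small result sets.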

2. Use searchAfter(...)

Lucene provides five overloaded searchAfter methods. Their main parameters are:

ScoreDoc after: the last ScoreDoc of the previous search, that is, the element at index (number of ScoreDocs minus 1);

Query query: the query to run;

int n: the number of results returned by each query, that is, the number of results per page.

A simple example of use:

// A Map can be used to hold the search results the caller needs
Map<String, Object> resultMap = new HashMap<String, Object>();
ScoreDoc after = null;
Query query = XX
TopDocs td = search.searchAfter(after, query, size);

// Number of hits
resultMap.put("num", td.totalHits);

ScoreDoc[] sd = td.scoreDocs;
for (ScoreDoc scoreDoc : sd) {
  // usual search result processing
}
// after is the ScoreDoc at index (number of results - 1); skip when the page is empty
if (sd.length > 0) {
  after = sd[sd.length - 1];
}
// Save after for the next search, that is, where the next page begins
// (the key name "after" is an assumption; the original call omitted the key)
resultMap.put("after", after);

return resultMap;
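
The cursor idea behind searchAfter can be illustrated without Lucene. The sketch below is a plain-Java imitation, not Lucene's implementation: the class SearchAfterSketch, the Hit type, and the ordering (higher score first, then lower document id) are assumptions made for this example. It shows why the last hit of one page is enough to locate the start of the next page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SearchAfterSketch {

    /** Toy stand-in for a scored hit (hypothetical): a document id and its score. */
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Ordering assumed by this sketch: higher score first, ties broken by lower doc id. */
    static final Comparator<Hit> ORDER = new Comparator<Hit>() {
        @Override
        public int compare(Hit a, Hit b) {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
        }
    };

    /** Returns up to n hits that sort strictly after the cursor; a null cursor means the first page. */
    public static List<Hit> searchAfter(Hit after, List<Hit> allHits, int n) {
        List<Hit> page = new ArrayList<>();
        if (n <= 0) {
            return page;
        }
        List<Hit> sorted = new ArrayList<>(allHits);
        Collections.sort(sorted, ORDER);
        for (Hit h : sorted) {
            if (after != null && ORDER.compare(h, after) <= 0) {
                continue; // at or before the cursor, already shown on an earlier page
            }
            page.add(h);
            if (page.size() == n) {
                break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            hits.add(new Hit(i, 1.0f / (1 + i % 3)));
        }
        Hit after = null;
        List<Hit> page = searchAfter(after, hits, 3);
        while (!page.isEmpty()) {
            StringBuilder line = new StringBuilder();
            for (Hit h : page) {
                line.append(h.doc).append(' ');
            }
            System.out.println(line.toString().trim());
            after = page.get(page.size() - 1); // last hit of this page is the next cursor
            page = searchAfter(after, hits, 3);
        }
    }
}
```

The caller only needs to remember one hit between requests, rather than the page number and the full result array used by the first approach.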
