Learn Lucene with me step by step: how custom sorting is implemented in Lucene search, and how to write your own custom sorting tool


About custom sorting

When searching with Lucene we often need to sort the results. Lucene ships with several built-in sort types, but they fall short when the sort key first has to be computed from the stored values.

To write a custom sort, we first need to study the sorting that Lucene already implements. Every Lucene sort extends FieldComparator and overrides its internal methods. Here we take IntComparator as an example and look at its implementation.

How IntComparator is implemented

Its class declaration is public static class IntComparator extends NumericComparator<Integer>, which shows that IntComparator works with Integer values; that is, it only handles sorting on an IntField.

The fields declared in IntComparator are:

    private final int[] values;
    private int bottom; // Value of bottom of the queue
    private int topValue;

Looking at the constructor and the copy method shows that:

    • values is allocated with length numHits when the comparator is constructed
    • values stores what is read from NumericDocValues

The specific implementation is as follows:

Initialization of values

    /**
     * Creates a new comparator based on {@link Integer#compare} for {@code numHits}.
     * When a document has no value for the field, {@code missingValue} is substituted.
     */
    public IntComparator(int numHits, String field, Integer missingValue) {
      super(field, missingValue);
      values = new int[numHits];
    }

Filling values (this is how IntComparator does it)

    @Override
    public void copy(int slot, int doc) {
      int v2 = (int) currentReaderValues.get(doc);
      // Test for v2 == 0 to save Bits.get method call for
      // the common case (doc has value and value is non-zero):
      if (docsWithField != null && v2 == 0 && !docsWithField.get(doc)) {
        v2 = missingValue;
      }
      values[slot] = v2;
    }
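
The slot mechanism can be illustrated without Lucene. What follows is a minimal, self-contained sketch and not Lucene's code: the class SlotStore, its fields, and the Map that stands in for NumericDocValues and docsWithField are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for IntComparator's slot storage; not a Lucene class.
class SlotStore {
  private final int[] values;                    // one entry per kept hit (numHits slots)
  private final int missingValue;                // substituted when a doc has no value
  private final Map<Integer, Integer> docValues; // docId -> value, stands in for NumericDocValues

  SlotStore(int numHits, int missingValue, Map<Integer, Integer> docValues) {
    this.values = new int[numHits];
    this.missingValue = missingValue;
    this.docValues = docValues;
  }

  // Copy the value of doc into the given slot, using missingValue when the doc has none.
  void copy(int slot, int doc) {
    Integer v = docValues.get(doc);
    values[slot] = (v != null) ? v : missingValue;
  }

  int value(int slot) {
    return values[slot];
  }

  public static void main(String[] args) {
    Map<Integer, Integer> dv = new HashMap<>();
    dv.put(0, 42);                 // doc 0 has a value, doc 7 does not
    SlotStore store = new SlotStore(2, -1, dv);
    store.copy(0, 0);
    store.copy(1, 7);
    System.out.println(store.value(0) + " " + store.value(1)); // prints "42 -1"
  }
}
```

The point of the sketch is only that the array is sized once, by the number of hits to keep, and that each slot holds a copy of one document's sort key.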


These implementations are all similar. What our custom sort needs to do is compute a value from BinaryDocValues or NumericDocValues and then implement FieldComparator's internal methods; in IntComparator the corresponding step is the value copy shown above.

Then we need to implement compareTop, compareBottom, and compare. IntComparator implements them as:

    @Override
    public int compare(int slot1, int slot2) {
      return Integer.compare(values[slot1], values[slot2]);
    }

    @Override
    public int compareBottom(int doc) {
      int v2 = (int) currentReaderValues.get(doc);
      // Test for v2 == 0 to save Bits.get method call for
      // the common case (doc has value and value is non-zero):
      if (docsWithField != null && v2 == 0 && !docsWithField.get(doc)) {
        v2 = missingValue;
      }
      return Integer.compare(bottom, v2);
    }

    @Override
    public int compareTop(int doc) {
      int docValue = (int) currentReaderValues.get(doc);
      // Test for docValue == 0 to save Bits.get method call for
      // the common case (doc has value and value is non-zero):
      if (docsWithField != null && docValue == 0 && !docsWithField.get(doc)) {
        docValue = missingValue;
      }
      return Integer.compare(topValue, docValue);
    }
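
To see how compare, compareBottom, and copy cooperate, here is a minimal sketch of a top-N collection loop in plain Java. It is not Lucene's collector: the class TopNSketch, its collect and updateBottom methods, and the int[] that stands in for NumericDocValues are assumptions made for illustration, and only ascending order is handled.

```java
import java.util.Arrays;

// Hypothetical miniature of the comparator/collector protocol; not Lucene code.
class TopNSketch {
  private final int[] docValues; // docValues[doc] = sort key of doc
  private final int[] values;    // slot -> copied sort key
  private int bottom;            // sort key of the weakest kept hit
  private int bottomSlot;        // slot that currently holds the weakest hit
  private int filled;            // number of slots in use

  TopNSketch(int numHits, int[] docValues) {
    this.values = new int[numHits];
    this.docValues = docValues;
  }

  int compare(int slot1, int slot2) {
    return Integer.compare(values[slot1], values[slot2]);
  }

  int compareBottom(int doc) {
    return Integer.compare(bottom, docValues[doc]);
  }

  void copy(int slot, int doc) {
    values[slot] = docValues[doc];
  }

  // Find the slot with the largest key; for ascending order that is the weakest hit.
  private void updateBottom() {
    bottomSlot = 0;
    for (int s = 1; s < filled; s++) {
      if (compare(s, bottomSlot) > 0) {
        bottomSlot = s;
      }
    }
    bottom = values[bottomSlot];
  }

  void collect(int doc) {
    if (filled < values.length) {
      copy(filled++, doc);             // queue not full yet: always keep the doc
      if (filled == values.length) {
        updateBottom();
      }
    } else if (compareBottom(doc) > 0) {
      copy(bottomSlot, doc);           // doc beats the weakest hit: replace it
      updateBottom();
    }
  }

  // Collect every doc and return the n smallest keys in ascending order.
  static int[] topN(int n, int[] keys) {
    TopNSketch sketch = new TopNSketch(n, keys);
    for (int doc = 0; doc < keys.length; doc++) {
      sketch.collect(doc);
    }
    int[] result = Arrays.copyOf(sketch.values, sketch.filled);
    Arrays.sort(result);
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(topN(3, new int[] {5, 1, 9, 3, 7}))); // prints "[1, 3, 5]"
  }
}
```

A real collector uses a priority queue instead of the linear scan in updateBottom, but the division of labour is the same: compare works on two slots, while compareBottom tests an incoming document against the weakest kept hit.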

Implementing your own FieldComparator

To implement a FieldComparator you need to handle the parameters it receives, define an array that holds the computed values, keep a reference to the BinaryDocValues, and so on. Here I write a generic comparator; the code is as follows:

    package com.lucene.search;

    import java.io.IOException;

    import org.apache.lucene.index.BinaryDocValues;
    import org.apache.lucene.index.DocValues;
    import org.apache.lucene.index.LeafReaderContext;
    import org.apache.lucene.search.SimpleFieldComparator;

    import com.lucene.util.ObjectUtil;

    /**
     * Custom comparator
     * @author Lenovo
     */
    public class SelfDefineComparator extends SimpleFieldComparator<String> {
      private Object[] values; // an Object[], where IntComparator uses an int[]
      private Object bottom;
      private Object top;
      private String field;
      private BinaryDocValues binaryDocValues; // BinaryDocValues here, where IntComparator uses NumericDocValues
      private ObjectUtil objectUtil; // an interface rather than an abstract class, to make extension easier
      private Object[] params; // received parameters

      public SelfDefineComparator(String field, int numHits, Object[] params, ObjectUtil objectUtil) {
        values = new Object[numHits];
        this.objectUtil = objectUtil;
        this.field = field;
        this.params = params;
      }

      @Override
      public void setBottom(int slot) {
        this.bottom = values[slot];
      }

      @Override
      public int compareBottom(int doc) throws IOException {
        Object value = getValues(doc);
        // Delegate to objectUtil so that the ordering matches compare() and compareTop()
        return objectUtil.compareTo(bottom, value);
      }

      @Override
      public int compareTop(int doc) throws IOException {
        Object value = getValues(doc);
        return objectUtil.compareTo(top, value);
      }

      @Override
      public void copy(int slot, int doc) throws IOException {
        values[slot] = getValues(doc);
      }

      /**
       * Computes the value for the given doc ID.
       * @param doc
       * @return
       */
      private Object getValues(int doc) {
        return objectUtil.getValues(doc, params, binaryDocValues);
      }

      @Override
      protected void doSetNextReader(LeafReaderContext context) throws IOException {
        binaryDocValues = DocValues.getBinary(context.reader(), field); // context.reader().getBinaryDocValues(field);
      }

      @Override
      public int compare(int slot1, int slot2) {
        return objectUtil.compareTo(values[slot1], values[slot2]);
      }

      @Override
      public void setTopValue(String value) {
        this.top = value;
      }

      @Override
      public String value(int slot) {
        return values[slot].toString();
      }
    }

Here ObjectUtil is an interface that defines how the sort value is computed and how two values are compared; the comparator's compare methods ultimately delegate to it.

The ObjectUtil interface is defined as follows:

    package com.lucene.util;

    import org.apache.lucene.index.BinaryDocValues;

    public interface ObjectUtil {

      /**
       * Custom method that gets and processes the value
       * @param doc
       * @param params
       * @param binaryDocValues
       * @return
       */
      Object getValues(int doc, Object[] params, BinaryDocValues binaryDocValues);

      /**
       * compare implementation used by the comparator
       * @param object
       * @param object2
       * @return
       */
      int compareTo(Object object, Object object2);
    }
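
The params array is what lets an implementation compute the sort key before comparing. As a rough illustration, the following Lucene-free sketch sorts by distance from a target number. Everything in it is hypothetical: ValueStrategy is a simplified stand-in for ObjectUtil in which raw bytes replace the (doc, BinaryDocValues) pair, and DistanceStrategy with its use of params[0] as the target is an invented example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, Lucene-free simplification of the ObjectUtil idea:
// the raw bytes of a document's value replace the (doc, BinaryDocValues) pair.
interface ValueStrategy {
  Object getValue(byte[] raw, Object[] params);

  int compareTo(Object a, Object b);
}

// Sort key = absolute distance between the stored number and params[0].
class DistanceStrategy implements ValueStrategy {
  @Override
  public Object getValue(byte[] raw, Object[] params) {
    int stored = Integer.parseInt(new String(raw, StandardCharsets.UTF_8));
    int target = (Integer) params[0]; // assumption: the caller passes the target as params[0]
    return Math.abs(stored - target);
  }

  @Override
  public int compareTo(Object a, Object b) {
    // Keys are Integers produced by getValue, so compare them numerically.
    return Integer.compare((Integer) a, (Integer) b);
  }

  public static void main(String[] args) {
    ValueStrategy strategy = new DistanceStrategy();
    Object[] params = {10};
    Object near = strategy.getValue("7".getBytes(StandardCharsets.UTF_8), params);
    Object far = strategy.getValue("14".getBytes(StandardCharsets.UTF_8), params);
    System.out.println(near + " " + far + " " + strategy.compareTo(near, far)); // prints "3 4 -1"
  }
}
```

Keeping value computation and comparison in the same object means the two cannot drift apart: whatever type getValue returns is the type compareTo expects.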

Besides the comparator, we also provide a FieldComparatorSource that receives the user's input.

    package com.lucene.search;

    import java.io.IOException;

    import org.apache.lucene.search.FieldComparator;
    import org.apache.lucene.search.FieldComparatorSource;

    import com.lucene.util.ObjectUtil;

    /**
     * Receives the user's original input. Extends FieldComparatorSource
     * to build the custom comparator.
     * @author Lenovo
     */
    public class SelfDefineComparatorSource extends FieldComparatorSource {
      private Object[] params; // received parameters
      private ObjectUtil objectUtil; // an interface rather than an abstract class, to make extension easier

      public Object[] getParams() {
        return params;
      }

      public void setParams(Object[] params) {
        this.params = params;
      }

      public ObjectUtil getObjectUtil() {
        return objectUtil;
      }

      public void setObjectUtil(ObjectUtil objectUtil) {
        this.objectUtil = objectUtil;
      }

      public SelfDefineComparatorSource(Object[] params, ObjectUtil objectUtil) {
        super();
        this.params = params;
        this.objectUtil = objectUtil;
      }

      @Override
      public FieldComparator<?> newComparator(String fieldname, int numHits, int sortPos, boolean reversed)
          throws IOException {
        // The actual comparison is implemented by SelfDefineComparator
        return new SelfDefineComparator(fieldname, numHits, params, objectUtil);
      }
    }

The test program follows. Here we simulate a string comparator that sorts by string value.

    package com.lucene.search;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.BinaryDocValuesField;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexWriterConfig.OpenMode;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.MatchAllDocsQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.BytesRef;

    import com.lucene.util.ObjectUtil;
    import com.lucene.util.StringComparaUtil;

    /**
     * @author Wu Ying GUI
     */
    public class SortTest {

      public static void main(String[] args) throws Exception {
        RAMDirectory directory = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
        indexWriterConfig.setOpenMode(OpenMode.CREATE_OR_APPEND);
        IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);
        addDocument(indexWriter, "B");
        addDocument(indexWriter, "D");
        addDocument(indexWriter, "A");
        addDocument(indexWriter, "E");
        indexWriter.commit();
        indexWriter.close();

        IndexReader reader = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
        Query query = new MatchAllDocsQuery();
        ObjectUtil util = new StringComparaUtil();
        // The third SortField argument (true) requests reverse order
        Sort sort = new Sort(new SortField("name", new SelfDefineComparatorSource(new Object[] {}, util), true));
        TopDocs topDocs = searcher.search(query, Integer.MAX_VALUE, sort);
        ScoreDoc[] docs = topDocs.scoreDocs;
        for (ScoreDoc doc : docs) {
          Document document = searcher.doc(doc.doc);
          System.out.println(document.get("name"));
        }
        reader.close();
      }

      private static void addDocument(IndexWriter writer, String name) throws Exception {
        Document document = new Document();
        // Stored field for display, plus a doc-values field of the same name for sorting
        document.add(new StringField("name", name, Field.Store.YES));
        // BytesRef(CharSequence) encodes as UTF-8, matching utf8ToString() on the read side
        document.add(new BinaryDocValuesField("name", new BytesRef(name)));
        writer.addDocument(document);
      }
    }

The corresponding ObjectUtil implementation is as follows:

    package com.lucene.util;

    import org.apache.lucene.index.BinaryDocValues;
    import org.apache.lucene.util.BytesRef;

    public class StringComparaUtil implements ObjectUtil {

      @Override
      public Object getValues(int doc, Object[] params, BinaryDocValues binaryDocValues) {
        BytesRef bytesRef = binaryDocValues.get(doc);
        String value = bytesRef.utf8ToString();
        return value;
      }

      @Override
      public int compareTo(Object object, Object object2) {
        return object.toString().compareTo(object2.toString());
      }
    }
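
Comparing through toString() gives lexicographic order, which is what a string sort wants but not what numbers want. The sketch below is plain Java with no Lucene involved; the class ToStringOrderSketch and its sorted helper are hypothetical, and the reverse flag is assumed to simply invert the comparison, as the reverse argument of SortField is expected to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that reuses the comparison from StringComparaUtil.compareTo.
class ToStringOrderSketch {

  static int compareTo(Object a, Object b) {
    return a.toString().compareTo(b.toString());
  }

  // Sort a copy of the input; descending when reverse is true.
  static List<Object> sorted(List<?> input, boolean reverse) {
    List<Object> copy = new ArrayList<>(input);
    copy.sort((a, b) -> reverse ? compareTo(b, a) : compareTo(a, b));
    return copy;
  }

  public static void main(String[] args) {
    // The four names indexed by the test program, in reverse order
    System.out.println(sorted(Arrays.asList("B", "D", "A", "E"), true)); // prints "[E, D, B, A]"
    // Numbers compared as strings: "10" sorts before "2"
    System.out.println(sorted(Arrays.asList(9, 10, 2), false));          // prints "[10, 2, 9]"
  }
}
```

For numeric keys, an ObjectUtil implementation would need to compare the parsed values instead of their string forms.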

It is getting late, so I will stop here for today; I will post the source code for download tomorrow.

"Learn Lucene with me step by step" is a summary of my recent work with Lucene indexing. If you have questions, contact me on QQ: 891922381. I have also started a QQ group, 106570134 (Lucene, Solr, Netty, Hadoop), and would be grateful if you joined so we can discuss together. I aim to post one blog entry a day and hope you will keep following.









