Lucene: a simple implementation of a TermFilter

Source: Internet
Author: User
Tags: bitset

public abstract DocsEnum docs(Bits liveDocs, DocsEnum reuse, int flags) throws IOException;

After a day of research, I finally made some progress. I hope you will all share your views; criticism is welcome! Lucene version: 4.3.1

A side note: I originally wanted to write about spatial search, but the more I researched, the more I ended up learning about TermFilter instead, so don't be surprised when you see the code. If I get the chance, I will write more about spatial search later. Although a ready-made implementation exists, I still wanted to understand exactly what is going on underneath. Thorough criticism is welcome.

Feel free to criticize, but please make a concrete point.

Core class:

package com.pptv.search.list.index.increment;

import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Iterator;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.Fields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

public class MyOwnFilter extends Filter {

    public static void main(String[] args) throws Exception {
        SpatialSearchTest.main(args);
    }

    @Override
    public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs)
            throws IOException {
        // Lazy init if needed - no need to create a big bitset ahead of time
        System.out.println(">>>> MyOwnFilter in");
        final AtomicReader reader = context.reader();

        // A. Create a result set sized to the maximum doc id
        FixedBitSet result = new FixedBitSet(reader.maxDoc());

        // B. Get all fields of this segment
        final Fields fields = reader.fields();
        showFields(fields);

        // Terms of the target field
        String termName = "able";
        Terms terms = fields.terms(termName);
        System.out.println(termName + "_terms.size()=" + terms.size());

        // C. Walk every term of the field
        TermsEnum reuse = null;
        reuse = terms.iterator(reuse);
        for (int i = 0; i < terms.size(); i++) {
            reuse.next();
            System.out.println("----" + i + "----" + reuse.term());
            System.out.println("content: " + new String(reuse.term().bytes,
                    reuse.term().offset, reuse.term().length,
                    Charset.forName("UTF-8")));
            // D. Check whether a specific term exists among all terms
            // BytesRef text = new BytesRef("2".getBytes());
            // System.out.println(reuse.seekExact(text, false));
            // System.out.println(text);

            // E. Look up the inverted list (postings) for the current term
            DocsEnum docs = null; // no freqs, since we don't need them
            docs = reuse.docs(acceptDocs, docs, DocsEnum.FLAG_NONE);
            while (docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
                int docId = docs.docID();
                System.out.println("collected: " + docId);
                result.set(docId);
            }
        }
        System.out.println("<<<< MyOwnFilter out");
        return result;
    }

    private void showFields(final Fields fields) {
        System.out.println("fields.size()=" + fields.size());
        Iterator<String> ite = fields.iterator();
        int i = 0;
        while (ite.hasNext()) {
            ++i;
            System.out.println("\t" + i + ": " + ite.next());
        }
    }
}


Entry class:

package com.pptv.search.list.index.increment;

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.DoubleField;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class SpatialSearchTest {

    static Version version = Version.LUCENE_43;

    public static void main(String[] args) throws Exception {
        RAMDirectory d = new RAMDirectory();
        IndexWriter writer = new IndexWriter(d,
                new IndexWriterConfig(version, new StandardAnalyzer(version)));
        doIndex(writer);

        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(d));
        System.out.println("maxDoc: " + searcher.getIndexReader().maxDoc());

        // Query and filter (MyQuery is a custom Query not shown in this post;
        // a MatchAllDocsQuery would work here just as well)
        Query query = new MyQuery();
        query.setBoost(1.0001f);
        System.out.println("query: " + query);
        Filter filter = null;
        filter = createFilter();
        System.out.println("filter: " + filter);

        TopDocs tds = searcher.search(query, filter, 10);
        for (int i = 0; i < tds.scoreDocs.length; i++) {
            ScoreDoc sd = tds.scoreDocs[i];
            Document doc = searcher.doc(sd.doc);
            printDoc(doc);
        }
    }

    private static Filter createFilter() {
        Filter filter;
        // filter = new MyFilter(new Term("able", "1"));
        filter = new MyOwnFilter();
        return filter;
    }

    private static void printDoc(Document doc) {
        String lat = doc.get("lat");
        String lng = doc.get("lng");
        System.out.println("(" + lng + ", " + lat + ")");
    }

    private static void doIndex(IndexWriter writer) throws IOException {
        for (int i = 0; i < 5; i++) {
            Document document = new Document();
            indexLocation(document, 100L + i, (Math.random() * 100L) + i * i,
                    i % 2 == 0 ? "0" : "abcd hello");
            writer.addDocument(document);
        }
        writer.forceMerge(1);
        writer.close();
    }

    private static void indexLocation(Document document, double longitude,
            double latitude, String able) {
        DoubleField lat = new DoubleField("lat", latitude, Store.YES);
        DoubleField lng = new DoubleField("lng", longitude, Store.YES);
        document.add(new StringField("able", able, Store.YES));
        document.add(lat);
        document.add(lng);
    }
}

What Filter actually exposes to us is the following method:

public abstract DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs) throws IOException;

From the context we get the reader, from the reader the fields, from the fields the terms, and finally, through

public abstract DocsEnum docs(Bits liveDocs, DocsEnum reuse, int flags) throws IOException;

we collect the matching documents into the result set and return it. Note that all of this runs during the search itself.
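To make that chain concrete, here is a minimal sketch of an actual single-term filter built on the same two methods. The class name SingleTermFilter is my own invention, not from the original code; this is an illustrative sketch against the Lucene 4.3 API, not a definitive implementation.

```java
import java.io.IOException;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

// Illustrative only: keeps exactly the documents containing one given term,
// following the same reader -> terms -> termsEnum -> docs chain as above.
public class SingleTermFilter extends Filter {

    private final Term term;

    public SingleTermFilter(Term term) {
        this.term = term;
    }

    @Override
    public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs)
            throws IOException {
        AtomicReader reader = context.reader();
        Terms terms = reader.terms(term.field());
        if (terms == null) {
            return null; // field absent from this segment: no matches
        }
        TermsEnum te = terms.iterator(null);
        if (!te.seekExact(term.bytes(), false)) {
            return null; // term absent: no matches
        }
        FixedBitSet result = new FixedBitSet(reader.maxDoc());
        DocsEnum docs = te.docs(acceptDocs, null, DocsEnum.FLAG_NONE);
        int doc;
        while ((doc = docs.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
            result.set(doc);
        }
        return result;
    }
}
```

Unlike MyOwnFilter above, which walks every term of the field, this version seeks directly to the one term of interest, which is essentially what Lucene's own TermsFilter does.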


One last point, on which I especially hope the experts will weigh in:

At the moment my guess is that Lucene runs the query first and then applies the filter. Is that right? It doesn't feel right to me. I hope to get a chance to verify this later.
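For what it's worth, my reading of the Lucene 4.x source (worth double-checking) is that the filter is not applied after the query: IndexSearcher.search(query, filter, n) wraps the pair in a FilteredQuery, so the filter's DocIdSet is consulted while documents are being scored and collected. Roughly:

```java
// What searcher.search(query, filter, 10) does internally in Lucene 4.x:
// the filter is folded into the query before collection starts.
Query wrapped = (filter == null) ? query : new FilteredQuery(query, filter);
TopDocs tds = searcher.search(wrapped, 10);
```

That is also why getDocIdSet receives an acceptDocs bitset per segment: the filter participates in the per-segment search rather than post-processing a finished result list.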





