Lucene: a simple implementation of a TermFilter

Source: Internet
Author: User
Tags: bitset

public abstract DocsEnum docs(Bits liveDocs, DocsEnum reuse, int flags) throws IOException;

After a day of research, I finally made some progress. I hope you will all share your views; criticism is welcome! Lucene version: 4.3.1

A side note: I originally wanted to write about spatial search, but the more I researched, the more I ended up learning about TermFilter instead, so don't be surprised when you see the code. If I get the chance, I will write more about spatial search later. Although a ready-made implementation exists, I still wanted to understand exactly what is going on underneath. Thorough criticism is welcome.

Feel free to criticize, but please make a concrete point.

Core class:

package com.pptv.search.list.index.increment;

import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Iterator;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.Fields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

public class MyOwnFilter extends Filter {

    public static void main(String[] args) throws Exception {
        SpatialSearchTest.main(args);
    }

    @Override
    public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs)
            throws IOException {
        // Lazy init if needed - no need to create a big bitset ahead of time
        System.out.println(">>>> MyOwnFilter in");
        final AtomicReader reader = context.reader();

        // A. Create a result set sized to the maximum doc id
        FixedBitSet result = new FixedBitSet(reader.maxDoc());

        // B. Get all fields of this segment
        final Fields fields = reader.fields();
        showFields(fields);

        // Terms of the target field
        String termName = "able";
        Terms terms = fields.terms(termName);
        System.out.println(termName + "_terms.size()=" + terms.size());

        // C. Walk every term of the field
        TermsEnum reuse = null;
        reuse = terms.iterator(reuse);
        for (int i = 0; i < terms.size(); i++) {
            reuse.next();
            System.out.println("----" + i + "----" + reuse.term());
            System.out.println("content: " + new String(reuse.term().bytes,
                    reuse.term().offset, reuse.term().length,
                    Charset.forName("UTF-8")));
            // D. Check whether a specific term exists among all terms
            // BytesRef text = new BytesRef("2".getBytes());
            // System.out.println(reuse.seekExact(text, false));
            // System.out.println(text);

            // E. Look up the inverted list (postings) for the current term
            DocsEnum docs = null; // no freqs, since we don't need them
            docs = reuse.docs(acceptDocs, docs, DocsEnum.FLAG_NONE);
            while (docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
                int docId = docs.docID();
                System.out.println("collected: " + docId);
                result.set(docId);
            }
        }
        System.out.println("<<<< MyOwnFilter out");
        return result;
    }

    private void showFields(final Fields fields) {
        System.out.println("fields.size()=" + fields.size());
        Iterator<String> ite = fields.iterator();
        int i = 0;
        while (ite.hasNext()) {
            ++i;
            System.out.println("\t" + i + ": " + ite.next());
        }
    }
}


Entry class:

package com.pptv.search.list.index.increment;

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.DoubleField;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class SpatialSearchTest {

    static Version version = Version.LUCENE_43;

    public static void main(String[] args) throws Exception {
        RAMDirectory d = new RAMDirectory();
        IndexWriter writer = new IndexWriter(d,
                new IndexWriterConfig(version, new StandardAnalyzer(version)));
        doIndex(writer);

        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(d));
        System.out.println("maxDoc: " + searcher.getIndexReader().maxDoc());

        // Query and filter (MyQuery is a custom Query not shown in this post;
        // a MatchAllDocsQuery would work here just as well)
        Query query = new MyQuery();
        query.setBoost(1.0001f);
        System.out.println("query: " + query);
        Filter filter = null;
        filter = createFilter();
        System.out.println("filter: " + filter);

        TopDocs tds = searcher.search(query, filter, 10);
        for (int i = 0; i < tds.scoreDocs.length; i++) {
            ScoreDoc sd = tds.scoreDocs[i];
            Document doc = searcher.doc(sd.doc);
            printDoc(doc);
        }
    }

    private static Filter createFilter() {
        Filter filter;
        // filter = new MyFilter(new Term("able", "1"));
        filter = new MyOwnFilter();
        return filter;
    }

    private static void printDoc(Document doc) {
        String lat = doc.get("lat");
        String lng = doc.get("lng");
        System.out.println("(" + lng + ", " + lat + ")");
    }

    private static void doIndex(IndexWriter writer) throws IOException {
        for (int i = 0; i < 5; i++) {
            Document document = new Document();
            indexLocation(document, 100L + i, (Math.random() * 100L) + i * i,
                    i % 2 == 0 ? "0" : "abcd hello");
            writer.addDocument(document);
        }
        writer.forceMerge(1);
        writer.close();
    }

    private static void indexLocation(Document document, double longitude,
            double latitude, String able) {
        DoubleField lat = new DoubleField("lat", latitude, Store.YES);
        DoubleField lng = new DoubleField("lng", longitude, Store.YES);
        document.add(new StringField("able", able, Store.YES));
        document.add(lat);
        document.add(lng);
    }
}

What Filter actually exposes to us is the following method:

public abstract DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs) throws IOException;

From the context we get the reader, from the reader the fields, from the fields the terms, and finally, through

public abstract DocsEnum docs(Bits liveDocs, DocsEnum reuse, int flags) throws IOException;

we collect the matching documents into the result set and return it. Note that all of this runs during the search itself.
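To make that chain concrete, here is a minimal sketch of an actual single-term filter built on the same two methods. The class name SingleTermFilter is my own invention, not from the original code; this is an illustrative sketch against the Lucene 4.3 API, not a definitive implementation.

```java
import java.io.IOException;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

// Illustrative only: keeps exactly the documents containing one given term,
// following the same reader -> terms -> termsEnum -> docs chain as above.
public class SingleTermFilter extends Filter {

    private final Term term;

    public SingleTermFilter(Term term) {
        this.term = term;
    }

    @Override
    public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs)
            throws IOException {
        AtomicReader reader = context.reader();
        Terms terms = reader.terms(term.field());
        if (terms == null) {
            return null; // field absent from this segment: no matches
        }
        TermsEnum te = terms.iterator(null);
        if (!te.seekExact(term.bytes(), false)) {
            return null; // term absent: no matches
        }
        FixedBitSet result = new FixedBitSet(reader.maxDoc());
        DocsEnum docs = te.docs(acceptDocs, null, DocsEnum.FLAG_NONE);
        int doc;
        while ((doc = docs.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
            result.set(doc);
        }
        return result;
    }
}
```

Unlike MyOwnFilter above, which walks every term of the field, this version seeks directly to the one term of interest, which is essentially what Lucene's own TermsFilter does.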


One last point, on which I especially hope the experts will weigh in:

At the moment my guess is that Lucene runs the query first and then applies the filter. Is that right? It doesn't feel right to me. I hope to get a chance to verify this later.
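For what it's worth, my reading of the Lucene 4.x source (worth double-checking) is that the filter is not applied after the query: IndexSearcher.search(query, filter, n) wraps the pair in a FilteredQuery, so the filter's DocIdSet is consulted while documents are being scored and collected. Roughly:

```java
// What searcher.search(query, filter, 10) does internally in Lucene 4.x:
// the filter is folded into the query before collection starts.
Query wrapped = (filter == null) ? query : new FilteredQuery(query, filter);
TopDocs tds = searcher.search(wrapped, 10);
```

That is also why getDocIdSet receives an acceptDocs bitset per segment: the filter participates in the per-segment search rather than post-processing a finished result list.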





