First, a word on how queries and filters differ and how they relate. The various Query and Filter classes are very similar: almost anything that can be done with a query can also be done with a filter, and the two can be converted into each other. The biggest difference is that results returned through a filter are not scored, whereas results returned by a query carry relevance scores. So when the business requirement has nothing to do with scoring, prefer a filter to get better performance. This is essentially the difference between the q and fq parameters in Solr.
Before getting to the point, as usual, let's first get a general picture of the filters Lucene provides.
Now let's look at how this works in code. First, the test data:
ID  score  bookname                  ename  type        price  date
1   1      Ethereal Journey          pmzl   novel       52.23  201005
2   1      Kingdoms                  sgyy   novel       36.13  201207
3   1      Database Combat           sjksz  technology  77.13  200811
4   1      Series Bible              bcbd   technology  100.3  200501
5   1      Workplace Relations       zcgxl  career      36.59  200501
6   1      Healthy Living            jksh   life        20.47  200008
7   1      See the Essence           kqbz   society     10.37  201004
8   1      programming, Programming  bcbc   society     10.37  201004
Core Code
Java code
    // The last argument (includeUpper): true includes the upper bound, false excludes it
    // The second-to-last argument (includeLower): true includes the lower bound, false excludes it
    TermRangeFilter filter = new TermRangeFilter("ename", new BytesRef("h"), new BytesRef("n"), true, true);
    TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), filter, 10000); // no sort specified, default order
Output Results
6   1      Healthy Living            jksh   life        20.47  200008
7   1      See the Essence           kqbz   society     10.37  201004
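To make the two boundary flags of the term range filter concrete, here is a minimal plain-Java sketch of inclusive and exclusive range matching on strings. It is not Lucene code: `TermRangeSketch`, `inRange`, and `filter` are hypothetical helpers, and the sketch assumes the ename values are indexed in lowercase and compared in plain character order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of inclusive/exclusive term-range matching.
// NOT Lucene code: inRange and filter are hypothetical helpers.
final class TermRangeSketch {

    // Returns true when term lies between lower and upper, honoring both flags.
    static boolean inRange(String term, String lower, String upper,
                           boolean includeLower, boolean includeUpper) {
        int cmpLower = term.compareTo(lower);
        int cmpUpper = term.compareTo(upper);
        boolean aboveLower = includeLower ? cmpLower >= 0 : cmpLower > 0;
        boolean belowUpper = includeUpper ? cmpUpper <= 0 : cmpUpper < 0;
        return aboveLower && belowUpper;
    }

    // Keeps the terms that fall inside the range, in input order.
    static List<String> filter(List<String> terms, String lower, String upper,
                               boolean includeLower, boolean includeUpper) {
        List<String> kept = new ArrayList<String>();
        for (String t : terms) {
            if (inRange(t, lower, upper, includeLower, includeUpper)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // ename values from the test data, assumed lowercase
        List<String> enames = Arrays.asList(
            "pmzl", "sgyy", "sjksz", "bcbd", "zcgxl", "jksh", "kqbz", "bcbc");
        System.out.println(filter(enames, "h", "n", true, true)); // [jksh, kqbz]
    }
}
```

Note that an inclusive upper bound of "n" only admits the exact term "n"; a longer term such as "nabc" sorts after "n" and is excluded either way.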
Core Code
Java code
    NumericRangeFilter<Double> filter = NumericRangeFilter.newDoubleRange("price", 10d, 40d, true, false);
    TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), filter, 10000); // no sort specified, default order
Output Results
2   1      Kingdoms                  sgyy   novel       36.13  201207
5   1      Workplace Relations       zcgxl  career      36.59  200501
6   1      Healthy Living            jksh   life        20.47  200008
7   1      See the Essence           kqbz   society     10.37  201004
8   1      programming, Programming  bcbc   society     10.37  201004
Core Code
Java code
    // Range filter backed by the field cache
    Filter filter = FieldCacheRangeFilter.newDoubleRange("price", 20d, 50d, true, true);
    TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), filter, 10000); // no sort specified, default order
Output Results
2   1      Kingdoms                  sgyy   novel       36.13  201207
5   1      Workplace Relations       zcgxl  career      36.59  200501
6   1      Healthy Living            jksh   life        20.47  200008
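The idea behind a cache-backed range filter can be sketched in plain Java: the field values are held in an array indexed by document number, and the filter simply walks that array and marks the matches. This is only an illustration under stated assumptions, not Lucene code; `CachedRangeSketch` and `filter` are hypothetical names, and the array index is assumed to be the row position in the test data (ID minus 1).

```java
import java.util.BitSet;

// Plain-Java sketch of a range filter backed by a per-document value array.
// NOT Lucene code: the array stands in for a field cache; names are hypothetical.
final class CachedRangeSketch {

    // Marks every document whose cached value lies inside the range.
    static BitSet filter(double[] valuesByDoc, double min, double max,
                         boolean includeMin, boolean includeMax) {
        BitSet matches = new BitSet(valuesByDoc.length);
        for (int doc = 0; doc < valuesByDoc.length; doc++) {
            double v = valuesByDoc[doc];
            boolean aboveMin = includeMin ? v >= min : v > min;
            boolean belowMax = includeMax ? v <= max : v < max;
            if (aboveMin && belowMax) {
                matches.set(doc);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prices from the test data; array index = row position (ID - 1).
        double[] prices = {52.23, 36.13, 77.13, 100.3, 36.59, 20.47, 10.37, 10.37};
        System.out.println(filter(prices, 20d, 50d, true, true));  // {1, 4, 5}
        System.out.println(filter(prices, 10d, 40d, true, false)); // {1, 4, 5, 6, 7}
    }
}
```

The printed positions 1, 4, 5 correspond to IDs 2, 5, 6, which agrees with the output rows above. The result is a plain set of matching documents with no scores attached.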
Core Code
Java code
    // Field-cache-based filter that keeps only specific categories
    Filter filter = new FieldCacheTermsFilter("type", new String[] {"technology", "society"});
    TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), filter, 10000); // no sort specified, default order
Output Results
3   1      Database Combat           sjksz  technology  77.13  200811
4   1      Series Bible              bcbd   technology  100.3  200501
7   1      See the Essence           kqbz   society     10.37  201004
8   1      programming, Programming  bcbc   society     10.37  201004
Core Code
Java code
    // Use QueryWrapperFilter to wrap a query as a filter
    QueryWrapperFilter filter = new QueryWrapperFilter(new TermQuery(new Term("type", "technology")));
    TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), filter, 10000); // no sort specified, default order
Output Results
3   1      Database Combat           sjksz  technology  77.13  200811
4   1      Series Bible              bcbd   technology  100.3  200501
Finally, let's look at how to extend the Filter base class to write our own filter. A custom filter can be powerful and flexible, but it has a couple of limitations to keep in mind:
1. The field it matches on must hold non-repeating values, such as a primary key. If a value repeats, by default only the first matching document is returned in the result set.
2. The field must not be tokenized. If the field goes through an analyzer, the results may be incorrect.
Custom Filter Class
Java code
    package com.sanjiesanxian.test;

    import java.io.IOException;

    import org.apache.lucene.index.AtomicReaderContext;
    import org.apache.lucene.index.DocsEnum;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.DocIdSet;
    import org.apache.lucene.search.DocIdSetIterator;
    import org.apache.lucene.search.Filter;
    import org.apache.lucene.util.Bits;
    import org.apache.lucene.util.FixedBitSet;

    /**
     * Custom filter that keeps only the documents whose "id" term is in a given list.
     */
    public class MyCustomFilter extends Filter {

        // IDs that restrict the returned documents
        private final String[] terms;

        public MyCustomFilter(String... terms) {
            this.terms = terms;
        }

        @Override
        public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs)
                throws IOException {
            // One bit per document slot in this segment (maxDoc also counts deleted docs)
            FixedBitSet bits = new FixedBitSet(context.reader().maxDoc());
            // Doc IDs below are relative to this segment, so context.docBase is not added
            for (String s : terms) {
                // The "id" term must be unique and not tokenized;
                // if it repeats, only the first matching document is kept
                DocsEnum docs = context.reader().termDocsEnum(new Term("id", s));
                // termDocsEnum returns null when the term is absent from this segment
                if (docs != null && docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
                    bits.set(docs.docID()); // mark the matching document
                }
            }
            return bits;
        }
    }
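The per-segment logic of such a filter, and the "unique field" limitation, can be illustrated without Lucene. The following is a minimal sketch under stated assumptions: `IdFilterSketch`, `Segment`, and the `postings` map are hypothetical stand-ins for a segment reader and its term-to-documents lists, and the sample segment is invented for the illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the per-segment logic in a custom ID filter.
// NOT Lucene code: Segment and its postings map are hypothetical stand-ins.
final class IdFilterSketch {

    static final class Segment {
        final int docBase;                 // offset of this segment within the whole index
        final int maxDoc;                  // number of document slots in this segment
        final Map<String, int[]> postings; // term -> segment-relative doc IDs, ascending

        Segment(int docBase, int maxDoc, Map<String, int[]> postings) {
            this.docBase = docBase;
            this.maxDoc = maxDoc;
            this.postings = postings;
        }
    }

    // Sets one bit per requested term: the first segment-relative doc that holds it.
    static BitSet docIdSet(Segment seg, String... terms) {
        BitSet bits = new BitSet(seg.maxDoc);
        for (String t : terms) {
            int[] docs = seg.postings.get(t);
            if (docs != null && docs.length > 0) {
                bits.set(docs[0]); // later docs for a repeated term are ignored
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<String, int[]>();
        postings.put("5", new int[] {0});
        postings.put("6", new int[] {1, 2}); // a repeated ID, to show the limitation
        Segment second = new Segment(4, 3, postings);

        BitSet local = docIdSet(second, "5", "6", "9");
        System.out.println(local); // {0, 1}: doc 2 is dropped, term "9" is absent
        for (int d = local.nextSetBit(0); d >= 0; d = local.nextSetBit(d + 1)) {
            System.out.println("global doc " + (second.docBase + d));
        }
    }
}
```

The bit set is sized and indexed per segment; the segment offset only matters when translating a local document number into an index-wide one, as the final loop does.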
Test query code
Java code
    MyCustomFilter filter = new MyCustomFilter("3", "5", "2"); // any number of IDs to keep can be passed
    TopDocs topDocs = searcher.search(new MatchAllDocsQuery(), filter, 10000);
Output Results
2   1      Kingdoms                  sgyy   novel       36.13  201207
3   1      Database Combat           sjksz  technology  77.13  200811
5   1      Workplace Relations       zcgxl  career      36.59  200501
Although a custom filter has these drawbacks, it can be very flexible in some scenarios, especially for fields that are not tokenized.