Yesterday we covered the spell part of the suggest package, mainly the spelling checker and its similar-term hints.
Today we look at associative words (suggestions). Lucene's suggestion support lives in the org.apache.lucene.search.suggest package and provides auto-completion and associative hints.
InputIterator description
InputIterator is an interface that enumerates (term, weight, payload) triples for a suggester to consume. Currently only three suggesters support payloads: AnalyzingSuggester, FuzzySuggester and AnalyzingInfixSuggester.
InputIterator has several implementation classes:
BufferedInputIterator: iterates over binary (BytesRef) input that has been buffered in memory;
DocumentInputIterator: iterates over stored fields read from the index;
FileIterator: reads one line at a time from a file, with fields separated by \t (at most two \t per line);
HighFrequencyIterator: iterates over the terms of an indexed field, skipping terms whose document frequency is below the configured threshold;
InputIteratorWrapper: wraps a BytesRefIterator; the entries it returns carry no payload and every weight is 1;
SortedInputIterator: iterates over binary input sorted with the specified comparator;
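The line format that FileIterator reads can be illustrated with a minimal sketch in plain Java. It is not Lucene's parser: SuggestLine is a hypothetical class, and the sketch assumes that a missing weight defaults to 1 and that the payload is simply the UTF-8 bytes of the third field.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder for one parsed line; not a Lucene class.
final class SuggestLine {
    final String term;
    final long weight;
    final byte[] payload; // null when the line has no third field

    SuggestLine(String term, long weight, byte[] payload) {
        this.term = term;
        this.weight = weight;
        this.payload = payload;
    }

    // Parses "term", "term\tweight" or "term\tweight\tpayload".
    static SuggestLine parse(String line) {
        // Limit -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\t", -1);
        if (fields.length > 3) {
            throw new IllegalArgumentException("at most 2 tabs allowed: " + line);
        }
        // Assumption: a line without a weight gets weight 1.
        long weight = fields.length >= 2 ? Long.parseLong(fields[1].trim()) : 1L;
        byte[] payload = fields.length == 3
                ? fields[2].getBytes(StandardCharsets.UTF_8)
                : null;
        return new SuggestLine(fields[0], weight, payload);
    }
}
```

For example, SuggestLine.parse("guitar\t100\tsku-1") yields the term "guitar" with weight 100 and a five-byte payload, while a bare "soda" falls back to weight 1 and no payload.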
InputIterator provides the following methods:
weight(): returns the weight of the current term; the higher the weight, the higher the suggestion's priority;
payload(): returns the binary metadata attached to each suggestion. We convert the object, or one of its properties, to a BytesRef, and the suggester's lookup call returns that payload with the result;
hasPayloads(): tells whether the iterator carries payloads;
contexts(): returns the contexts of the current term, which are used to filter suggestions; it may return null if the entries have no contexts;
hasContexts(): tells whether the iterator carries contexts;
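The object-to-bytes conversion that payload() relies on can be sketched with the JDK alone. PayloadCodec is a hypothetical helper name, not part of Lucene; the byte array it produces is what one would wrap in a BytesRef, and the offset and length parameters mirror the fact that a BytesRef points at a slice of its backing array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper for illustration; not a Lucene class.
final class PayloadCodec {
    private PayloadCodec() {}

    // Serializes any Serializable value into a byte array.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(value);
        }
        return bos.toByteArray();
    }

    // Reads the object back from the given slice of the array.
    static Object fromBytes(byte[] bytes, int offset, int length)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes, offset, length))) {
            return in.readObject();
        }
    }
}
```

Java serialization keeps the sketch short; a compact format of your own (for example just an ID) would make smaller payloads.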
Suggester query tool: Lookup class description
This class provides suggestion (associative) queries for a string.
The Lookup class provides a CharSequenceComparator, which is used to sort CharSequence values in character order;
The nested LookupResult class holds one suggestion result, and results are also ordered by key using the CharSequenceComparator;
The nested LookupPriorityQueue class is used to store LookupResult objects;
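To make "sorted by character order" concrete, here is a small sketch of such a comparator in plain Java. CharOrder and BY_CHARS are hypothetical names standing in for the idea; this is not Lucene's own implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a character-order comparator; not a Lucene class.
final class CharOrder {
    private CharOrder() {}

    // Compares char by char; if one sequence is a prefix of the other,
    // the shorter one sorts first.
    static final Comparator<CharSequence> BY_CHARS = (a, b) -> {
        int shared = Math.min(a.length(), b.length());
        for (int i = 0; i < shared; i++) {
            int diff = Character.compare(a.charAt(i), b.charAt(i));
            if (diff != 0) {
                return diff;
            }
        }
        return Integer.compare(a.length(), b.length());
    };

    // Returns a sorted copy of the given keys.
    static List<CharSequence> sorted(CharSequence... keys) {
        List<CharSequence> out = new ArrayList<>(Arrays.asList(keys));
        out.sort(BY_CHARS);
        return out;
    }
}
```

Because it works on CharSequence, a String and a StringBuilder can be compared directly without converting either one first.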
Methods provided by Lookup
build(Dictionary dict): builds the lookup from the specified dictionary;
load(InputStream input): wraps the InputStream as a DataInput and calls load(DataInput);
store(OutputStream output): wraps the OutputStream as a DataOutput and calls store(DataOutput);
getCount(): returns the number of entries the lookup was built with;
build(InputIterator inputIterator): builds the lookup from the specified InputIterator;
lookup(CharSequence key, boolean onlyMorePopular, int num): returns the suggestions for key as a List<LookupResult>;
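The shape of these methods can be illustrated with a toy in-memory class. TinyLookup is a hypothetical stand-in that only borrows the method names; it does plain prefix matching over a sorted map and uses java.io.DataOutputStream and DataInputStream in place of Lucene's own DataOutput and DataInput types.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy that mirrors the method names above; not a Lucene class.
final class TinyLookup {
    private final TreeMap<String, Long> entries = new TreeMap<>();

    void build(Map<String, Long> termWeights) {
        entries.clear();
        entries.putAll(termWeights);
    }

    long getCount() {
        return entries.size();
    }

    // Returns up to num terms that start with key, heaviest first.
    List<String> lookup(String key, int num) {
        List<Map.Entry<String, Long>> hits = new ArrayList<>();
        for (Map.Entry<String, Long> e : entries.tailMap(key, true).entrySet()) {
            if (!e.getKey().startsWith(key)) {
                break; // the map is sorted, so we are past the prefix range
            }
            hits.add(e);
        }
        hits.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        List<String> out = new ArrayList<>();
        for (int i = 0; i < Math.min(num, hits.size()); i++) {
            out.add(hits.get(i).getKey());
        }
        return out;
    }

    // Wraps the raw stream in a data stream, as store(OutputStream) is described to do.
    void store(OutputStream output) throws IOException {
        DataOutputStream data = new DataOutputStream(output);
        data.writeInt(entries.size());
        for (Map.Entry<String, Long> e : entries.entrySet()) {
            data.writeUTF(e.getKey());
            data.writeLong(e.getValue());
        }
        data.flush();
    }

    // Wraps the raw stream in a data stream, as load(InputStream) is described to do.
    void load(InputStream input) throws IOException {
        DataInputStream data = new DataInputStream(input);
        entries.clear();
        int count = data.readInt();
        for (int i = 0; i < count; i++) {
            String term = data.readUTF();
            entries.put(term, data.readLong());
        }
    }
}
```

The point of store and load is that an expensive build only has to run once: the built structure is written out and read back on the next start.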
The related implementations of Lookup are as follows:
Writing your own suggest module
Note: for suggest we need to add lucene-misc-5.1.0.jar to the classpath, otherwise the system reports that class SortingMergePolicy cannot be found;
First we define our own entity class:
package com.lucene.suggest;

import java.io.Serializable;

public class Product implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name;
    private String image;
    private String[] regions;
    private int numberSold;

    public Product(String name, String image, String[] regions, int numberSold) {
        this.name = name;
        this.image = image;
        this.regions = regions;
        this.numberSold = numberSold;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getImage() {
        return image;
    }

    public void setImage(String image) {
        this.image = image;
    }

    public String[] getRegions() {
        return regions;
    }

    public void setRegions(String[] regions) {
        this.regions = regions;
    }

    public int getNumberSold() {
        return numberSold;
    }

    public void setNumberSold(int numberSold) {
        this.numberSold = numberSold;
    }
}
Then we define the InputIterator. Here the input it consumes is a list of objects (List<Product>), and each element is serialized into the payload as the list is traversed:
package com.lucene.suggest;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

import org.apache.lucene.search.suggest.InputIterator;
import org.apache.lucene.util.BytesRef;

public class ProductIterator implements InputIterator {
    private Iterator<Product> productIterator;
    // The product returned by the most recent call to next().
    private Product currentProduct;

    ProductIterator(Iterator<Product> productIterator) {
        this.productIterator = productIterator;
    }

    public boolean hasContexts() {
        return true;
    }

    /** Whether payload information is set. */
    public boolean hasPayloads() {
        return true;
    }

    public Comparator<BytesRef> getComparator() {
        return null;
    }

    /** Advances to the next product and returns its name as the term, or null at the end. */
    public BytesRef next() {
        if (productIterator.hasNext()) {
            currentProduct = productIterator.next();
            try {
                return new BytesRef(currentProduct.getName().getBytes("UTF8"));
            } catch (UnsupportedEncodingException e) {
                throw new RuntimeException("Couldn't convert to UTF-8", e);
            }
        } else {
            return null;
        }
    }

    /** Serializes the whole current product as the payload. */
    public BytesRef payload() {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bos);
            out.writeObject(currentProduct);
            out.close();
            return new BytesRef(bos.toByteArray());
        } catch (IOException e) {
            throw new RuntimeException("Well that's unfortunate.", e);
        }
    }

    /** Uses the regions of the current product as its contexts. */
    public Set<BytesRef> contexts() {
        try {
            Set<BytesRef> regions = new HashSet<BytesRef>();
            for (String region : currentProduct.getRegions()) {
                regions.add(new BytesRef(region.getBytes("UTF8")));
            }
            return regions;
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException("Couldn't convert to UTF-8", e);
        }
    }

    /** Uses the number sold as the weight, so best sellers rank first. */
    public long weight() {
        return currentProduct.getNumberSold();
    }
}
Writing the test class
package com.lucene.suggest;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.suggest.Lookup.LookupResult;
import org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public class SuggestProducts {

    private static void lookup(AnalyzingInfixSuggester suggester, String name, String region)
            throws IOException {
        HashSet<BytesRef> contexts = new HashSet<BytesRef>();
        contexts.add(new BytesRef(region.getBytes("UTF8")));
        // At most 2 results, all terms required, no highlighting.
        List<LookupResult> results = suggester.lookup(name, contexts, 2, true, false);
        System.out.println("-- \"" + name + "\" (" + region + "):");
        for (LookupResult result : results) {
            System.out.println(result.key);
            BytesRef bytesRef = result.payload;
            // Read only the valid slice of the backing array.
            ObjectInputStream is = new ObjectInputStream(
                    new ByteArrayInputStream(bytesRef.bytes, bytesRef.offset, bytesRef.length));
            Product product;
            try {
                product = (Product) is.readObject();
            } catch (ClassNotFoundException e) {
                throw new IOException("Could not deserialize the Product payload", e);
            }
            System.out.println("product-name: " + product.getName());
            System.out.println("product-regions: " + Arrays.toString(product.getRegions()));
            System.out.println("product-image: " + product.getImage());
            System.out.println("product-numberSold: " + product.getNumberSold());
        }
        System.out.println();
    }

    public static void main(String[] args) {
        try {
            Directory indexDir = FSDirectory.open(Paths.get("suggestPath"));
            StandardAnalyzer analyzer = new StandardAnalyzer();
            AnalyzingInfixSuggester suggester = new AnalyzingInfixSuggester(indexDir, analyzer);

            ArrayList<Product> products = new ArrayList<Product>();
            products.add(new Product("Electric Guitar",
                    "http://images.example/electric-guitar.jpg", new String[] { "US", "CA" }, 100));
            products.add(new Product("Electric Train",
                    "http://images.example/train.jpg", new String[] { "US", "CA" }, 100));
            products.add(new Product("Acoustic Guitar",
                    "http://images.example/acoustic-guitar.jpg", new String[] { "US", "ZA" }, 80));
            products.add(new Product("Guarana Soda",
                    "http://images.example/soda.jpg", new String[] { "ZA", "IE" }, 130));

            suggester.build(new ProductIterator(products.iterator()));

            lookup(suggester, "Gu", "US");
            lookup(suggester, "Gu", "ZA");
            lookup(suggester, "Gui", "CA");
            lookup(suggester, "Electric guit", "US");
            suggester.refresh();
        } catch (IOException e) {
            System.err.println("Error!");
        }
    }
}
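To see what the test is trying to exercise without running Lucene, here is a plain-Java simulation of "match on any word of the name, filter by region, order by weight". ContextFilterSketch and everything in it are hypothetical; the real AnalyzingInfixSuggester analyzes and ranks text in its own way, so its actual output may differ from this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical simulation of infix match + context filter + weight order.
// This is not Lucene's algorithm.
final class ContextFilterSketch {
    static final class Item {
        final String name;
        final List<String> regions;
        final long weight;

        Item(String name, long weight, String... regions) {
            this.name = name;
            this.weight = weight;
            this.regions = Arrays.asList(regions);
        }
    }

    // True when every query token is a prefix of some word in the name.
    static boolean matches(String name, String query) {
        String[] words = name.toLowerCase(Locale.ROOT).split("\\s+");
        for (String token : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            boolean found = false;
            for (String word : words) {
                if (word.startsWith(token)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    // Keeps items sold in the region that match the query, heaviest first, at most num.
    static List<String> suggest(List<Item> items, String query, String region, int num) {
        List<Item> hits = new ArrayList<>();
        for (Item item : items) {
            if (item.regions.contains(region) && matches(item.name, query)) {
                hits.add(item);
            }
        }
        hits.sort((a, b) -> Long.compare(b.weight, a.weight));
        List<String> out = new ArrayList<>();
        for (Item hit : hits.subList(0, Math.min(num, hits.size()))) {
            out.add(hit.name);
        }
        return out;
    }
}
```

With the four products from the test class, this sketch returns "Guarana Soda" and then "Acoustic Guitar" for the query "Gu" in region "ZA", because only those two are sold in ZA and the soda has the higher number sold.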
The code will be released tomorrow.
"Step by step with me to learn Lucene" is a summary of my recent work with Lucene indexing. If you have questions, contact me on QQ: 891922381. I have also created a QQ group: 106570134 (Lucene, Solr, Netty, Hadoop); I would greatly appreciate it if you joined so that we can discuss together. I aim to post every day and hope you will keep following; there will be surprises for you.
Step by step with me to learn Lucene (10): the principle and application of suggest, the associative word hints of Lucene search