Lucene 3.5: custom scoring, and custom scoring based on a field

Source: Internet
Author: User

I. Review of the Lucene custom scoring steps:
1. Create a scoring field
    FieldScoreQuery fd = new FieldScoreQuery("score", Type.INT);
2. Create a custom query object based on the scoring field and the original query
    MyCustomScoreQuery query = new MyCustomScoreQuery(q, fd);

    @SuppressWarnings("serial")
    private class MyCustomScoreQuery extends CustomScoreQuery {
        public MyCustomScoreQuery(Query subQuery, ValueSourceQuery valSrcQuery) {
            super(subQuery, valSrcQuery);
        }

        @Override
        protected CustomScoreProvider getCustomScoreProvider(IndexReader reader) throws IOException {
            // The default provider computes: original score * scoring-field value.
            // To score differently, supply your own provider:
            //   1. create a class that extends CustomScoreProvider
            //   2. override its customScore method
            // return super.getCustomScoreProvider(reader);
            return new MyCustomScoreProvider(reader);
        }
    }
3. Create a class that extends CustomScoreProvider and overrides the customScore method
    private class MyCustomScoreProvider extends CustomScoreProvider {
        public MyCustomScoreProvider(IndexReader reader) {
            super(reader);
        }

        /**
         * subQueryScore is the document's default score from the original query;
         * valSrcScore is the value of the scoring field.
         */
        @Override
        public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
            // return super.customScore(doc, subQueryScore, valSrcScore);
            return subQueryScore / valSrcScore;
        }
    }
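
As a minimal sketch of the arithmetic only (no Lucene classes involved), the snippet below contrasts the default combination mentioned in the comments, original score times field value, with the division used in step 3. The names ScoreCombination, multiply, divide and divideGuarded are hypothetical. The zero guard is an assumption added for illustration: a scoring field can hold 0, and float division by zero yields infinity rather than throwing.

```java
/** Hypothetical helper, not part of Lucene: models how two scores can be combined. */
final class ScoreCombination {
    private ScoreCombination() {}

    /** The default combination described above: original score * field value. */
    static float multiply(float subQueryScore, float valSrcScore) {
        return subQueryScore * valSrcScore;
    }

    /** The combination used in step 3: original score / field value. */
    static float divide(float subQueryScore, float valSrcScore) {
        return subQueryScore / valSrcScore;
    }

    /** Assumed variant: keeps the original score when the field value is 0. */
    static float divideGuarded(float subQueryScore, float valSrcScore) {
        return valSrcScore == 0f ? subQueryScore : subQueryScore / valSrcScore;
    }

    public static void main(String[] args) {
        System.out.println(multiply(0.5f, 4f));
        System.out.println(divide(0.5f, 4f));
        System.out.println(divide(0.5f, 0f));        // infinite, no exception
        System.out.println(divideGuarded(0.5f, 0f)); // falls back to the original score
    }
}
```

Whether a guard like this is wanted depends on how the scoring field is populated; it is shown here only to make the edge case visible.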

II. Custom scoring based on a field
1. Custom scoring by file name suffix
    private class FilenameScoreQuery extends CustomScoreQuery {
        public FilenameScoreQuery(Query subQuery) {
            super(subQuery);
        }

        @Override
        protected CustomScoreProvider getCustomScoreProvider(IndexReader reader) throws IOException {
            // return super.getCustomScoreProvider(reader);
            return new FilenameScoreProvider(reader);
        }
    }

    private class FilenameScoreProvider extends CustomScoreProvider {
        String[] filenames = null;

        public FilenameScoreProvider(IndexReader reader) {
            super(reader);
            try {
                // Until the reader is closed, field values are available through the
                // field cache; this loads the "filename" value of every document.
                filenames = FieldCache.DEFAULT.getStrings(reader, "filename");
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
            // Look up this document's field value by its doc id.
            String filename = filenames[doc];
            if (filename.endsWith(".txt") || filename.endsWith(".ini")) {
                return subQueryScore * 1.5f;
            }
            // return super.customScore(doc, subQueryScore, valSrcScore);
            return subQueryScore / 1.5f;
        }
    }
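
The suffix rule itself can be tried without an index. The following is a small sketch with a hypothetical helper (SuffixBoost.apply is not a Lucene API) that applies the same 1.5 factor. The null check is an assumption: it covers a document that has no filename value, a case the provider above does not handle.

```java
/** Hypothetical helper, not part of Lucene: the suffix rule from the provider above. */
final class SuffixBoost {
    private SuffixBoost() {}

    static float apply(String filename, float subQueryScore) {
        // Assumption: a document without a filename value keeps its original score.
        if (filename == null) {
            return subQueryScore;
        }
        // Boost .txt and .ini files, demote everything else.
        if (filename.endsWith(".txt") || filename.endsWith(".ini")) {
            return subQueryScore * 1.5f;
        }
        return subQueryScore / 1.5f;
    }

    public static void main(String[] args) {
        System.out.println(apply("notes.txt", 2f));
        System.out.println(apply("Main.java", 3f));
    }
}
```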
2. Custom scoring by date
    private class DateScoreProvider extends CustomScoreProvider {
        long[] dates = null;

        public DateScoreProvider(IndexReader reader) {
            super(reader);
            try {
                dates = FieldCache.DEFAULT.getLongs(reader, "date");
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
            long date = dates[doc];
            long today = new Date().getTime();
            // One year in milliseconds; computed as long to avoid int overflow.
            long year = 1000L * 60 * 60 * 24 * 365;
            if (today - date <= year) {
                // Add a boost here for documents modified within the last year.
            }
            return super.customScore(doc, subQueryScore, valSrcScore);
        }
    }
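
A year in milliseconds is 1000*60*60*24*365 = 31,536,000,000, which does not fit in an int, so the arithmetic has to be done in long. The sketch below shows one way the date rule could be completed. RecencyBoost and the 1.5 factor are assumptions for illustration, not something the original code specifies.

```java
/** Hypothetical helper, not part of Lucene: boosts documents modified within the last year. */
final class RecencyBoost {
    private RecencyBoost() {}

    /** 365 days in milliseconds, computed in long arithmetic by TimeUnit. */
    static final long YEAR_MS = java.util.concurrent.TimeUnit.DAYS.toMillis(365);

    static float apply(long lastModifiedMs, long nowMs, float subQueryScore) {
        if (nowMs - lastModifiedMs <= YEAR_MS) {
            return subQueryScore * 1.5f; // assumed boost factor
        }
        return subQueryScore;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(apply(now - 1000L, now, 2f));       // recent document
        System.out.println(apply(now - 2 * YEAR_MS, now, 2f)); // old document
    }
}
```

Passing the current time in as a parameter, rather than calling new Date() inside the method, keeps the rule easy to test.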

The key ideas behind custom scoring in Lucene: pass a CustomScoreQuery to IndexSearcher.search; in that query, override the getCustomScoreProvider method so that it returns a CustomScoreProvider object; and in the provider, which can be a named class or an anonymous inner class, override the customScore method. This method takes three parameters: the first is the document ID, the second is the original score, and the third is the value of the scoring field we set. With these we can define our own scoring algorithm for the search.
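
The anonymous-inner-class variant can be sketched as follows. To keep the snippet compilable without Lucene on the classpath, ProviderSketch and QuerySketch are hypothetical stand-ins for CustomScoreProvider and CustomScoreQuery; they mimic only the two methods that get overridden and are not the real classes. The halving rule is likewise just an example.

```java
/** Hypothetical stand-in for CustomScoreProvider (not the Lucene class). */
class ProviderSketch {
    float customScore(int doc, float subQueryScore, float valSrcScore) {
        return subQueryScore * valSrcScore; // default combination
    }
}

/** Hypothetical stand-in for CustomScoreQuery (not the Lucene class). */
class QuerySketch {
    ProviderSketch getCustomScoreProvider() {
        return new ProviderSketch();
    }

    /** Scores one document through whichever provider the query supplies. */
    final float score(int doc, float subQueryScore, float valSrcScore) {
        return getCustomScoreProvider().customScore(doc, subQueryScore, valSrcScore);
    }
}

final class AnonymousProviderDemo {
    private AnonymousProviderDemo() {}

    static QuerySketch halfScoreQuery() {
        // Override the factory method and return an anonymous provider subclass.
        return new QuerySketch() {
            @Override
            ProviderSketch getCustomScoreProvider() {
                return new ProviderSketch() {
                    @Override
                    float customScore(int doc, float subQueryScore, float valSrcScore) {
                        return subQueryScore / 2f; // example rule; ignores the field value
                    }
                };
            }
        };
    }
}
```

With real Lucene classes the shape is the same: the outer anonymous class extends CustomScoreQuery and the inner one extends CustomScoreProvider, taking the IndexReader in its constructor call.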
The complete code is as follows:
1. Utility class:
package com.dhb.util;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Random;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.util.Version;

public class FileIndexUtils {
    private static Directory directory = null;

    static {
        try {
            directory = FSDirectory.open(new File("D:/lucenedata/files/"));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static Directory getDirectory() {
        return directory;
    }

    public static void index(boolean hasNew) {
        IndexWriter writer = null;
        try {
            IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_35,
                    new StandardAnalyzer(Version.LUCENE_35));
            writer = new IndexWriter(directory, iwc);
            // Rebuild the index from scratch if requested
            if (hasNew) {
                writer.deleteAll();
            }
            Document doc = null;
            File f = new File("D:/lucenedata/example");
            Random rand = new Random();
            int index = 0;
            for (File file : f.listFiles()) {
                int score = rand.nextInt(600); // used to test custom scoring
                doc = new Document();
                // "id" is used to test custom filters
                doc.add(new Field("id", String.valueOf(index++), Field.Store.YES,
                        Field.Index.NOT_ANALYZED_NO_NORMS));
                doc.add(new Field("content", new FileReader(file)));
                doc.add(new Field("filename", file.getName(), Field.Store.YES,
                        Field.Index.NOT_ANALYZED));
                doc.add(new Field("path", file.getAbsolutePath(), Field.Store.YES,
                        Field.Index.NOT_ANALYZED));
                doc.add(new NumericField("date", Field.Store.YES, true)
                        .setLongValue(file.lastModified()));
                doc.add(new NumericField("size", Field.Store.YES, true)
                        .setIntValue((int) (file.length())));
                doc.add(new NumericField("score", Field.Store.YES, true)
                        .setIntValue(score));
                writer.addDocument(doc);
            }
        } catch (CorruptIndexException e) {
            e.printStackTrace();
        } catch (LockObtainFailedException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (writer != null) {
                try {
                    writer.close();
                } catch (CorruptIndexException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
Note: build the index yourself before searching. I do not call the index method here because that call lives elsewhere in my project and is not posted.
2. Custom classes
package com.dhb.util;

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.function.CustomScoreProvider;
import org.apache.lucene.search.function.CustomScoreQuery;
import org.apache.lucene.search.function.FieldScoreQuery;
import org.apache.lucene.search.function.FieldScoreQuery.Type;
import org.apache.lucene.search.function.ValueSourceQuery;

public class MyScoreQuery {

    public void searchByScoreQuery() {
        try {
            IndexSearcher searcher = new IndexSearcher(IndexReader.open(FileIndexUtils.getDirectory()));
            // StandardAnalyzer lowercases tokens, so the term is lowercase
            Query q = new TermQuery(new Term("content", "java"));
            // 1. Create a scoring field
            FieldScoreQuery fd = new FieldScoreQuery("score", Type.INT);
            // 2. Create a custom query object based on the scoring field and the original query
            MyCustomScoreQuery query = new MyCustomScoreQuery(q, fd);
            TopDocs tds = searcher.search(query, 100);
            SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            for (ScoreDoc sd : tds.scoreDocs) {
                Document d = searcher.doc(sd.doc);
                System.out.println(sd.doc + ":(" + sd.score + ") [" + d.get("filename") + " "
                        + d.get("path") + "]---" + d.get("size") + "----"
                        + sdf.format(Long.valueOf(d.get("date")))
                        + " custom score: " + d.get("score"));
            }
            searcher.close();
        } catch (CorruptIndexException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void searchByFileScoreQuery() {
        try {
            IndexSearcher searcher = new IndexSearcher(IndexReader.open(FileIndexUtils.getDirectory()));
            Query q = new TermQuery(new Term("content", "java"));
            // 1. Create a scoring field
            // FieldScoreQuery fd = new FieldScoreQuery("score", Type.INT);
            FilenameScoreQuery query = new FilenameScoreQuery(q);
            // 2. Create a custom query object based on the scoring field and the original query
            // MyCustomScoreQuery query = new MyCustomScoreQuery(q, fd);
            TopDocs tds = searcher.search(query, 100);
            SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            for (ScoreDoc sd : tds.scoreDocs) {
                Document d = searcher.doc(sd.doc);
                System.out.println(sd.doc + ":(" + sd.score + ") [" + d.get("filename") + " "
                        + d.get("path") + "]---" + d.get("size") + "----"
                        + sdf.format(Long.valueOf(d.get("date")))
                        + " custom score: " + d.get("score"));
            }
            searcher.close();
        } catch (CorruptIndexException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @SuppressWarnings("serial")
    private class MyCustomScoreQuery extends CustomScoreQuery {
        public MyCustomScoreQuery(Query subQuery, ValueSourceQuery valSrcQuery) {
            super(subQuery, valSrcQuery);
        }

        @Override
        protected CustomScoreProvider getCustomScoreProvider(IndexReader reader) throws IOException {
            // The default provider computes: original score * scoring-field value.
            // To score differently, supply your own provider:
            //   1. create a class that extends CustomScoreProvider
            //   2. override its customScore method
            // return super.getCustomScoreProvider(reader);
            return new MyCustomScoreProvider(reader);
        }
    }

    private class MyCustomScoreProvider extends CustomScoreProvider {
        public MyCustomScoreProvider(IndexReader reader) {
            super(reader);
        }

        /**
         * subQueryScore is the document's default score from the original query;
         * valSrcScore is the value of the scoring field.
         */
        @Override
        public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
            // return super.customScore(doc, subQueryScore, valSrcScore);
            return subQueryScore / valSrcScore;
        }
    }

    @SuppressWarnings("serial")
    private class FilenameScoreQuery extends CustomScoreQuery {
        public FilenameScoreQuery(Query subQuery) {
            super(subQuery);
        }

        @Override
        protected CustomScoreProvider getCustomScoreProvider(IndexReader reader) throws IOException {
            // return super.getCustomScoreProvider(reader);
            return new FilenameScoreProvider(reader);
        }
    }

    private class FilenameScoreProvider extends CustomScoreProvider {
        String[] filenames = null;

        public FilenameScoreProvider(IndexReader reader) {
            super(reader);
            try {
                // Until the reader is closed, field values are available through the
                // field cache; this loads the "filename" value of every document.
                filenames = FieldCache.DEFAULT.getStrings(reader, "filename");
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
            // Look up this document's field value by its doc id.
            String filename = filenames[doc];
            if (filename.endsWith(".txt") || filename.endsWith(".ini")) {
                return subQueryScore * 1.5f;
            }
            // return super.customScore(doc, subQueryScore, valSrcScore);
            return subQueryScore / 1.5f;
        }
    }

    @SuppressWarnings("unused")
    private class DateScoreProvider extends CustomScoreProvider {
        long[] dates = null;

        public DateScoreProvider(IndexReader reader) {
            super(reader);
            try {
                dates = FieldCache.DEFAULT.getLongs(reader, "date");
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
            long date = dates[doc];
            long today = new Date().getTime();
            // One year in milliseconds; computed as long to avoid int overflow.
            long year = 1000L * 60 * 60 * 24 * 365;
            if (today - date <= year) {
                // Add a boost here for documents modified within the last year.
            }
            return super.customScore(doc, subQueryScore, valSrcScore);
        }
    }
}
3. Test class
package com.dhb.test;

import org.junit.Test;

import com.dhb.util.MyScoreQuery;

public class TestCustomScore {

    @Test
    public void test01() {
        MyScoreQuery msq = new MyScoreQuery();
        msq.searchByScoreQuery();
    }

    @Test
    public void test02() {
        MyScoreQuery msq = new MyScoreQuery();
        msq.searchByFileScoreQuery();
    }
}







