Implementing custom scoring in Lucene

Last Update:2018-07-26 Source: Internet

Author: User


1. Project directory structure

2. Custom scoring, example 1: score by file size; the larger the file, the lower its score

package util;

import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.function.CustomScoreProvider;
import org.apache.lucene.search.function.CustomScoreQuery;
import org.apache.lucene.search.function.FieldScoreQuery;
import org.apache.lucene.search.function.FieldScoreQuery.Type;
import org.apache.lucene.search.function.ValueSourceQuery;

public class MyScoreQuery1 {

	public void searchByScoreQuery() throws Exception {
		IndexSearcher searcher = DocUtil.getSearcher();
		Query query = new TermQuery(new Term("content", "java"));

		// 1. Create the scoring field. The field must be numeric, must not use
		// norms, and may hold only one token per document; it is usually indexed
		// with Field.Index.NOT_ANALYZED_NO_NORMS. (The original note adds that
		// Type.BYTE is used when the field type is a string.)
		FieldScoreQuery fieldScoreQuery = new FieldScoreQuery("size", Type.INT);

		// 2. Create the custom query from the original query and the scoring field.
		// query is the original query; fieldScoreQuery supplies the scoring value.
		MyCustomScoreQuery customQuery = new MyCustomScoreQuery(query, fieldScoreQuery);

		TopDocs topDoc = searcher.search(customQuery, 100);
		DocUtil.printDocument(topDoc, searcher);
		searcher.close();
	}

	@SuppressWarnings("serial")
	private class MyCustomScoreQuery extends CustomScoreQuery {

		public MyCustomScoreQuery(Query subQuery, ValueSourceQuery valSrcQuery) {
			super(subQuery, valSrcQuery);
		}

		/**
		 * The reader passed in here is per segment: if the index contains more
		 * than one segment, this method is called several times during a search.
		 * This matters because it lets the scoring logic use the segment reader
		 * to retrieve values from the field cache efficiently.
		 */
		@Override
		protected CustomScoreProvider getCustomScoreProvider(IndexReader reader) throws IOException {
			// By default the final score is the original score multiplied by the
			// value of the scoring field. To score differently:
			//   1. create a class that extends CustomScoreProvider;
			//   2. override its customScore method.
			// return super.getCustomScoreProvider(reader);
			return new MyCustomScoreProvider(reader);
		}
	}

	private class MyCustomScoreProvider extends CustomScoreProvider {

		public MyCustomScoreProvider(IndexReader reader) {
			super(reader);
		}

		/**
		 * subQueryScore is the document's default score; valSrcScore is the value
		 * of the scoring field. The default implementation returns
		 * subQueryScore * valSrcScore.
		 */
		@Override
		public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
			System.out.println("doc: " + doc);
			System.out.println("subQueryScore: " + subQueryScore);
			System.out.println("valSrcScore: " + valSrcScore);
			// return super.customScore(doc, subQueryScore, valSrcScore);
			// Dividing by the file size gives larger files a lower score.
			return subQueryScore / valSrcScore;
		}
	}
}
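
The effect of the division in customScore can be tried outside Lucene. The following is a minimal sketch in plain Java, not Lucene code: the class FileSizeScoreSketch and the method scoreBySize are hypothetical names introduced only for illustration. The guard for sizes of zero or less is an added assumption; the code above divides unconditionally, so a document whose size field is 0 would receive an infinite score.

```java
class FileSizeScoreSketch {

    /**
     * Mirrors subQueryScore / valSrcScore from the example above.
     * Assumption: for a size of zero or less the original score is
     * returned unchanged instead of dividing by zero.
     */
    static float scoreBySize(float subQueryScore, float sizeInBytes) {
        if (sizeInBytes <= 0f) {
            return subQueryScore;
        }
        return subQueryScore / sizeInBytes;
    }

    public static void main(String[] args) {
        float relevance = 0.8f; // same text relevance for both files
        float small = scoreBySize(relevance, 100f);
        float large = scoreBySize(relevance, 1000f);
        // The smaller file ends up with the higher final score.
        System.out.println("small file ranks first: " + (small > large));
    }
}
```

Because the raw byte count is used as the divisor, file size dominates the text relevance very quickly; whether that is desirable depends on the application.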

3. Custom scoring, example 2: score by file name; selected file names receive a higher score

package util;

import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.function.CustomScoreProvider;
import org.apache.lucene.search.function.CustomScoreQuery;

/**
 * Boosts the score of documents with specific file names. The same approach
 * could be used in a book search to boost books published in the last one or
 * two years.
 *
 * @author user
 */
public class MyScoreQuery2 {

	public void searchByFileScoreQuery() throws Exception {
		IndexSearcher searcher = DocUtil.getSearcher();
		Query query = new TermQuery(new Term("content", "java"));

		FilenameScoreQuery fieldScoreQuery = new FilenameScoreQuery(query);
		TopDocs topDoc = searcher.search(fieldScoreQuery, 100);
		DocUtil.printDocument(topDoc, searcher);
		searcher.close();
	}

	@SuppressWarnings("serial")
	private class FilenameScoreQuery extends CustomScoreQuery {

		public FilenameScoreQuery(Query subQuery) {
			super(subQuery);
		}

		@Override
		protected CustomScoreProvider getCustomScoreProvider(IndexReader reader) throws IOException {
			// return super.getCustomScoreProvider(reader);
			return new FilenameScoreProvider(reader);
		}
	}

	private class FilenameScoreProvider extends CustomScoreProvider {
		String[] filenames = null;

		public FilenameScoreProvider(IndexReader reader) {
			super(reader);
			try {
				// Until the reader is closed, the field values are held in the
				// field cache; this call returns the filename of every document.
				filenames = FieldCache.DEFAULT.getStrings(reader, "filename");
			} catch (IOException e) {
				e.printStackTrace();
			}
		}

		// Look up the field value for the given doc id and adjust the score.
		@Override
		public float customScore(int doc, float subQueryScore, float valSrcScore) throws IOException {
			String fileName = filenames[doc];
			System.out.println(doc + ":" + fileName);
			// return super.customScore(doc, subQueryScore, valSrcScore);
			if ("9.txt".equals(fileName) || "4.txt".equals(fileName)) {
				return subQueryScore * 1.5f;
			}
			return subQueryScore / 1.5f;
		}
	}
}
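
The boosting rule used in customScore above can be isolated from Lucene and checked on its own. This is a minimal sketch under stated assumptions: FileNameBoostSketch, BOOSTED, and boost are hypothetical names, and holding the selected names in a set is a design choice made here for illustration rather than something the original code does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class FileNameBoostSketch {

    // The two file names singled out in the example above.
    static final Set<String> BOOSTED = new HashSet<>(Arrays.asList("9.txt", "4.txt"));

    /**
     * Multiplies the score by 1.5 for a selected file name and divides it
     * by 1.5 for every other name, including a missing (null) name.
     */
    static float boost(String fileName, float subQueryScore) {
        if (BOOSTED.contains(fileName)) {
            return subQueryScore * 1.5f;
        }
        return subQueryScore / 1.5f;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"4.txt", "5.txt", "9.txt"}) {
            System.out.println(name + " -> " + boost(name, 1.0f));
        }
    }
}
```

Keeping the names in a set means the list of boosted files can grow without adding more equals comparisons to the scoring method.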

4. JUnit test

package test;

import org.junit.Test;

import util.MyScoreQuery1;
import util.MyScoreQuery2;

public class TestCustomScore {

	@Test
	public void test01() throws Exception {
		MyScoreQuery1 msq = new MyScoreQuery1();
		msq.searchByScoreQuery();
	}

	@Test
	public void test02() throws Exception {
		MyScoreQuery2 msq = new MyScoreQuery2();
		msq.searchByFileScoreQuery();
	}
}

5. Utility class for document operations

package util;

import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class DocUtil {
	private static IndexReader reader;

	// Get an IndexSearcher object
	public static IndexSearcher getSearcher() {
		try {
			Directory directory = FSDirectory.open(new File("D:\\workspaces\\customscore\\index"));
			reader = IndexReader.open(directory);
		} catch (CorruptIndexException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
		IndexSearcher searcher = new IndexSearcher(reader);
		return searcher;
	}

	/**
	 * Print document information
	 * @param topDoc
	 */
	public static void printDocument(TopDocs topDoc, IndexSearcher searcher) {
		// MM is the month and HH the 24-hour clock; lowercase mm means minutes.
		SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
		for (ScoreDoc scoreDoc : topDoc.scoreDocs) {
			try {
				Document doc = searcher.doc(scoreDoc.doc);
				System.out.println(scoreDoc.doc + ":(" + scoreDoc.score + ")"
						+ "[" + doc.get("filename") + " " + doc.get("path") + " ---> "
						+ doc.get("size") + "-----"
						+ sdf.format(new Date(Long.valueOf(doc.get("date")))) + "]");
			} catch (CorruptIndexException e) {
				e.printStackTrace();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
	}
}
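
The date field holds the file's last-modified time as epoch milliseconds, which the print method reads back as a string, parses to a long, and formats. The sketch below shows that round trip in isolation. DateFieldFormatSketch and formatStoredDate are hypothetical names, and fixing the time zone to UTC and the locale to Locale.ROOT are assumptions made here so that the output does not depend on the machine; the utility class above uses the platform defaults. Pattern letters in SimpleDateFormat are case-sensitive: MM is the month, mm is the minute, and HH is the hour on a 24-hour clock.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateFieldFormatSketch {

    /** Formats a stored epoch-millisecond string such as "0" as a UTC timestamp. */
    static String formatStoredDate(String storedMillis) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: fixed zone
        return sdf.format(new Date(Long.parseLong(storedMillis)));
    }

    public static void main(String[] args) {
        System.out.println(formatStoredDate("0")); // prints 1970-01-01 00:00:00
    }
}
```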

6. Create an index

package index;

import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.util.Version;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class FileIndexUtils {
	private static Directory directory = null;
	private static Analyzer analyzer = new IKAnalyzer();

	public static void main(String[] args) {
		index(true);
	}

	static {
		try {
			directory = FSDirectory.open(new File("D:\\workspaces\\customscore\\index"));
		} catch (IOException e) {
			e.printStackTrace();
		}
	}

	public static Directory getDirectory() {
		return directory;
	}

	public static void index(boolean hasNew) {
		IndexWriter writer = null;
		try {
			writer = new IndexWriter(directory, new IndexWriterConfig(Version.LUCENE_35, analyzer));
			if (hasNew) {
				// Rebuild from scratch: remove all existing documents first.
				writer.deleteAll();
			}
			File file = new File("D:\\workspaces\\customscore\\resource");
			Document doc = null;
			for (File f : file.listFiles()) {
				doc = new Document();
				// Field names are lowercase to match the names used when searching.
				doc.add(new Field("content", FileUtils.readFileToString(f), Field.Store.YES, Field.Index.ANALYZED));
				doc.add(new Field("filename", f.getName(), Field.Store.YES, Field.Index.ANALYZED));
				doc.add(new Field("classid", "5312", Field.Store.YES, Field.Index.ANALYZED));
				doc.add(new Field("path", f.getAbsolutePath(), Field.Store.YES, Field.Index.ANALYZED));
				doc.add(new NumericField("date", Field.Store.YES, true).setLongValue(f.lastModified()));
				doc.add(new NumericField("size", Field.Store.YES, true).setIntValue((int) (f.length())));
				writer.addDocument(doc);
			}
		} catch (CorruptIndexException e) {
			e.printStackTrace();
		} catch (LockObtainFailedException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		} finally {
			try {
				if (writer != null) writer.close();
			} catch (CorruptIndexException e) {
				e.printStackTrace();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
	}
}

Project Download path: http://download.csdn.net/detail/wxwzy738/5320772

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
