The previous post covered the analyzer (word breaker). Search results usually show one more effect, though: a snippet of the matching text with the keywords highlighted. So here we introduce the Highlighter.
Highlighter:
The Highlighter extracts a fragment of text and highlights the keywords in it by wrapping each match in a prefix and a suffix that you specify. Because the result is displayed in a web page, specifying <font color='red'> and </font> makes the keywords appear in red on the page.
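To make the prefix/suffix idea concrete, here is a minimal plain-Java sketch. It is not how Lucene implements highlighting (Lucene works on the tokens produced by the analyzer and scores fragments against the query); the class name HighlightSketch and the method highlight are hypothetical names chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Plain-Java illustration of prefix/suffix highlighting; NOT Lucene's implementation. */
class HighlightSketch {

    /** Wraps every case-insensitive occurrence of keyword in text with prefix and suffix. */
    static String highlight(String text, String keyword, String prefix, String suffix) {
        // Pattern.quote makes the keyword a literal instead of a regular expression
        Pattern pattern = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        StringBuilder result = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            result.append(text, last, matcher.start()); // unchanged text before the match
            result.append(prefix).append(matcher.group()).append(suffix);
            last = matcher.end();
        }
        result.append(text, last, text.length()); // remaining text after the last match
        return result.toString();
    }

    public static void main(String[] args) {
        // prints: in <font color='red'>iteye</font> blog
        System.out.println(highlight("in iteye blog", "Iteye", "<font color='red'>", "</font>"));
    }
}
```

The prefix and suffix are ordinary strings, so any markup the page understands could be used in place of the font tag.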
FirstLucene03ByHighlighter.java:

package com.iflytek.lucene;

import java.io.File;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Formatter;
import org.apache.lucene.search.highlight.Fragmenter;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.Scorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

/**
 * @author xudongwang 2012-2-10
 *
 *         Email: xdwangiflytek@gmail.com
 */
public class FirstLucene03ByHighlighter {
    /**
     * Source file path
     */
    private String filePath01 = "F:\\Workspaces\\workspaceSE\\BlogDemo\\luceneDatasource\\HelloLucene01.txt";

    /**
     * Index path
     */
    private String indexPath = "F:\\Workspaces\\workspaceSE\\BlogDemo\\luceneIndex";

    /**
     * Analyzer (word breaker). Here we use the default one, StandardAnalyzer
     * (there are several analyzers; this one's support for Chinese is poor).
     */
    private Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);

    private Directory ramDir = null;

    /**
     * Search
     *
     * @param queryStr
     *            search keywords
     * @throws Exception
     */
    public void search(String queryStr) throws Exception {
        // 1. Parse the text to be searched into a Query object
        String[] fields = { "name", "content" };
        QueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_35, fields, analyzer);
        Query query = queryParser.parse(queryStr);

        // 2. Run the query
        IndexReader indexReader = IndexReader.open(ramDir);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        Filter filter = null;
        TopDocs topDocs = indexSearcher.search(query, filter, 10000);
        // Note: the number of matches is the number of matching documents,
        // not the number of keyword occurrences inside those documents
        System.out.println("There are a total of \"" + topDocs.totalHits + "\" matching results");

        // Prepare the highlighter
        Formatter formatter = new SimpleHTMLFormatter("<font color='red'>", "</font>");
        Scorer scorer = new QueryScorer(query);
        Highlighter highlighter = new Highlighter(formatter, scorer);
        Fragmenter fragmenter = new SimpleFragmenter(10); // fragment size: 10 characters
        // The fragmenter decides whether a summary is generated and how long it is
        highlighter.setTextFragmenter(fragmenter);

        // 3. Fetch the data and print the results
        for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
            int docSn = scoreDoc.doc; // internal document number
            Document document = indexSearcher.doc(docSn); // fetch the document by its number

            // Highlighting: returns the highlighted fragment, or null if the
            // keyword does not appear in the value of this field
            String highlighterStr = highlighter.getBestFragment(analyzer, "content", document.get("content"));
            if (highlighterStr == null) {
                String content = document.get("content");
                int endIndex = Math.min(20, content.length());
                highlighterStr = content.substring(0, endIndex); // at most the first 20 characters
            }
            document.getField("content").setValue(highlighterStr);

            File2Document.printDocumentInfo(document); // print the document information
        }
    }

    /**
     * Optimized index creation: the in-memory index and the on-disk index are used together
     *
     * @throws Exception
     */
    public void createIndexByYouhua() throws Exception {
        File indexFile = new File(indexPath);
        Directory fsDir = FSDirectory.open(indexFile);

        // 1. On startup, read the index from disk into memory
        ramDir = new RAMDirectory(fsDir);
        IndexWriterConfig ramConf = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        // While the program runs, operate on the in-memory index
        IndexWriter ramIndexWriter = new IndexWriter(ramDir, ramConf);
        Document document = File2Document.file2Document(filePath01);
        ramIndexWriter.addDocument(document);
        ramIndexWriter.close();

        // 2. On exit, save the in-memory index to disk
        IndexWriterConfig fsConf = new IndexWriterConfig(Version.LUCENE_35, analyzer);
        IndexWriter fsIndexWriter = new IndexWriter(fsDir, fsConf);
        // Merge all index data from the other index directories into the current one
        fsIndexWriter.addIndexes(ramDir);
        fsIndexWriter.commit();
        // fsIndexWriter.optimize(); // optimize the index files to reduce IO operations
        fsIndexWriter.forceMerge(1);
        fsIndexWriter.close();
    }

    public static void main(String[] args) throws Exception {
        FirstLucene03ByHighlighter lucene = new FirstLucene03ByHighlighter();
        lucene.createIndexByYouhua();
        lucene.search("Iteye");
    }
}
Output:
There are a total of "1" matching results
name-->HelloLucene01.txt
content-->in <font color='red'>iteye</font> blog
path-->F:\Workspaces\workspaceSE\BlogDemo\luceneDatasource\HelloLucene01.txt
size-->84
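The summary logic in search() (take the best fragment, and fall back to at most the first 20 characters when there is no match) can be tried without an index. The following is only a rough plain-Java sketch under stated assumptions: bestFragment is a hypothetical stand-in that cuts a window around the first match and returns null when nothing matches, whereas Lucene's getBestFragment scores token-based fragments. The names SummarySketch, bestFragment and summary are made up for this example.

```java
/** Plain-Java sketch of "best fragment, else first N characters"; NOT Lucene's algorithm. */
class SummarySketch {

    /**
     * Hypothetical stand-in for a fragmenter: returns a window of at most about
     * fragmentSize characters around the first case-insensitive match, or null if none.
     */
    static String bestFragment(String text, String keyword, int fragmentSize) {
        for (int i = 0; i + keyword.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, keyword, 0, keyword.length())) {
                // Spread the characters left over after the keyword on both sides of it
                int padding = Math.max(0, (fragmentSize - keyword.length()) / 2);
                int start = Math.max(0, i - padding);
                int end = Math.min(text.length(), i + keyword.length() + padding);
                return text.substring(start, end);
            }
        }
        return null; // the keyword does not occur in the text
    }

    /** Same fallback as in search(): without a match, keep at most fallbackLength characters. */
    static String summary(String text, String keyword, int fragmentSize, int fallbackLength) {
        String fragment = bestFragment(text, keyword, fragmentSize);
        if (fragment == null) {
            int endIndex = Math.min(fallbackLength, text.length());
            fragment = text.substring(0, endIndex);
        }
        return fragment;
    }

    public static void main(String[] args) {
        System.out.println(summary("Welcome to my blog on iteye about Lucene", "Iteye", 10, 20));
        // No match: prints the first 20 characters, "abcdefghijklmnopqrst"
        System.out.println(summary("abcdefghijklmnopqrstuvwxyz", "iteye", 10, 20));
    }
}
```

Math.min guards the substring call, so a field value shorter than the fallback length is returned whole instead of causing an exception.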