Lucene provides several highlighting implementations, including Highlighter and FastVectorHighlighter.
All three examples here use Highlighter.
Example code:
- package com.tan.code;
- import java.io.File;
- import java.io.IOException;
- import java.io.StringReader;
- import org.apache.lucene.analysis.TokenStream;
- import org.apache.lucene.analysis.core.SimpleAnalyzer;
- import org.apache.lucene.document.Document;
- import org.apache.lucene.index.DirectoryReader;
- import org.apache.lucene.index.IndexReader;
- import org.apache.lucene.index.Term;
- import org.apache.lucene.queryparser.classic.ParseException;
- import org.apache.lucene.queryparser.classic.QueryParser;
- import org.apache.lucene.search.IndexSearcher;
- import org.apache.lucene.search.Query;
- import org.apache.lucene.search.ScoreDoc;
- import org.apache.lucene.search.TermQuery;
- import org.apache.lucene.search.TopDocs;
- import org.apache.lucene.search.highlight.Highlighter;
- import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
- import org.apache.lucene.search.highlight.QueryScorer;
- import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
- import org.apache.lucene.search.highlight.SimpleSpanFragmenter;
- import org.apache.lucene.search.highlight.TokenSources;
- import org.apache.lucene.store.Directory;
- import org.apache.lucene.store.SimpleFSDirectory;
- import org.apache.lucene.util.Version;
- import org.wltea.analyzer.lucene.IKAnalyzer;
- public class HighlighterTest {
-     // Text to highlight (the content is purely fictitious)
-     private String text = "china has lots of people,most of them is very poor. china is very big. china become strong now,but the poor people is also poor than other controry";
-     // Basic highlighting with the default <B> tags
-     public void highlighter() throws IOException, InvalidTokenOffsetsException {
-         // SimpleAnalyzer lowercases tokens, so the query term must be lowercase too
-         TermQuery termQuery = new TermQuery(new Term("field", "china"));
-         TokenStream tokenStream = new SimpleAnalyzer(Version.LUCENE_43)
-                 .tokenStream("field", new StringReader(text));
-         QueryScorer queryScorer = new QueryScorer(termQuery);
-         Highlighter highlighter = new Highlighter(queryScorer);
-         highlighter.setTextFragmenter(new SimpleSpanFragmenter(queryScorer));
-         System.out.println(highlighter.getBestFragment(tokenStream, text));
-     }
-     // Highlight using a custom CSS-styled tag
-     public void highlighter_css(String searchText) throws ParseException,
-             IOException, InvalidTokenOffsetsException {
-         // Build the query
-         QueryParser queryParser = new QueryParser(Version.LUCENE_43, "field",
-                 new SimpleAnalyzer(Version.LUCENE_43));
-         Query query = queryParser.parse(searchText);
-         // Custom opening and closing tags for the highlighted text
-         SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter(
-                 "<span style=\"background:red\">", "</span>");
-         // Token stream for the text to highlight
-         TokenStream tokenStream = new SimpleAnalyzer(Version.LUCENE_43)
-                 .tokenStream("field", new StringReader(text));
-         // Create the QueryScorer
-         QueryScorer queryScorer = new QueryScorer(query, "field");
-         Highlighter highlighter = new Highlighter(htmlFormatter, queryScorer);
-         highlighter.setTextFragmenter(new SimpleSpanFragmenter(queryScorer));
-         System.out.println(highlighter.getBestFragments(tokenStream, text, 4,
-                 "..."));
-     }
-     // Highlight search results
-     public void highlighter_sr(String field, String searchText)
-             throws IOException, ParseException, InvalidTokenOffsetsException {
-         // For convenience, this example reuses the index built in the previous experiment
-         Directory directory = new SimpleFSDirectory(new File("E://myindex"));
-         IndexReader reader = DirectoryReader.open(directory); // open the index directory
-         IndexSearcher search = new IndexSearcher(reader); // initialize the search component
-         QueryParser parser = new QueryParser(Version.LUCENE_43, field,
-                 new IKAnalyzer(true));
-         Query query = parser.parse(searchText);
-         TopDocs td = search.search(query, 10000); // get the IDs of the matching documents
-         ScoreDoc[] sd = td.scoreDocs; // the matching documents
-         System.out.println("Hit data: " + sd.length);
-         QueryScorer scorer = new QueryScorer(query, "content");
-         Highlighter highlighter = new Highlighter(scorer);
-         highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
-         for (ScoreDoc scoreDoc : sd) {
-             Document document = search.doc(scoreDoc.doc);
-             String content = document.get("content");
-             TokenStream tokenStream = TokenSources.getAnyTokenStream(
-                     search.getIndexReader(), scoreDoc.doc, "content", document,
-                     new IKAnalyzer(true));
-             System.out.println(highlighter
-                     .getBestFragment(tokenStream, content));
-         }
-         reader.close(); // release the index files
-     }
- }
Test code:
- @Test
- public void test() throws IOException, InvalidTokenOffsetsException,
-         ParseException {
-     // fail("Not yet implemented");
-     HighlighterTest highlighterTest = new HighlighterTest();
-     highlighterTest.highlighter();
-     highlighterTest.highlighter_css("china");
-     highlighterTest.highlighter_css("poor");
-     // "床前明月光" means "bright moonlight before my bed" (opening line of a classical Chinese poem)
-     highlighterTest.highlighter_sr("content", "床前明月光");
- }
Test results:
- <B>china</B> has lots of people,most of them is very poor. <B>china</B> is very big. <B>china</B> become strong now,but the poor people is also poor than other controry
- <span style="background:red">china</span> has lots of people,most of them is very poor. <span style="background:red">china</span> is very big. <span style="background:red">china</span> become strong now,but the poor people is also poor than other controry
- china has lots of people,most of them is very <span style="background:red">poor</span>. china is very big. china become strong now,but the <span style="background:red">poor</span> people is also <span style="background:red">poor</span> than other controry
- Hit data: 1
- <B>床</B><B>前</B><B>明月光</B>,疑是地上霜
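The test results above come down to one visible effect: each matched token is wrapped in an opening and a closing tag, while the rest of the text is left unchanged. The following is a minimal, dependency-free sketch of that wrapping step only. It is not Lucene's implementation: the class `TagWrapSketch` and its `wrap` method are hypothetical names introduced for illustration, and a case-insensitive whole-word regular expression stands in for the analyzer and scorer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Lucene.
public class TagWrapSketch {

    // Wraps every case-insensitive whole-word occurrence of term
    // in preTag/postTag and leaves all other text untouched.
    static String wrap(String text, String term, String preTag, String postTag) {
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(term) + "\\b", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // quoteReplacement keeps '$' and '\' in the tags or text literal
            matcher.appendReplacement(out,
                    Matcher.quoteReplacement(preTag + matcher.group() + postTag));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "china is very big. china become strong now";
        // Same default tags that SimpleHTMLFormatter uses
        System.out.println(wrap(text, "china", "<B>", "</B>"));
    }
}
```

Passing `"<span style=\"background:red\">"` and `"</span>"` as the tags mirrors what the custom SimpleHTMLFormatter does in the second example. The real Highlighter additionally tokenizes with an analyzer, scores fragments, and returns only the best ones.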
A simple example of Highlighter, one of Lucene's three highlighting modules