[Lucene] Three simple highlighting examples using Highlighter

Source: Internet
Author: User

Lucene provides two implementations for highlighting: Highlighter and FastVectorHighlighter.

All three examples here use Highlighter.

Example code:

    package com.tan.code;

    import java.io.File;
    import java.io.IOException;
    import java.io.StringReader;

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.core.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryparser.classic.ParseException;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.search.highlight.Highlighter;
    import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
    import org.apache.lucene.search.highlight.QueryScorer;
    import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
    import org.apache.lucene.search.highlight.SimpleSpanFragmenter;
    import org.apache.lucene.search.highlight.TokenSources;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.SimpleFSDirectory;
    import org.apache.lucene.util.Version;
    import org.wltea.analyzer.lucene.IKAnalyzer;

    public class HighlighterTest {

        // Text to highlight (the content is purely fictitious)
        private String text = "China has lots of people,most of them is very poor. China is very big.  China become strong now,but The poor people are also poor than other controry ";

        // Example 1: basic highlighting with the default formatter
        public void highlighter() throws IOException, InvalidTokenOffsetsException {
            // SimpleAnalyzer lowercases tokens, so the term must be lowercase to match
            TermQuery termQuery = new TermQuery(new Term("field", "china"));
            TokenStream tokenStream = new SimpleAnalyzer(Version.LUCENE_43)
                    .tokenStream("field", new StringReader(text));
            QueryScorer queryScorer = new QueryScorer(termQuery);
            Highlighter highlighter = new Highlighter(queryScorer);
            highlighter.setTextFragmenter(new SimpleSpanFragmenter(queryScorer));
            System.out.println(highlighter.getBestFragment(tokenStream, text));
        }

        // Example 2: highlighting with a custom CSS style
        public void highlighter_css(String searchText) throws ParseException,
                IOException, InvalidTokenOffsetsException {
            // Create the query
            QueryParser queryParser = new QueryParser(Version.LUCENE_43, "field",
                    new SimpleAnalyzer(Version.LUCENE_43));
            Query query = queryParser.parse(searchText);
            // Custom tags that wrap each highlighted term
            SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter(
                    "<span style=\"background:red\">", "</span>");
            // Token stream of the text to highlight
            TokenStream tokenStream = new SimpleAnalyzer(Version.LUCENE_43)
                    .tokenStream("field", new StringReader(text));
            // Create the QueryScorer
            QueryScorer queryScorer = new QueryScorer(query, "field");
            Highlighter highlighter = new Highlighter(htmlFormatter, queryScorer);
            highlighter.setTextFragmenter(new SimpleSpanFragmenter(queryScorer));
            // Up to 4 fragments, joined with "..."
            System.out.println(highlighter.getBestFragments(tokenStream, text, 4,
                    "..."));
        }

        // Example 3: highlighting search results
        public void highlighter_sr(String field, String searchText)
                throws IOException, ParseException, InvalidTokenOffsetsException {
            // For convenience, this example reuses the index built in an earlier experiment
            Directory directory = new SimpleFSDirectory(new File("E://myindex"));
            IndexReader reader = DirectoryReader.open(directory); // open the index directory
            IndexSearcher searcher = new IndexSearcher(reader); // initialize the search component
            QueryParser parser = new QueryParser(Version.LUCENE_43, field,
                    new IKAnalyzer(true));
            Query query = parser.parse(searchText);
            TopDocs td = searcher.search(query, 10000); // get the ids of the matching documents
            ScoreDoc[] sd = td.scoreDocs; // all matching documents
            System.out.println("This hit data:" + sd.length);
            QueryScorer scorer = new QueryScorer(query, "content");
            Highlighter highlighter = new Highlighter(scorer);
            highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
            for (ScoreDoc scoreDoc : sd) {
                Document document = searcher.doc(scoreDoc.doc);
                String content = document.get("content");
                TokenStream tokenStream = TokenSources.getAnyTokenStream(
                        searcher.getIndexReader(), scoreDoc.doc, "content", document,
                        new IKAnalyzer(true));
                System.out.println(highlighter
                        .getBestFragment(tokenStream, content));
            }
        }
    }
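
To make the idea behind the first two examples concrete, here is a minimal, dependency-free sketch of what highlighting amounts to: split the text into runs of letters, compare each run to the query term in lowercase (mirroring how SimpleAnalyzer normalizes tokens), and wrap the matches in a pair of tags. The class `MiniHighlighter` and its `highlight` method are hypothetical names made up for this illustration; this is not how Lucene's Highlighter is implemented, and it ignores scoring and fragmenting entirely.

```java
import java.util.Locale;

public class MiniHighlighter {

    /**
     * Wraps every run of letters equal to term (ignoring case) in pre and post.
     * All other characters are copied unchanged, so the original casing and
     * punctuation of the text are preserved.
     */
    static String highlight(String text, String term, String pre, String post) {
        String wanted = term.toLowerCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            if (Character.isLetter(text.charAt(i))) {
                // Collect one token: a maximal run of letters
                int start = i;
                while (i < text.length() && Character.isLetter(text.charAt(i))) {
                    i++;
                }
                String token = text.substring(start, i);
                if (token.toLowerCase(Locale.ROOT).equals(wanted)) {
                    out.append(pre).append(token).append(post);
                } else {
                    out.append(token);
                }
            } else {
                // Non-letter characters separate tokens and are kept as-is
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "China is very big. China become strong now";
        // Plain bold tags
        System.out.println(highlight(text, "china", "<B>", "</B>"));
        // Custom CSS style
        System.out.println(highlight(text, "big",
                "<span style=\"background:red\">", "</span>"));
    }
}
```

Because matching is done on the lowercased token while the output copies the original characters, a query for "china" marks "China" in the text without changing its casing.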

Test code:

    @Test
    public void test() throws IOException, InvalidTokenOffsetsException,
            ParseException {
        // fail("Not yet implemented");
        HighlighterTest highlighterTest = new HighlighterTest();
        highlighterTest.highlighter();
        highlighterTest.highlighter_css("China");
        highlighterTest.highlighter_css("poor");
        highlighterTest.highlighter_sr("content", "moon Light Before Bed");
    }

Test results:

    <B>China</B> has lots of people,most of them is very poor. <B>China</B> is very big.  <B>China</B> become strong now,but The poor people are also poor than other controry
    <span style="background:red">China</span> has lots of people,most of them is very poor. <span style="background:red">China</span> is very big.  <span style="background:red">China</span> become strong now,but The poor people are also poor than other controry
    China has lots of people,most of them is very <span style="background:red">poor</span>. China is very big.  China become strong now,but The <span style="background:red">poor</span> people are also <span style="background:red">poor</span> than other controry
    This hit data:1
    <B>Bed</B><B>Pre</B><B>Bright Moon light</B>, suspect is ground frost
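
The call getBestFragments(tokenStream, text, 4, "...") in the second example asks for at most 4 fragments joined by "...". The following is a rough, dependency-free sketch of that idea only: cut the text into pieces of limited size, count query hits per piece, keep the highest-scoring pieces, and join them with a separator. The class `FragmentSketch`, its methods, and the choice to return fragments in document order are all assumptions made for this illustration; Lucene's own fragmenters and scorers work on token streams and are more involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class FragmentSketch {

    /** Splits text at whitespace into fragments of at most size characters (a single longer word stays whole). */
    static List<String> fragments(String text, int size) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (current.length() > 0 && current.length() + 1 + word.length() > size) {
                result.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }

    /** Counts how many letter-only tokens in the fragment equal the term, ignoring case. */
    static int score(String fragment, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        int hits = 0;
        for (String token : fragment.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.equals(wanted)) {
                hits++;
            }
        }
        return hits;
    }

    /** Keeps up to max fragments with at least one hit and joins them in document order. */
    static String bestFragments(String text, String term, int size, int max, String separator) {
        List<String> frags = fragments(text, size);
        List<Integer> picked = new ArrayList<>();
        for (int i = 0; i < frags.size(); i++) {
            if (score(frags.get(i), term) > 0) {
                picked.add(i);
            }
        }
        // Highest score first; the sort is stable, so ties keep document order
        picked.sort((a, b) -> score(frags.get(b), term) - score(frags.get(a), term));
        List<Integer> top = new ArrayList<>(picked.subList(0, Math.min(max, picked.size())));
        Collections.sort(top); // restore document order (a choice of this sketch)
        StringBuilder sb = new StringBuilder();
        for (int index : top) {
            if (sb.length() > 0) {
                sb.append(separator);
            }
            sb.append(frags.get(index));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "China is big. Most people are poor. China is strong.";
        System.out.println(bestFragments(text, "china", 20, 4, "..."));
        System.out.println(bestFragments(text, "poor", 20, 4, "..."));
    }
}
```

With a fragment size of 20, the sample text is cut into three pieces, so a query for "china" keeps the first and the last piece and skips the middle one.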
