Lucene09-lucene highlighting

Source: Internet
Author: User

Contents

    • 1 What is highlighting
    • 2 Highlighting implementation
      • 2.1 Configuring the pom.xml file, adding highlight support
      • 2.2 Code implementation
      • 2.3 Highlighting with custom HTML tags
1 What is highlighting

Highlighting is a feature of full-text search: the keywords that matched the query are visually emphasized in the search results, for example in bold or in a different color.
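
To make the idea concrete, here is a minimal sketch that does not use Lucene at all. It simply wraps each occurrence of a keyword in `<b>` tags. The class name `KeywordHighlightSketch` and the method `highlight` are hypothetical names chosen for this illustration; Lucene's highlighter works on analyzed tokens rather than raw substrings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of Lucene.
class KeywordHighlightSketch {

    /** Wraps every case-insensitive occurrence of keyword in <b>...</b>. */
    static String highlight(String text, String keyword) {
        if (keyword.isEmpty()) {
            return text; // nothing to highlight
        }
        Pattern pattern = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        // "$0" is the matched text itself, so the original casing is preserved
        return matcher.replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("Java Programming with java", "java"));
        // prints: <b>Java</b> Programming with <b>java</b>
    }
}
```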

2 Highlighting implementation

Lucene provides a highlighter component (the lucene-highlighter module) that implements highlighting.

2.1 Configuring the pom.xml file, adding highlight support
<project>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <!-- MySQL version -->
        <mysql.version>5.1.30</mysql.version>
        <!-- Lucene version -->
        <lucene.version>4.10.3</lucene.version>
        <!-- IK analyzer version -->
        <ik.version>2012_u6</ik.version>
    </properties>

    <dependencies>
        <!-- ...... -->
        <!-- Lucene highlighting -->
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-highlighter</artifactId>
            <version>${lucene.version}</version>
        </dependency>
    </dependencies>
</project>
2.2 Code implementation
    1. Create a scorer object (QueryScorer), which scores the content to be highlighted;
    2. Create a fragmenter object (Fragmenter), which slices the content to be highlighted into fragments;
    3. Create a highlighter object (Highlighter), which performs the highlighting;
    4. Create an analyzer object (Analyzer) for word segmentation;
    5. Use the TokenSources class to get the token stream (TokenStream) of the content to be highlighted;
    6. Use the Highlighter object to finish highlighting.
/**
 * Encapsulates the search method (with highlighting).
 */
private void searcherHighlighter(Query query) throws Exception {
    // Print the Query object generated from the query syntax
    System.out.println("Query syntax: " + query);

    // 1. Create the index directory object (Directory), which specifies the location of the index
    Directory directory = FSDirectory.open(new File("/users/healchow/documents/index"));

    // 2. Create the index reader (IndexReader) to read the index data into memory
    IndexReader reader = DirectoryReader.open(directory);

    // 3. Create the index searcher (IndexSearcher) to perform the search
    IndexSearcher searcher = new IndexSearcher(reader);

    // 4. Perform the search with the IndexSearcher and get the result set (TopDocs)
    // Parameter 1: the query object; parameter 2: return the top n results after sorting
    TopDocs topDocs = searcher.search(query, 10);

    // Highlight processing ============================== start
    // Highlight step 1: create the scorer (QueryScorer), which scores the content to highlight
    QueryScorer qs = new QueryScorer(query);

    // Highlight step 2: create the fragmenter (Fragmenter), which slices the content to highlight
    Fragmenter fragmenter = new SimpleSpanFragmenter(qs);

    // Highlight step 3: create the highlighter (Highlighter), which performs the highlighting
    Highlighter lighter = new Highlighter(qs);
    // Set the fragmenter
    lighter.setTextFragmenter(fragmenter);

    // Highlight step 4: create the analyzer (Analyzer) for word segmentation
    Analyzer analyzer = new IKAnalyzer();
    // Highlight processing ============================== end

    // 5. Process the result set
    // 5.1 Print the number of results actually found
    System.out.println("Number of results actually queried: " + topDocs.totalHits);

    // 5.2 Get the array of search results
    // A ScoreDoc holds the ID of a document and its score
    ScoreDoc[] scoreDocs = topDocs.scoreDocs;
    for (ScoreDoc scoreDoc : scoreDocs) {
        System.out.println("=====================");
        // Get the ID and score of the document
        int docId = scoreDoc.doc;
        float score = scoreDoc.score;
        System.out.println("Document id=" + docId + ", score=" + score);

        // Load the document by its ID, comparable to querying a relational database by primary key
        Document doc = searcher.doc(docId);
        System.out.println("Book ID: " + doc.get("bookId"));

        // Highlight the book name
        String bookName = doc.get("bookName");
        if (bookName != null) {
            // Highlight step 5: use TokenSources to get the token stream (TokenStream) of the content
            // getTokenStream parameters: the current document, the field to highlight, the analyzer
            TokenStream tokenStream = TokenSources.getTokenStream(doc, "bookName", analyzer);

            // Highlight step 6: use the highlighter to finish highlighting
            // getBestFragment parameters: the token stream, the text to highlight
            // (the result is null when no fragment of the text matches the query)
            bookName = lighter.getBestFragment(tokenStream, bookName);
        }
        System.out.println("Book name: " + bookName);
        System.out.println("Book price: " + doc.get("bookPrice"));
        System.out.println("Book picture: " + doc.get("bookPic"));
        System.out.println("Book description: " + doc.get("bookDesc"));
    }

    // 6. Close resources
    reader.close();
}
/**
 * Tests highlighting. Requirement: highlight the book name in the search results (keyword: java).
 * @throws Exception
 */
@Test
public void testHighlighter() throws Exception {
    // 1. Create the query object
    TermQuery tq = new TermQuery(new Term("bookName", "java"));

    // 2. Run the highlighted search
    this.searcherHighlighter(tq);
}
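
To see what the scorer and the fragmenter contribute, the following is a minimal sketch in plain Java, with no Lucene dependency. It splits a text into fragments, scores each fragment by how often the query term occurs, and returns the best fragment with the term wrapped in tags. All names here (`BestFragmentSketch`, `fragment`, `score`, `bestFragment`) are hypothetical, and the whitespace splitting and hit counting are simplifying assumptions; this is not how QueryScorer or SimpleSpanFragmenter are implemented. The `<B>` tags are used because, as far as I know, they are what Lucene's Highlighter emits when no formatter is given.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Lucene.
class BestFragmentSketch {

    /** Splits text into fragments of at most about fragmentSize characters, breaking at whitespace. */
    static List<String> fragment(String text, int fragmentSize) {
        List<String> fragments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.split("\\s+")) {
            // Start a new fragment when adding this word would exceed the size limit
            if (current.length() > 0 && current.length() + 1 + word.length() > fragmentSize) {
                fragments.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            fragments.add(current.toString());
        }
        return fragments;
    }

    /** Scores a fragment by the number of words equal to the term (case-insensitive). */
    static int score(String fragment, String term) {
        int hits = 0;
        for (String word : fragment.split("\\s+")) {
            if (word.equalsIgnoreCase(term)) {
                hits++;
            }
        }
        return hits;
    }

    /** Returns the highest-scoring fragment with the term tagged, or null if nothing matches. */
    static String bestFragment(String text, String term, int fragmentSize) {
        String best = null;
        int bestScore = 0;
        for (String candidate : fragment(text, fragmentSize)) {
            int candidateScore = score(candidate, term);
            if (candidateScore > bestScore) {
                best = candidate;
                bestScore = candidateScore;
            }
        }
        if (best == null) {
            return null;
        }
        StringBuilder out = new StringBuilder();
        for (String word : best.split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word.equalsIgnoreCase(term) ? "<B>" + word + "</B>" : word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library written in Java and Java developers use it widely";
        System.out.println(bestFragment(text, "java", 30));
        // prints: written in <B>Java</B> and <B>Java</B>
    }
}
```

The null return for a text without any match mirrors the reason the search method above only replaces the book name with the highlighted version inside a guarded block.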

2.3 Highlighting with custom HTML tags
    • Question: In a real project, how can the search results be highlighted with custom HTML tags?
      1. Create an HTML tag formatter object (SimpleHTMLFormatter);
      2. Create the highlighter object (Highlighter), specifying that it uses the SimpleHTMLFormatter object.
// Highlight processing ============================== start
// Highlight step 1: create the scorer (QueryScorer), which scores the content to highlight
QueryScorer qs = new QueryScorer(query);

// Highlight step 2: create the fragmenter (Fragmenter), which slices the content to highlight
Fragmenter fragmenter = new SimpleSpanFragmenter(qs);

// Highlight step 3: create the highlighter (Highlighter), which performs the highlighting
// 3.1 Highlight the search results with custom HTML tags
// 1) Create the HTML tag formatter (SimpleHTMLFormatter). Parameters:
//    preTag: the opening part of the HTML tag (<font color='red'>)
//    postTag: the closing part of the HTML tag (</font>)
SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<font color='red'>", "</font>");

// 2) Create the highlighter (Highlighter), passing in the SimpleHTMLFormatter
Highlighter lighter = new Highlighter(formatter, qs);
// Set the fragmenter
lighter.setTextFragmenter(fragmenter);

// Highlight step 4: create the analyzer (Analyzer) for word segmentation
Analyzer analyzer = new IKAnalyzer();
// Highlight processing ============================== end
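
One thing to watch when the highlighted text is written into a web page: the custom tags are inserted as raw HTML, so the surrounding text should be HTML-escaped, otherwise characters such as `<` in a book name would be interpreted as markup. The sketch below illustrates this in plain Java with a configurable opening and closing tag, analogous to the two SimpleHTMLFormatter arguments. The names `EscapedHighlightSketch`, `escapeHtml`, and `highlight` are hypothetical and the substring matching is a simplification. To my understanding, Lucene's highlighter module offers an Encoder interface with a SimpleHTMLEncoder implementation for the same purpose, which can be passed to the Highlighter constructor; treat that as a pointer to check against the version you use rather than as part of the original example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of Lucene.
class EscapedHighlightSketch {

    /** Escapes the characters that have a special meaning in HTML. */
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Escapes the text and wraps each case-insensitive keyword occurrence in preTag/postTag. */
    static String highlight(String text, String keyword, String preTag, String postTag) {
        if (keyword.isEmpty()) {
            return escapeHtml(text);
        }
        Matcher m = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE).matcher(text);
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (m.find()) {
            // Text between matches is escaped; only the tags themselves stay raw HTML
            out.append(escapeHtml(text.substring(pos, m.start())));
            out.append(preTag).append(escapeHtml(m.group())).append(postTag);
            pos = m.end();
        }
        out.append(escapeHtml(text.substring(pos)));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("Java <Generics> & java", "java", "<font color='red'>", "</font>"));
        // prints: <font color='red'>Java</font> &lt;Generics&gt; &amp; <font color='red'>java</font>
    }
}
```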

Copyright notice

Author: Ma_shoufeng (Ma Ching)

Source: Blog Park Ma Ching's Blog

Your support is a great encouragement to the blogger. Thank you for reading.

The copyright of this article belongs to the blogger. Reprinting is welcome, but unless the blogger agrees otherwise, this statement must be retained and a link to the original article must be given in an obvious position on the article page; otherwise the blogger reserves the right to pursue legal liability.
