Strategy for highlighting in Lucene or SOLR



One: Functional background

I recently had to implement search-result highlighting. I have done this before, so it was not difficult. The only difference is that the earlier work used Lucene and this project uses Solr. In earlier articles I analyzed how to implement highlighting in Lucene 4.x, where there are three main approaches. For details, see these two articles:
1. How to implement highlighting in Lucene 4.3
http://qindongliang.iteye.com/blog/1953409
2. How to implement server-side highlighting in Solr 4.3
http://qindongliang.iteye.com/blog/2034270



Two: Exploring the options

Generally speaking, there are two main ways to implement highlighting. In the first, the front end highlights the displayed data with JS. In the second, the server highlights the results and returns them to the front end already marked up.
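For the server-side option, Solr's highlighter is switched on through request parameters. The sketch below only builds such a request URL, so the parameters are visible in one place. The class `HighlightQueryUrl` and its `build` method are hypothetical helpers, not part of Solr or SolrJ. The field `name` and the core URL are sample values, and `<em>` is just one possible pair of wrapping tags.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class HighlightQueryUrl {

    // Builds a /select URL that asks Solr to return highlighted snippets for one field.
    static String build(String coreUrl, String query, String field) {
        return coreUrl + "/select"
                + "?q=" + encode(query)
                + "&hl=true"                             // turn highlighting on
                + "&hl.fl=" + encode(field)              // field to highlight
                + "&hl.simple.pre=" + encode("<em>")     // inserted before each match
                + "&hl.simple.post=" + encode("</em>");  // inserted after each match
    }

    // URL-encodes a single parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Calling `HighlightQueryUrl.build("http://localhost:8983/solr/one", "name:solr", "name")` yields a URL whose response carries the marked-up snippets in a separate highlighting section, so the front end only has to display them.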

Back-end highlighting process: (diagram not available)

Front-end highlighting process: (diagram not available)

Three: Pros and cons analysis

Back-end highlighting:
Performance: under high concurrency, doing the highlighting on the server may have some impact on server performance.
Reliability: high. Highlighting still displays correctly when JavaScript is disabled in the browser.
Front-end highlighting:
Performance: rendering is done by the client, so the server does less work and overall performance is slightly better.
Reliability: low. Highlighting fails when JavaScript is disabled in the browser.

Four: Notes

When highlighting on the front end, the query sentence first has to be tokenized, and the resulting terms are returned to the front-end JS so that it can replace each term with a highlighted version. The tokenization can be done with either Lucene or Solr, as shown below:
In Lucene:

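Java code

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    /**
     * Tokenizes the given text and prints each term.
     *
     * @param analyzer the analyzer (tokenizer) to use
     * @param text     the text to tokenize
     * @throws Exception if the token stream cannot be read
     */
    public static void analyzer(Analyzer analyzer, String text) throws Exception {
        // "name" is the field name handed to the analyzer
        TokenStream ts = analyzer.tokenStream("name", text);
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
            System.out.println(term.toString());
        }
        ts.end();
        ts.close();
    }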


In Solr, approach 1 (by field type):

Java code

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.request.FieldAnalysisRequest;
    import org.apache.solr.client.solrj.response.AnalysisResponseBase.AnalysisPhase;
    import org.apache.solr.client.solrj.response.AnalysisResponseBase.TokenInfo;
    import org.apache.solr.client.solrj.response.FieldAnalysisResponse;
    import org.apache.solr.client.solrj.response.FieldAnalysisResponse.Analysis;
    import com.google.common.collect.Lists;

    /**
     * Tokenizes text with the analyzer of a field type and collects the terms.
     *
     * @param text the text to tokenize
     */
    public static void showAnalysisType(String text) throws Exception {
        String fieldType = "ik"; // name of the field type whose analyzer is used
        // Target the field analysis request handler
        FieldAnalysisRequest request = new FieldAnalysisRequest("/analysis/field");
        // Set the field type
        request.addFieldType(fieldType);
        // Set the text to be tokenized
        request.setFieldValue(text);
        // sc is declared elsewhere as:
        // private static HttpSolrClient sc = new HttpSolrClient("http://localhost:8983/solr/one");
        // Send the request and get the response
        FieldAnalysisResponse response = request.process(sc);
        // Get the analysis for this field type
        Analysis as = response.getFieldTypeAnalysis(fieldType);
        List<String> results = new ArrayList<String>();
        // Use the Guava library to copy the phases into a List
        List<AnalysisPhase> list = Lists.newArrayList(as.getIndexPhases().iterator());
        // A field type may be configured with several filters, and the tokens differ
        // after each one, so pick the phase whose output you need. Here the last phase
        // (list.size() - 1) is used. This index is not fixed; it depends on the analyzer chain.
        for (TokenInfo token : list.get(list.size() - 1).getTokens()) {
            // Collect the text of each token
            results.add(token.getText());
        }
    }


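In Solr, approach 2 (by field name):

Java code

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.request.FieldAnalysisRequest;
    import org.apache.solr.client.solrj.response.AnalysisResponseBase.AnalysisPhase;
    import org.apache.solr.client.solrj.response.AnalysisResponseBase.TokenInfo;
    import org.apache.solr.client.solrj.response.FieldAnalysisResponse;
    import org.apache.solr.client.solrj.response.FieldAnalysisResponse.Analysis;
    import com.google.common.collect.Lists;

    /**
     * Tokenizes text with the analyzer of a named field and prints the terms.
     *
     * @param text the text to tokenize
     */
    public static void showAnalysis(String text) throws Exception {
        // The field name (a sample value; use a field from your own schema)
        String fieldName = "cpyName";
        // Target the field analysis request handler; this path is always the same
        FieldAnalysisRequest request = new FieldAnalysisRequest("/analysis/field");
        // Add the field
        request.addFieldName(fieldName);
        // Set the text to be tokenized
        request.setFieldValue(text);
        // Send the request to Solr (sc is the shared HttpSolrClient) and get the response
        FieldAnalysisResponse response = request.process(sc);
        // Holder for the terms in case a caller needs them returned (unused in this example)
        List<String> results = new ArrayList<String>();
        // Get the analysis for this field name
        Analysis as = response.getFieldNameAnalysis(fieldName);
        // Use the Guava library to copy the phases into a List
        List<AnalysisPhase> list = Lists.newArrayList(as.getIndexPhases().iterator());
        // Print the tokens produced by the last phase
        for (TokenInfo token : list.get(list.size() - 1).getTokens()) {
            System.out.println(token.getText());
        }
    }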
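Once the terms are available, the replacement step on the front end is simple: wrap every occurrence of a term in a highlighting tag. In the browser this would be written in JS. The sketch below shows the same logic in Java to stay in one language with the listings above. The class `TermHighlighter` and the `<em>` tag are assumptions for illustration, and matching is case-insensitive by choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; mirrors what the front-end JS would do.
class TermHighlighter {

    // Wraps every case-insensitive occurrence of any term in <em>...</em>.
    static String highlight(String text, List<String> terms) {
        List<String> usable = new ArrayList<>();
        for (String term : terms) {
            if (term != null && !term.isEmpty()) {
                usable.add(term);
            }
        }
        if (usable.isEmpty()) {
            return text; // nothing to highlight
        }
        // Longest terms first, so a longer term wins over a shorter one
        // that starts at the same position.
        usable.sort((a, b) -> Integer.compare(b.length(), a.length()));

        // Build one alternation pattern; quote each term so that regex
        // metacharacters in a term are matched literally.
        StringBuilder alternation = new StringBuilder();
        for (String term : usable) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(term));
        }
        Pattern pattern = Pattern.compile(
                alternation.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

        // $0 is the matched text itself, so the original casing is preserved.
        return pattern.matcher(text).replaceAll("<em>$0</em>");
    }
}
```

For example, `TermHighlighter.highlight("Highlighting in Lucene and Solr", Arrays.asList("lucene", "solr"))` returns `Highlighting in <em>Lucene</em> and <em>Solr</em>`. In a real page the text should be HTML-escaped before the tags are inserted, which this sketch leaves out.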



