Solr: numeric characters cannot be searched

Last Update: 2016-08-10 Source: Internet

Author: User

Tags: solr, solr query

Question one: The testers told me that numbers could not be searched, so I started looking for the reason. The relevant field in schema.xml:

* * *
<field name="productName" type="text" indexed="true" stored="true" />
* * *
</fields>

Configuration of the "text" field type (attribute values that are missing from the source are shown as "..."):
<fieldType name="text" class="solr.TextField" positionIncrementGap="...">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="..." side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="..." side="front"/>
  </analyzer>
</fieldType>

The problem appears whenever productName contains digits. For example, a product called 'Gaga 123' cannot be found by searching for 1, 2, 3, 12, and so on.

The same was true for '123 Gaga'. For a long time I could not find the reason, and I did not know how to go about finding it, so I asked around. The guess was that tokenization was the problem, so I looked through the Solr admin interface to see what I could find.

Finally, someone in a QQ group told me that solr.LowerCaseTokenizerFactory filters out digits. The Analysis page in the Solr admin interface shows how text is tokenized under the current schema.xml; you can select the relevant field there and confirm that LowerCaseTokenizerFactory is the cause. I then looked for an alternative.

After some searching and experimenting, a replacement field type configuration finally solved the problem that numbers could not be searched. (The corresponding field was also changed to the new type.)
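To make the cause concrete, the following is a minimal sketch that imitates the documented behaviour of the lowercase tokenizer: it splits text at every non-letter character and lowercases what remains, so digits never reach the index. The class and method names are hypothetical, and the code is a plain-Java stand-in rather than the Lucene implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for what solr.LowerCaseTokenizerFactory produces;
// it does not use any Solr or Lucene classes.
class LetterOnlyTokenizerSketch {

    // Splits text at every non-letter character and lowercases the letters.
    // Digits are non-letters, so they only act as separators and are dropped.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Gaga 123")); // prints [gaga]
        System.out.println(tokenize("123 Gaga")); // prints [gaga]
    }
}
```

With no token containing 1, 2, or 3 in the index, a query for those digits has nothing to match, which is consistent with what the Analysis page shows.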

The products in our database have a pinyin (phonetic) field, and its values are stored in uppercase. Searching for AMXL matches the pinyin, so the corresponding product, amoxicillin, is found. (Solr is configured to query the "all" field, and the pinyin field is copied into "all".)

But searching for lowercase amxl found nothing. So in the program, I called toUpperCase() on the query value before building the Solr query, which solved the problem that lowercase letters could not be searched.

Question two:

The next day I found that this had introduced a new problem. If a product is named 'd amoxicillin' (a lowercase letter followed by the Chinese name), searching for 'd amoxicillin' no longer returns it. At first I did not know why, so I put the value into Solr's Analysis page and found the cause: my program had converted the query to 'D amoxicillin', while the value indexed in Solr is 'd amoxicillin', with the letter in lowercase. As a result, searching with the full name of a product such as "amoxicillin" (for example, one chosen from auto-complete) fails to find it.

Having solved the problem with numbers, I had now run into a problem with lowercase letters. I did not find a Solr-side solution this time, so I decided to modify the program. The idea is still to uppercase the query value in the program, but only conditionally: if the value contains Chinese characters, leave its case unchanged; otherwise, convert it to uppercase.

That way, products whose names contain digits or lowercase letters can be found, and queries made up entirely of letters can still be matched against the pinyin. (solr.EdgeNGramFilterFactory with minGramSize="1" and maxGramSize="50" splits each word into prefixes from left to right.)
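The left-to-right splitting that the edge n-gram filter performs can be illustrated with a short sketch. It assumes front-side grams with the minGramSize and maxGramSize values quoted above; the class and method names are hypothetical, and the code is a plain-Java imitation rather than the Solr filter itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of front-side edge n-grams; not a Solr class.
class EdgeNGramSketch {

    // Returns every prefix of token whose length is between minGram and
    // maxGram (inclusive), shortest first. Assumes minGram >= 1.
    static List<String> frontEdgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(frontEdgeNGrams("AMXL", 1, 50)); // prints [A, AM, AMX, AMXL]
    }
}
```

Because every prefix is produced as its own token, a partially typed value such as AM can match a product whose pinyin is AMXL, which is what makes this filter useful for auto-complete.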

I then searched the web for a regular expression that checks whether a string contains Chinese characters:

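import java.util.Locale;
import java.util.regex.Pattern;

// Both methods are assumed to live in the project's own StringUtils class.

// Matches one character in the range U+4E00 to U+9FA5 (common Chinese characters).
private static final Pattern CHINESE = Pattern.compile("[\u4e00-\u9fa5]");

/**
 * Determine if a string contains Chinese.
 * @param str the string to check
 * @return true if at least one Chinese character is found
 */
public static boolean isContainsChinese(String str) {
    return CHINESE.matcher(str).find();
}

/** Uppercase the query value unless it contains Chinese. */
public static String toUpperOrNot(String temp) {
    if (temp == null) return "";
    if (StringUtils.isContainsChinese(temp)) {
        return temp;
    }
    // Locale.ROOT keeps the result independent of the JVM default locale.
    return temp.toUpperCase(Locale.ROOT);
}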

Next, call toUpperOrNot() wherever the program sets the Solr query value. It is best to also call the escape method below.

Tip: if the query value contains characters that are special in Solr query syntax, they need to be escaped:

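// Characters to escape in a Solr query value. This set covers only some of
// the query parser's special characters.
public static final String NEAD_TO_CONVERT_CHAR = "([/:()!])";

/** Escape special characters by prefixing each one with a backslash. */
public static String convertMeaningChar(String temp) {
    if (temp == null) return "";
    // In the replacement string, "\\\\" yields one literal backslash and $1 is the matched character.
    return temp.replaceAll(NEAD_TO_CONVERT_CHAR, "\\\\$1");
}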
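As a usage illustration, the sketch below chains the two steps (conditional uppercasing, then escaping) into a single query clause. It is self-contained, so it restates both helpers in compact form. The class name and the buildClause method are hypothetical, and the "all" field name is taken from the description earlier in this article; it builds a plain string and does not call SolrJ.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only; not part of Solr or SolrJ.
class QueryValueSketch {
    // Regex-level \\u escapes keep the source file pure ASCII.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]");
    private static final Pattern SPECIAL = Pattern.compile("([/:()!])");

    // Uppercases the value unless it contains a Chinese character.
    static String toUpperOrNot(String value) {
        if (value == null) return "";
        return CHINESE.matcher(value).find() ? value : value.toUpperCase(Locale.ROOT);
    }

    // Prefixes each special character in the set above with a backslash.
    static String escape(String value) {
        if (value == null) return "";
        return SPECIAL.matcher(value).replaceAll("\\\\$1");
    }

    // Builds a "field:value" clause from a raw user-supplied value.
    static String buildClause(String field, String rawValue) {
        return field + ":" + escape(toUpperOrNot(rawValue));
    }

    public static void main(String[] args) {
        System.out.println(buildClause("all", "amxl")); // prints all:AMXL
        System.out.println(buildClause("all", "a(b)")); // prints all:A\(B\)
        // "d" followed by four Chinese characters: the case is left unchanged.
        System.out.println(buildClause("all", "d\u963f\u83ab\u897f\u6797"));
    }
}
```

Escaping only adds backslashes, so it does not matter much whether it runs before or after the case conversion; doing it last keeps the escaped form exactly as it will be sent.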

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
