Elasticsearch Chinese word segmentation (mmseg) -- manually add a dictionary

Source: Internet
Author: User

The Chinese word segmentation plug-in for Elasticsearch does not always segment words correctly. You can manually add words to its dictionary to make up for this.

After many experiments, we found that mmseg uses a forward longest matching algorithm against its dictionary. For example, its dictionary does not contain the word "Small times", so the word is never produced as a single token, and when a user searches for "Small times", no results are found.
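To illustrate why a missing dictionary entry matters, here is a minimal sketch of plain forward longest matching in Python. It is not mmseg's actual implementation, which may apply additional rules. The function name, the `max_len` parameter, the tiny dictionary, and the Chinese sample text (assumed here to be the original of "Small times") are all assumptions made for the example.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Segment text by taking the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical dictionary that lacks the full word.
words = {"时代", "电影"}
print(forward_max_match("小时代电影", words))

# After manually adding the word, it is matched as one token.
words.add("小时代")
print(forward_max_match("小时代电影", words))
```

In this sketch the first call splits the word into a single character plus a shorter dictionary word, while the second call keeps it whole. The same effect is what adding the word to the mmseg dictionary aims for.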

With my wife's guidance, I finally found a solution.

You can manually add the word to the dictionary of mmseg in either of the following ways:

1. Add the word to one of the built-in dictionaries (not the stop-word dictionary), such as words-my.dic.

2. Create a new custom dictionary file and put it into the default dictionary folder. Note that the file must be encoded as UTF-8 without a BOM.

(Note: If Elasticsearch is deployed as a cluster, the corresponding dictionary files must be modified on the other nodes as well.)
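The encoding requirement is easy to get wrong when the file is saved from a text editor, so here is a small Python sketch that writes a dictionary file as UTF-8 without a BOM and then checks the first bytes. The file name `words-custom.dic`, the temporary folder, and the one-word-per-line layout are assumptions for illustration. Check the existing dictionary files in your mmseg folder for the real location and format.

```python
import codecs
import os
import tempfile


def write_dictionary(path, words):
    """Write one word per line; the plain "utf-8" codec does not emit a BOM."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for word in words:
            f.write(word + "\n")


def has_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


# Hypothetical path: a temporary folder stands in for the real dictionary folder.
demo_path = os.path.join(tempfile.mkdtemp(), "words-custom.dic")
write_dictionary(demo_path, ["小时代"])
print(has_bom(demo_path))
```

Note that Python's `utf-8-sig` codec would add a BOM, which is exactly what this step is meant to avoid.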

During this time, my wife xianyu guided me patiently, to the point that I felt rather helpless. I was also very touched that she set out my favorite little longans for me.

I believe that years from now, I will still remember the warmth of this evening spent solving the problem of manually adding words to the dictionary.

Restart Elasticsearch so that the modified dictionary is loaded: elasticsearch restart

Re-create the index for the target file, then search for "Small times". The target file is now found.

Finally, I would like to thank my dear xianyu. Black Wolf!!!
