1. First, install Elasticsearch and the elasticsearch-analysis-ik tokenizer plugin.
2. Configure IK synonyms
Elasticsearch comes with a built-in synonym filter named synonym. For IK and synonym to work together, we need to define a new analyzer that uses IK as the tokenizer and synonym as the filter. It sounds complicated, but all it takes is one piece of configuration.
Open the /config/elasticsearch.yml file and add the following configuration:
    index:
      analysis:
        analyzer:
          ik_syno:
            type: custom
            tokenizer: ik_max_word
            filter: [my_synonym_filter]
          ik_syno_smart:
            type: custom
            tokenizer: ik_smart
            filter: [my_synonym_filter]
        filter:
          my_synonym_filter:
            type: synonym
            synonyms_path: analysis/synonym.txt
The above configuration defines two new analyzers, ik_syno and ik_syno_smart, which correspond to IK's two word segmentation strategies, ik_max_word and ik_smart. According to the IK documentation, the differences are as follows:
- ik_max_word: splits the text at the finest granularity and exhausts all possible combinations. For example, "中华人民共和国国歌" ("the national anthem of the People's Republic of China") is split into "中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和, 国国, 国歌";
- ik_smart: splits the text at the coarsest granularity. For example, "中华人民共和国国歌" is split into "中华人民共和国, 国歌";
Both ik_syno and ik_syno_smart use the synonym filter to perform synonym conversion.
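To make the "tokenizer first, then synonym filter" idea concrete, here is a toy sketch in Python. It is not how IK or Elasticsearch work internally: the whitespace tokenizer stands in for ik_smart / ik_max_word, and the synonym groups are made-up examples.

```python
# Toy illustration only: NOT the real IK or Elasticsearch implementation.
# The tokenizer and synonym groups below are hypothetical stand-ins.

def toy_tokenizer(text):
    """Stand-in for ik_smart / ik_max_word: lowercase and split on whitespace."""
    return text.lower().split()

# Hypothetical synonym groups, playing the role of analysis/synonym.txt.
SYNONYM_GROUPS = [
    {"tomato", "love-apple"},
    {"potato", "spud"},
]

def synonym_filter(tokens):
    """Expand each token to every member of its synonym group (if any)."""
    expanded = []
    for token in tokens:
        group = next((g for g in SYNONYM_GROUPS if token in g), {token})
        expanded.extend(sorted(group))
    return expanded

def analyze(text):
    """A custom analyzer is a tokenizer followed by its token filters."""
    return synonym_filter(toy_tokenizer(text))

print(analyze("Fresh tomato soup"))
# ['fresh', 'love-apple', 'tomato', 'soup']
```

Because the same chain runs both at index time and at search time, a document containing one term of a group can be found by searching for any other term of that group.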
3. Create the /config/analysis/synonym.txt file, enter some synonyms, and save it in UTF-8 encoding.
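As a rough sketch of what such a file might contain, the Python snippet below writes a small synonym file in UTF-8. The entries are hypothetical examples, not the ones from the original post, and the snippet writes to a temporary directory rather than to the real config directory. The synonym filter accepts the Solr synonym format: comma-separated terms are treated as equivalent, and `=>` maps the terms on the left to the terms on the right.

```python
import os
import tempfile

# Hypothetical sample entries; replace them with your own synonyms.
SAMPLE_LINES = [
    "西红柿, 番茄",        # equivalent terms (both mean "tomato")
    "tomato, love-apple",  # equivalent terms
    "spud => potato",      # explicit mapping: "spud" is rewritten to "potato"
]

def write_synonyms(path, lines):
    """Write one synonym rule per line, explicitly encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

# Demo: write to a temporary directory instead of /config/analysis/.
demo_path = os.path.join(tempfile.mkdtemp(), "synonym.txt")
write_synonyms(demo_path, SAMPLE_LINES)

with open(demo_path, encoding="utf-8") as f:
    print(f.read())
```

Passing `encoding="utf-8"` explicitly matters here, because the platform default encoding may differ and would corrupt non-ASCII synonyms.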
At this point the synonym configuration is complete. Restart ES, and when searching, specify ik_syno or ik_syno_smart as the analyzer.
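One way to specify the analyzer at search time is the `analyzer` parameter of a match query. The sketch below only builds the JSON request body and does not contact a server; the field name `title` and the search text are assumptions chosen for illustration.

```python
import json

def build_match_query(field, text, analyzer="ik_syno"):
    """Build a match query body that names the analyzer to use at search time."""
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    "analyzer": analyzer,
                }
            }
        }
    }

# "title" and the search text are hypothetical examples.
body = build_match_query("title", "番茄", analyzer="ik_syno_smart")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON can be sent as the body of a `_search` request, for example with curl's `-d` option.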
Create the mapping. Execute the curl command as follows:
    curl -XPOST http://192.168.1.99:9200/goodsindex/goods/_mapping -d '{
      "goods": {
        "_all": {
          "enabled": true,
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word",
          "term_vector": "no",
          "store": "false"
        },
        "properties": {
          "title": {
            "type": "string",
            "term_vector": "with_positions_offsets",
            "analyzer": "ik_syno",
            "search_analyzer": "ik_syno"
          },
          "content": {
            "type": "string",
            "term_vector": "with_positions_offsets",
            "analyzer": "ik_syno",
            "search_analyzer": "ik_syno"
          },
          "tags": {
            "type": "string",
            "term_vector": "no",
            "analyzer": "ik_syno",
            "search_analyzer": "ik_syno"
          },
          "slug": {
            "type": "string",
            "term_vector": "no"
          },
          "update_date": {
            "type": "date",
            "term_vector": "no",
            "index": "no"
          }
        }
      }
    }'
The above command specifies the field characteristics for the goods type under the goodsindex index: the title, content, and tags fields use ik_syno as the analyzer, which means they are tokenized with ik_max_word and then passed through the synonym filter. The slug field does not specify an analyzer, so it uses the default analyzer, and the update_date field is not indexed.
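With the mapping in place, a search no longer needs to name the analyzer, because the fields already declare ik_syno as their search_analyzer. The following sketch prepares (but does not send) a search request over the three analyzed fields. The host and index come from the curl command above; the search text, the use of multi_match, and the highlight section are illustrative assumptions.

```python
import json
import urllib.request

# Host, index, and type are taken from the mapping command above.
SEARCH_URL = "http://192.168.1.99:9200/goodsindex/goods/_search"

def build_search_request(text):
    """Prepare a POST request that searches title, content, and tags."""
    body = {
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["title", "content", "tags"],
            }
        },
        # title and content store term vectors with positions and offsets,
        # which is what makes them suitable for highlighting.
        "highlight": {"fields": {"title": {}, "content": {}}},
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request("番茄")  # hypothetical search text
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually run the search: urllib.request.urlopen(request)
```

If the synonym file contains a rule linking the search text to another term, documents that contain only the other term should match as well.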
Using the Elasticsearch IK tokenizer to implement synonym search (repost)