Implementation of Like indexes in MySQL full-text search

Source: Internet
Author: User
When the LIKE condition of SQL is % XXX %, DBAs will tell you that the index cannot be used because the LIKE condition of SQL is % XXX %. When the data volume is large (for example, more than one million records ), full table scan may cause poor performance. However, in actual business, it is difficult to avoid the requirement of full-text MySQL search and Like index. For example, fuzzy search for user accounts and nicknames. Since this requirement

When the LIKE condition of SQL is % XXX %, DBAs will tell you that the index cannot be used because the LIKE condition of SQL is % XXX %. When the data volume is large (for example, more than one million records ), full table scan may cause poor performance. However, in actual business, it is difficult to avoid the requirement of full-text MySQL search and Like index. For example, fuzzy search for user accounts and nicknames. Since this requirement

When the LIKE condition of SQL is % XXX %, DBAs will tell you that the index cannot be used because the LIKE condition of SQL is % XXX %. When the data volume is large (for example, more than one million records ), full table scan may cause poor performance.

However, in actual business, it is difficult to avoid the requirement of full-text MySQL search and Like index. For example, fuzzy search for user accounts and nicknames. Since this requirement must be met, you cannot directly use LIKE. Here I will share with you a solution to this requirement. Of course, others may have used similar methods. I am not very clear. So use "original.

MySQL databases support full-text indexing very early, but the full-text indexing and LIKE statements are different. Specifically, the unit of full-text index is word, and the ear LIKE matches character. Of course, the actual difference is bigger. For example, if the text "mouse loves rice" is searched in full text, the conditions are "mouse loves rice", "Mouse and rice", and "Big Mickey Mouse ", "rice and mouse" will search for content, but "love", "rat Love", and "love big" won't search for content. If you use LIKE for search, "Mouse and rice", "Big Mickey Mouse", "Rice and mouse" won't find anything, but "love", "mouse love ", "Love Big" will find the content. We will not discuss the advantages and disadvantages of the two methods here. Each function has its own actual needs according to the actual situation. For example, full-text search is the best method for large text segments. However, for short text such as names, accounts, and nicknames, LIKE is more suitable.

Although full-text search is different from LIKE search, you can use the full-text search function to implement LIKE search in special cases. Specifically, each character is used as a word, and double quotation marks are used to limit exact match of words (simply put, mouse rice is different from Mickey Mouse). This enables LIKE search.

Let's talk about the specific practices.

First, the database specifies -- ft_min_word_len = 2 -- ft_stopword_file = "". The first parameter tells the database that words smaller than 2 characters are ignored. The second is to tell the database not to ignore any special words. These settings create conditions for implementing functions.

Create a search table

CREATE TABLE  tbl_search (  id int(10) unsigned NOT NULL auto_increment,  name varchar(500),  PRIMARY KEY  (id),  FULLTEXT KEY idx_name (name)) ENGINE=InnoDB AUTO_INCREMENT=1;

static String encode(String input) {if (input == null) return null;StringBuilder output = new StringBuilder();for (int i = 0, c = input.length(); i < c; ++i) {char ch = input.charAt(i);if (ch >= '0' && ch <= '9' || ch >= 'A' && ch <= 'Z' || ch >= '0' && ch <= '9' || ch >= 'A' && ch <= 'Z' || ch == '_' || ch == '-') {output.append(Integer.toHexString(ch)).append(' ');} else if (ch >= 'a' && ch <= 'z' || ch >= 'a' && ch <= 'z') {output.append(Integer.toHexString((int)ch - 32)).append(' ');} else {Character.UnicodeBlock block = Character.UnicodeBlock.of(ch);if (block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS|| block == Character.UnicodeBlock.KATAKANA|| block == Character.UnicodeBlock.HIRAGANA) {output.append(Integer.toHexString(ch)).append(' ');} else {// do nothing}}}// trim blankint last = output.length() - 1;if (last > 0 && output.charAt(last) == ' ') {output.deleteCharAt(last);}return output.toString();}

Use the above Code to encode the content to be searched. For example, if the content is "Bluetooth mouse 2008", the encoded result is "84dd 76ae 9f20 32 30 38 ". Store the encoded content to the name field.

Use the following SQL statement to search

select * from tbl_search where match(name) against('"76ae 9f20 32"' in boolean mode) 

In this way, the Like index in MySQL full-text search is basically implemented.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.