MySQL full-text retrieval notes

1. MySQL 4.x and later provide full-text retrieval support, but the table's storage engine must be MyISAM (InnoDB gained FULLTEXT support only in MySQL 5.6). The following is a table creation statement; note the storage engine type:

CREATE TABLE articles (
    id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    title VARCHAR(200),
    body TEXT,
    FULLTEXT (title, body)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

FULLTEXT (title, body) creates a single full-text index covering the title and body columns. Note that a MATCH() call that uses this index must name both columns together.

2. Insert test data

INSERT INTO articles (title, body) VALUES
    ('MySQL Tutorial', 'DBMS stands for DataBase ...'),
    ('How To Use MySQL well', 'After you went through ...'),
    ('Optimizing MySQL', 'In this tutorial we will show ...'),
    ('1001 MySQL Tricks', '1. Never run mysqld as root. 2. ...'),
    ('MySQL vs. YourSQL', 'In the following database comparison ...'),
    ('MySQL security', 'When configured properly, MySQL ...');

3. Full-text search test
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('database');
The results are as follows:
5    MySQL vs. YourSQL    In the following database comparison ...
1    MySQL Tutorial       DBMS stands for DataBase ...
Full-text matching is case-insensitive.

4. Possible troubles
So far everything has gone smoothly, but what if I change the search statement to the following?
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('well');
The query returned nothing, which puzzled me for a long time. I then searched online and found the following explanation:
MySQL enforces a minimum word length for full-text indexing. The default value is 4: words shorter than four characters are not indexed, so they can never be matched. Run SHOW VARIABLES LIKE 'ft_min_word_len' to view the current value. To change it, edit the MySQL configuration file my.ini and add a line under [mysqld], for example ft_min_word_len = 2, then restart MySQL. Existing FULLTEXT indexes must be rebuilt afterwards (for example with REPAIR TABLE articles QUICK) before the new value takes effect.
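The effect of the minimum word length can be illustrated with a small Python sketch. It is only a simulation under stated assumptions: the function name `indexable_words` and the regular-expression tokenizer are hypothetical simplifications, not MySQL's actual parser, and stopword removal is ignored.

```python
import re

def indexable_words(text, min_word_len=4):
    """Return the lowercased words that survive the length filter.

    Simplified stand-in for full-text tokenization (assumption):
    a word is a run of letters, digits, underscores or apostrophes.
    """
    words = re.findall(r"[a-z0-9_']+", text.lower())
    return [w for w in words if len(w) >= min_word_len]

title = "How To Use MySQL well"
print(indexable_words(title))     # ['mysql', 'well']
print(indexable_words(title, 2))  # ['how', 'to', 'use', 'mysql', 'well']
```

With the default of 4, short words such as "how" and "use" drop out, while a four-character word such as "well" still passes the length check.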
I took the minimum word length to be the reason the query for 'well' returned nothing, so I modified the configuration file as described, restarted the MySQL server, and ran the SHOW command to confirm the new value. Note, however, that 'well' is exactly four characters long, so the default limit does not exclude it; it is on MyISAM's default stopword list, and stopwords are never indexed.

In addition, MySQL uses the weight of a word to decide whether it appears in the result set. MySQL computes a weight for every eligible word in the collection and in the query. A word that appears in many rows gets a lower weight (possibly zero), because it has less semantic value in this particular collection; a rare word gets a higher weight. The default threshold is 50%: a word found in 50% or more of the rows is treated as having no weight. 'MySQL', for example, appears in every row of the test data (100%), so a natural-language search for it returns nothing; only words found in fewer than 50% of the rows can produce results.
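The 50% rule can be checked against the six test rows with a rough Python sketch. This is a simulation, not MySQL's weighting formula: `row_words` and `passes_threshold` are hypothetical helper names, the tokenizer is a simplification, and only the share of rows containing a word is computed.

```python
import re

# The test rows inserted earlier (title, body).
ROWS = [
    ("MySQL Tutorial", "DBMS stands for DataBase ..."),
    ("How To Use MySQL well", "After you went through ..."),
    ("Optimizing MySQL", "In this tutorial we will show ..."),
    ("1001 MySQL Tricks", "1. Never run mysqld as root. 2. ..."),
    ("MySQL vs. YourSQL", "In the following database comparison ..."),
    ("MySQL security", "When configured properly, MySQL ..."),
]

def row_words(title, body):
    """Set of lowercased words in one row (simplified tokenizer)."""
    return set(re.findall(r"[a-z0-9_']+", f"{title} {body}".lower()))

def passes_threshold(word, rows=ROWS, threshold=0.5):
    """True if the word occurs in fewer than `threshold` of the rows."""
    hits = sum(1 for title, body in rows if word.lower() in row_words(title, body))
    return hits / len(rows) < threshold

print(passes_threshold("MySQL"))     # False: present in 6 of 6 rows
print(passes_threshold("database"))  # True: present in 2 of 6 rows
```

Under these assumptions, 'database' stays searchable in natural-language mode while 'MySQL' is treated as too common.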
But what if we do not want the weight to be considered? MySQL provides boolean full-text search, to which the 50% threshold does not apply. If well appeared in every record and ft_min_word_len were changed to 2, the result set of the following statement would contain all records (provided well is not excluded as a stopword):
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('well' IN BOOLEAN MODE);

5. Boolean full-text search syntax
IN BOOLEAN MODE specifies that the full-text search runs in boolean mode. MySQL also offers syntax similar to what we use in search engines: logical AND, logical OR, and logical NOT. The following SQL statements illustrate it:
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('+apple -banana' IN BOOLEAN MODE);
+ indicates AND: the word must be present. - indicates NOT: the word must not be present.
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('apple banana' IN BOOLEAN MODE);
There is a space between apple and banana, and a space indicates OR: at least one of apple and banana must be present.
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('+apple banana' IN BOOLEAN MODE);
The row must contain apple, but it gets a higher weight if it also contains banana.
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('+apple ~banana' IN BOOLEAN MODE);
~ is a negation operator on relevance, not an exclusive OR. The returned row must contain apple, but if it also contains banana its weight is reduced. This is less strict than '+apple -banana', which does not return a row at all if it contains banana.
SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('+apple +(>banana

6. Chinese full-text search

A. It is difficult to extend MySQL itself to support Chinese full-text search. Instead, create a corresponding English index table for the Chinese content table: the FULLTEXT-indexed columns of each record are converted by a fixed rule (for example, base64-encoding everything) and stored in the English index table, and the content table and the index table share the same id. At search time, the search term is converted by the same rule and then used against the index table. To support full-text search by pinyin, also add the pinyin form of the content to the index table (this requires converting Chinese to pinyin). To support cross-language search between Chinese and English, for example returning the Chinese form of a name when searching for William and vice versa, also store the English translation in the index table.
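The index-table idea above can be sketched in Python. This is only an illustration under assumptions: the text is taken to be already segmented into words, the helper names (`encode_word`, `to_index_text`, `decode_word`) are hypothetical, and the sample words are invented. Base64 output can contain +, / and =, which a full-text parser may treat as separators, and base64 is case-sensitive, so a real setup would need to verify the chosen encoding against MySQL's tokenization and collation.

```python
import base64

def encode_word(word: str) -> str:
    """Encode one already-segmented word as an ASCII base64 token."""
    return base64.b64encode(word.encode("utf-8")).decode("ascii")

def to_index_text(words) -> str:
    """Join the encoded words with spaces for storage in the index table."""
    return " ".join(encode_word(w) for w in words)

def decode_word(token: str) -> str:
    """Reverse of encode_word, useful for debugging the index table."""
    return base64.b64decode(token).decode("utf-8")

# Hypothetical pre-segmented content of one record.
segmented = ["数据库", "全文", "检索"]
index_text = to_index_text(segmented)

# The search term goes through the same rule before it is used in MATCH ... AGAINST.
query_token = encode_word("全文")
print(query_token in index_text.split())
```

The same `encode_word` rule is applied both when writing the index table and when preparing the search term, which is what makes the two sides comparable.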
According to a reference found online, a concrete practice is to first segment the Chinese content into words, then convert each Chinese character to its four-digit location code and save the result to the index table. At search time, a search term containing Chinese characters must likewise be segmented and converted to four-digit location codes before the full-text search is run against the index table.

7. Checklist
A. Built-in full-text search requires MySQL 4.x or later and, in the versions discussed here, a table whose storage engine is MyISAM.
B. MySQL full-text search does not support Chinese by default; English searches ignore case.
C. The default minimum word length for MySQL full-text search is 4, that is, a keyword must be at least 4 characters long to be indexed and matched.
D. All columns in a FULLTEXT index must use the same character set.
E. Weight is also taken into account when MySQL full-text search returns results.
F. MySQL full-text search also supports a flexible boolean full-text search mode.
G. For more information, see the official MySQL 5 manual.
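The location-code conversion described above for Chinese text can be sketched as follows. The sketch assumes that "location code" means the GB2312 row and cell numbers (two decimal digits each), that words are already segmented, and that each word is either pure ASCII or pure Chinese; the function names are hypothetical.

```python
def location_code(ch: str) -> str:
    """Return the four-digit GB2312 location code (row + cell) of one character."""
    raw = ch.encode("gb2312")  # raises UnicodeEncodeError outside GB2312
    if len(raw) != 2:
        raise ValueError(f"{ch!r} is not a two-byte GB2312 character")
    # Each GB2312 byte is the row or cell number plus 0xA0.
    row, cell = raw[0] - 0xA0, raw[1] - 0xA0
    return f"{row:02d}{cell:02d}"

def to_location_codes(word: str) -> str:
    """Convert a segmented word; pure-ASCII words are kept unchanged."""
    if word.isascii():
        return word
    return "".join(location_code(ch) for ch in word)

# Hypothetical pre-segmented content of one record.
segmented = ["中文", "MySQL", "检索"]
index_text = " ".join(to_location_codes(w) for w in segmented)

print(location_code("啊"))  # 1601
print(location_code("中"))  # 5448
print(index_text)
```

Concatenating the codes of all characters in a word into one token is a design choice of this sketch; the search term would be converted by the same function before being passed to MATCH ... AGAINST.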
Author: feichexia