MySQL full-text index

Source: Internet
Author: User
In MySQL, a full-text index is an index of type FULLTEXT. In the MySQL versions described here, FULLTEXT indexes are used with MyISAM tables (InnoDB gained FULLTEXT support later, in MySQL 5.6). They can be created on CHAR, VARCHAR, or TEXT columns, either at CREATE TABLE time or afterwards with ALTER TABLE or CREATE INDEX. For large data sets, it is much faster to load the data into a table that has no FULLTEXT index and then create the index with ALTER TABLE (or CREATE INDEX). Loading data into a table that already has a FULLTEXT index is much slower.

Full-text searches are performed with the MATCH() function:

mysql> CREATE TABLE articles (
    ->   id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    ->   title VARCHAR(200),
    ->   body TEXT,
    ->   FULLTEXT (title, body)
    -> );
Query OK, 0 rows affected (0.00 sec)


mysql> INSERT INTO articles VALUES
    -> (NULL, 'MySQL Tutorial', 'DBMS stands for DataBase ...'),
    -> (NULL, 'How To Use MySQL Efficiently', 'After you went through a ...'),
    -> (NULL, 'Optimising MySQL', 'In this tutorial we will show ...'),
    -> (NULL, '1001 MySQL Tricks', '1. Never run mysqld as root. 2. ...'),
    -> (NULL, 'MySQL vs. YourSQL', 'In the following database comparison ...'),
    -> (NULL, 'MySQL Security', 'When configured properly, MySQL ...');
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0


mysql> SELECT * FROM articles
    -> WHERE MATCH (title, body) AGAINST ('database');
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)

The MATCH() function performs a natural-language search for a string against a text collection (a set of one or more columns included in a FULLTEXT index). The search string is given as the argument to AGAINST(). The search is case-insensitive. For every row in the table, MATCH() returns a relevance value, that is, a measure of similarity between the search string and the text of that row in the columns named in the MATCH() list.


When MATCH() is used in a WHERE clause (as in the preceding example), the rows returned are automatically sorted in descending order of relevance. Relevance values are non-negative floating-point numbers. Zero relevance means no similarity. Relevance is computed from the number of words in the row, the number of unique words in that row, the total number of words in the collection, and the number of documents (rows) that contain a particular word.


Boolean-mode searches are also possible. They are described later in this article.


The preceding example is a basic illustration of how to use MATCH(): rows are returned in descending order of relevance. The next example shows how to retrieve the relevance values explicitly. Because the query has neither a WHERE clause nor an ORDER BY clause, the returned rows are not sorted.


mysql> SELECT id, MATCH (title, body) AGAINST ('Tutorial') FROM articles;
+----+------------------------------------------+
| id | MATCH (title, body) AGAINST ('Tutorial') |
+----+------------------------------------------+
|  1 |                         0.64840710366884 |
|  2 |                                        0 |
|  3 |                         0.66266459031789 |
|  4 |                                        0 |
|  5 |                                        0 |
|  6 |                                        0 |
+----+------------------------------------------+
6 rows in set (0.00 sec)

The following example is more complex. The query returns the relevance values and also returns the rows in descending order of relevance. To achieve this, specify MATCH() twice: once in the SELECT list and once in the WHERE clause. This causes no additional overhead, because the MySQL optimizer notices that the two MATCH() calls are identical and invokes the full-text search code only once.

mysql> SELECT id, body, MATCH (title, body) AGAINST
    -> ('Security implications of running MySQL as root') AS score
    -> FROM articles WHERE MATCH (title, body) AGAINST
    -> ('Security implications of running MySQL as root');
+----+-------------------------------------+-----------------+
| id | body                                | score           |
+----+-------------------------------------+-----------------+
|  4 | 1. Never run mysqld as root. 2. ... | 1.5055546709332 |
|  6 | When configured properly, MySQL ... |   1.31140957288 |
+----+-------------------------------------+-----------------+
2 rows in set (0.00 sec)


MySQL uses a very simple parser to split text into words. A "word" is any sequence of characters consisting of letters, digits, ' (apostrophe), and _ (underscore). Any "word" that appears on the stopword list or is too short (3 characters or fewer) is ignored.
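The parsing rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not MySQL's actual parser: the names `STOPWORDS`, `MIN_WORD_LEN`, and `index_words` are hypothetical, and the stopword list here is a tiny stand-in for MySQL's much longer built-in list.

```python
import re

# Hypothetical, tiny stopword list for illustration only;
# MySQL's built-in list is much longer.
STOPWORDS = {"after", "following", "through", "when", "will"}

# Words of 3 characters or fewer are ignored.
MIN_WORD_LEN = 4

# A "word" is a run of letters, digits, apostrophes, and underscores.
WORD_RE = re.compile(r"[A-Za-z0-9_']+")


def index_words(text):
    """Return the lowercased words that survive the length and stopword filters."""
    words = (w.lower() for w in WORD_RE.findall(text))
    return [w for w in words if len(w) >= MIN_WORD_LEN and w not in STOPWORDS]


print(index_words("DBMS stands for DataBase"))
# ['dbms', 'stands', 'database']
print(index_words("When configured properly, MySQL"))
# ['configured', 'properly', 'mysql']
```

In the first call, "for" is dropped because it is too short. In the second, "When" is dropped because it is on the (assumed) stopword list.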


Every valid word in the collection and in the query is weighted according to its significance in the collection or query. A word that appears in many documents therefore has a lower weight (possibly even zero), because it has lower semantic value in this particular collection. Conversely, a word that is rare receives a higher weight. The weights of the words are then combined to compute the relevance of the row.
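To make the weighting idea concrete, here is a minimal sketch that uses a simplified inverse-document-frequency weight. It is not MySQL's actual formula, which also takes row length and word counts into account. The functions `toy_weight` and `toy_relevance` are hypothetical names introduced only for this illustration.

```python
import math


def toy_weight(rows_with_word, total_rows):
    """Simplified weight: high for rare words, zero for very common ones.

    This is an illustrative stand-in, not the formula MySQL uses.
    """
    if rows_with_word == 0 or rows_with_word >= total_rows:
        return 0.0
    ratio = (total_rows - rows_with_word) / rows_with_word
    # log(ratio) is negative once the word is in more than half the rows;
    # clamp it so such words contribute nothing.
    return max(0.0, math.log(ratio))


def toy_relevance(query_words, row_words, doc_freq, total_rows):
    """Sum the weights of the distinct query words that occur in the row."""
    present = set(row_words)
    return sum(
        toy_weight(doc_freq.get(word, 0), total_rows)
        for word in set(query_words)
        if word in present
    )


# A word found in 10 of 1000 rows outweighs one found in 400 of 1000 rows,
# and a word found in 600 of 1000 rows gets zero weight.
print(toy_weight(10, 1000) > toy_weight(400, 1000) > toy_weight(600, 1000))
```

Under this toy model, a word present in more than half of the rows contributes nothing to the relevance of any row.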


This technique works best with large collections (in fact, it was carefully tuned for them). For very small tables, the word distribution does not adequately reflect the semantic value of the words, and this model may sometimes produce strange results.


mysql> SELECT * FROM articles WHERE MATCH (title, body) AGAINST ('MySQL');
Empty set (0.00 sec)


In the above example, the search for the word MySQL produces no results because that word is present in more than half of the rows. As such, it is effectively treated as a stopword (that is, a word with zero semantic value). This is the desired behavior: a natural-language query should not return every second row from a 1 GB table.


A word that matches half of the rows in a table is unlikely to locate relevant documents. In fact, it will most likely find plenty of irrelevant ones. We all know this happens far too often when we search for something on the Internet with a search engine. For this reason, such words are assigned a low semantic value in this particular data set.
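The "more than half of the rows" rule can be checked against the six sample rows with a short Python sketch. It is a simplified model of the behavior described above, not MySQL's implementation: `ROWS`, `words`, `is_too_common`, and `natural_search` are hypothetical names, each row's title and body are simply concatenated, and the result is not ranked by relevance.

```python
import re

# Title and body of each sample row, concatenated (truncated as in the article).
ROWS = {
    1: "MySQL Tutorial DBMS stands for DataBase ...",
    2: "How To Use MySQL Efficiently After you went through a ...",
    3: "Optimising MySQL In this tutorial we will show ...",
    4: "1001 MySQL Tricks 1. Never run mysqld as root. 2. ...",
    5: "MySQL vs. YourSQL In the following database comparison ...",
    6: "MySQL Security When configured properly, MySQL ...",
}


def words(text):
    """Lowercased set of words (letters, digits, apostrophes, underscores)."""
    return {w.lower() for w in re.findall(r"[A-Za-z0-9_']+", text)}


def is_too_common(word, rows):
    """True if the word occurs in more than half of the rows."""
    hits = sum(1 for text in rows.values() if word.lower() in words(text))
    return hits > len(rows) / 2


def natural_search(word, rows):
    """Return matching row ids (unranked); too-common words match nothing."""
    if is_too_common(word, rows):
        return []
    return sorted(i for i, text in rows.items() if word.lower() in words(text))


print(natural_search("MySQL", ROWS))     # []
print(natural_search("database", ROWS))  # [1, 5]
```

The word "MySQL" occurs in all six rows, so the sketch returns nothing for it, while "database" occurs in only two rows and is found.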


As of version 4.0.1, MySQL can also perform boolean full-text searches by using the IN BOOLEAN MODE modifier.


mysql> SELECT * FROM articles WHERE MATCH (title, body)
    -> AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+------------------------------+-------------------------------------+
| id | title                        | body                                |
+----+------------------------------+-------------------------------------+
|  1 | MySQL Tutorial               | DBMS stands for DataBase ...        |
|  2 | How To Use MySQL Efficiently | After you went through a ...        |
|  3 | Optimising MySQL             | In this tutorial we will show ...   |
|  4 | 1001 MySQL Tricks            | 1. Never run mysqld as root. 2. ... |
|  6 | MySQL Security               | When configured properly, MySQL ... |
+----+------------------------------+-------------------------------------+
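The query above requires the word MySQL (the + operator) and excludes rows that contain YourSQL (the - operator). A minimal Python sketch of just these two operators follows. It is an assumption-laden toy, not MySQL's boolean parser: `ROWS`, `words`, and `boolean_search` are hypothetical names, terms without an operator are ignored, and a query without any + term is treated as matching nothing.

```python
import re

# Title and body of each sample row, concatenated (truncated as in the article).
ROWS = {
    1: "MySQL Tutorial DBMS stands for DataBase ...",
    2: "How To Use MySQL Efficiently After you went through a ...",
    3: "Optimising MySQL In this tutorial we will show ...",
    4: "1001 MySQL Tricks 1. Never run mysqld as root. 2. ...",
    5: "MySQL vs. YourSQL In the following database comparison ...",
    6: "MySQL Security When configured properly, MySQL ...",
}


def words(text):
    """Lowercased set of words (letters, digits, apostrophes, underscores)."""
    return {w.lower() for w in re.findall(r"[A-Za-z0-9_']+", text)}


def boolean_search(query, rows):
    """Return ids of rows with every +term and no -term (other terms ignored)."""
    required, excluded = set(), set()
    for term in query.split():
        if term.startswith("+") and len(term) > 1:
            required.add(term[1:].lower())
        elif term.startswith("-") and len(term) > 1:
            excluded.add(term[1:].lower())
    if not required:
        # Simplification: this sketch only handles queries with a +term.
        return []
    return sorted(
        i for i, text in rows.items()
        if required <= words(text) and not (excluded & words(text))
    )


print(boolean_search("+MySQL -YourSQL", ROWS))  # [1, 2, 3, 4, 6]
```

The result matches the ids in the table above. Note that the word MySQL matches here even though it occurs in every row, just as it does in the boolean-mode query above.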
