MySQL Text Search

Source: Internet
Author: User

Full-Text Search

MySQL supports several basic database engines, but not all engines support full-text search. The two most commonly used engines are MyISAM and InnoDB, which support full-text search, which is not supported.

Understanding Full-Text Search

In the previous study, we all know that there are two ways to match text. One is to use the LIKE keyword for matching, and the other is to use regular expressions to match.

Although regular expressions are used, you can write enough complex matching patterns to find the rows you want. However, there are several important limitations to these effects:

1. Performance: Wildcard wildcards regular the expression tries to match all rows in the table, and these searches rarely use the table index, so it's quite slow

2, clear control: In the process of matching, we rarely know quite clearly what we want to match, what does not match.

3, has the priority result: when uses the wildcard wildcards regular the expression to match, only then returns the result which contains the match condition, but does not distinguish between the match degree (multiple matches and the individual match difference, in front matches and in the subsequent matching difference).

All of the limitations mentioned above can be resolved using full-text search.

Use full-Text search

For a full-text search, you must index the columns that are searched and re-index them as the data changes.

After the table is properly designed, MySQL automatically makes all indexes and re-indexes.

After the index, select can be used with match () and against () to actually perform the index.

So, how to enable full-text search support when building a table???

As follows:

The above is done by fulltext the corresponding index for the specified row, if you want to specify more than one column, you can put more than one column in Fulltext.

After the definition, MySQL automatically maintains the index. The index is updated whenever you add, update, or delete.

Make a full-text search

This table has the ability to search full text after the table is built, so let's start with a full-text search.
Before the index, I added the following to the Test_text table:

When we want to use full-text search to get data containing MySQL content.

As follows:

select * from test_text where match(content) against(‘MySQL‘);

The match (content) in the above statement instructs MySQL to search for the specified column, note that the value passed to match () must be the same as in the Fulltext () definition. Against (' MySQL ') specifies the word MySQL as the search text.

As you can see from the results, we got two rows of data including MySQL strings.

The above search can also be done using the like that we have learned, as follows:

The results of these two methods are the same. The order is the same, in theory, the use of full-text search will be the result of the priority to sort out the output, but from the results can be seen in the first statement in the location of MySQL in the second statement than the location of MySQL is still lagging, why appear in the front?? , and MySQL must know a little bit about the above

In order to verify the problem above, we can check the priority of the result. Before I looked at the priority, I added a row of data starting with MySQL. Let's take a look at the priority level below.

select id,content,match(content) against(‘MySQL‘) from test_text ;

As can be seen from the results, it is true that the priority of the id=7 row is greater than the priority of id=8, because the priority is not related to the occurrence of the position, but also related to factors such as the length of the text.

After adding a line of row data starting with MySQL, let's look at the results of both of these methods together.

As can be seen from the two diagram above, it is true that a full-text search is used to sort the contents of the search, while wildcards are not sorted. As for full-text search, the sort of search is determined by a number of factors, such as where it appears, the number of occurrences, the length of the text itself, and so on.

The higher the level, the more the position appears, the search engine we use in our life (such as Baidu, Google, etc.) searches for things, I believe it is also in accordance with such logic to appear results, of course, in addition to the Baidu company received other people's money to the mismatch of the content artificially put in front.

Using query extensions: With query expansion

The previous article is just plain text search.

However, we often have more stringent requirements in life, for example, I want to search for content including MySQL string, you also want to be related to MySQL content (even if there is no MySQL word), how to do it??

This requires us to use the query extension.

Usage: on against (keyword with query expansion)

As you can see from the results, when we use the query extension, we get more relevant content, even if the content doesn't include the MySQL word.

I'm not sure how MySQL is going to get the relevant keywords from MySQL.

Boolean Text Search

MySQL supports a different form of full-text search, called a Boolean method. In a Boolean way, you can provide details such as the following:

  • the words to match;
  • The words to be excluded,
  • Arrange hints (to specify that certain words are more important than others, and that more important words are of higher rank)
  • Expression Grouping.

To illustrate this: even if there is no Fulltext index can be used, but the speed is quite quite slow.

To show the role of in Boolean mode, a simple example is described.

select * from test_text where match(content) against(‘MySQL -interesting‘ in boolean mode);//返回必须包含MySQL不能包含interesting内容的数据。

For full-text Boolean operators, see the following table (from: "MySQL must Know" book)

The specific usage examples of full-text Boolean operators, similar to the example above, are not described here.

Instructions for using full-text search

1, MySQL with a built-in non-used word list. These words are always ignored when indexing full-text data. If necessary, you can override this list.

2, many words appear very high frequency, search they are not useful (because return too many results). Therefore, MySQL prescribes a 50% rule that if a word appears in more than 50% rows, it is ignored as a non-word. 50% rule is not used in Boolean mode.

3. If the number of rows in the table is less than 3 rows, the full-text search does not return a result. Therefore, each word is either not present and appears at least 50%.

4. Ignore the single quotation mark in the word. For example, the Don ' t index is dont.

MySQL Text Search

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.