MySQL Full-text Indexing application brief tutorial

Source: Internet
Author: User

This article introduces the basics of MySQL full-text indexing in the following ways:
    1. Several considerations for MySQL full-text indexing
    2. Syntax for full-text indexing
    3. Introduction to several types of searches
    4. Examples of several search types
Several considerations for full-text indexing
  • The search must be on an indexed column of type Fulltext, and the column specified in match must be specified in fulltext
  • Can only be applied in tables with table engine MyISAM type (MySQL 5.6 can also be used in the InnoDB table engine)
  • You can only create a full-text index above columns of char, varchar, text type
  • Like a normal index, you can specify it when you define a table, or you can add or modify a table after you create it
  • For a large order of record inserts, creating an index after inserting data into a table that has no index is much faster than inserting it into an indexed data table
  • The search string must be a constant string and cannot be the column name of the table
  • There is no match (limited only in natural search) when the search record has more than 50% selectivity
Full-text index search syntax

MATCH (column name 1, column name 2,...) Against (search string [search modifier])

The column names specified in Match 1, 2, and so on, are the names of the columns specified in the full-text index, followed by the search modifiers described below:

Search_modifier:

{

In NATURAL LANGUAGE MODE

| In NATURAL LANGUAGE MODE with QUERY EXPANSION

| In BOOLEAN MODE

| With QUERY EXPANSION

}

Introduction to several types of searches

The above search modifier, which actually illustrates 3 types of full-text search

In NATURAL LANGUAGE MODE

Summary: Default search form (without any search modifier or modifier as in NATURAL LANGUAGE MODE)

Characteristics:

    1. The characters in the search string are resolved to normal characters, with no special meaning
    2. Filtering a string in a masked character list
    3. When the selectivity of a record exceeds 50%, it is usually considered a mismatch.
    4. The return record is sorted according to the relevance of the record
In BOOLEAN MODE

Summary: Boolean mode search (in case the search modifier is in Boolean mode)

Characteristics:

    1. The meanings of special characters in the search string are parsed according to certain rules, and some logical meanings are made. such as: A word must appear, or can not appear and so on.
    2. Records returned by this type of search are not sorted by relevance

With QUERY EXPANSION

Introduction: A slightly more complex form of search, in fact, 2 natural searches, you can return records of the direct introduction of the record, modifier in NATURAL LANGUAGE MODE with query EXPANSION or with query EXPANSION Modifier

Features: This type of search actually provides an indirect searching function, such as: I search for a word, and the first row returned does not contain any strings in the search term. A second match can be made based on the first search results, which may result in a matching record of some indirect relationships.

Introduction to several examples of search types

In NATURAL LANGUAGE mode application:

Or is it applied in the product table, where we have a full-text index in the Name field, because I need to match the relevant record in the name column according to the keyword

The SQL statements are as follows:

SELECT * from product WHERE match (name) against (' auto ')

Time is also good, in nearly 87w record hit 1w More, time 1.15 seconds, the effect is good

Note: By default, the record is returned from high to low based on relevance.

We can select match (name) against (' auto ') from product to view the record's correlation values, the values are between 0 and 1, and 0 means the record mismatch

A few important features:

1. Which words are ignored

Search term is too short the default full-text index thinks that a word of more than 4 characters is a valid word, and we can modify Ft_min_word_len to configure it in the configuration

Block words in the thesaurus the default full-text index masks Some common words, because they are too common to have any semantic effect, so the search process is negligible. Of course, this list is also configurable.

2. How to make a participle

A full-text index considers a contiguous valid character (the character set in the regular \w match) to be a word, or it can contain a "'", but a contiguous two ' will be considered a delimiter. Other separators such as: spaces, commas, periods, etc.

In BOOLEAN mode applications:

In Boolean matching mode, we can add some special symbols to increase the logic function of the search process. Examples provided on the official website (search for statements that contain MySQL strings and do not contain yousql):

SELECT * from articles WHERE MATCH (title,body)

-Against (' +mysql-yoursql ' in BOOLEAN MODE);

As we can see, we have more control over the search, which looks "tall".

In fact, the above operation implies a few meanings:

Plus: equivalent to and

Minus: equivalent to not

No: equivalent to or

Here are a few important features of Boolean type search:

1. There is no 50% record selectivity limit, even if the search results record more than 50% of the total will also return results

2. Does not automatically sort in descending order by record relevance

3. Can be applied directly on the full-text index without creating fulltext, but this will be very slow query, so it is not used.

4. Support minimum and maximum word length

5. Apply a screen word list

Boolean Search Support operators:

N Plus +: Indicates that the decorated word must appear in the record

N minus-: Indicates that the decorated word must not appear in the record

n does not have any operator: the word is optional, but the record that contains the word is highly correlated

N Double quotes ": a phrase as a match. For example, "one word" matches a word in one word

Here are some examples of official:

A record that contains at least one word

' Apple Banana '

Must contain two words.

' +apple +juice '

Must contain Apple, record correlation with the Macintosh, or it may not contain

' +apple Macintosh '

Must contain Apple and cannot shout a Macintosh

' +apple-macintosh '

Find records for Apple's opening words

' Apple* '

Full match some words word

' "Some words" '

Understand the basic MySQL full-text indexing knowledge, think its full-text index than like of course is a lot stronger. But the high-level search is still a bit crude, and performance issues are worrying.

I am only as a beginner to understand, but also to the official website some basic knowledge of translation.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.