MySQL Crash Course #12 # Chapter Full-text searching

Source: Internet
Author: User
Tags mysql version

INDEX

Becauseperformance, Smart resultsFor many reasons, when searching for text, Full-Text search one is better than the wildcard wildcards regular expression, which indexes the specified column to quickly locate the corresponding row, andSmart sorting of result sets。The start query extension canLet's get related lines that don't necessarily contain keywords,Enable Boolean mode You can let us specify what words should not be included in the search, the weights of each keyword, and so on.

  Full-text index usage instructions.

  Do not make full-text indexing before importing data.

WARNING

Not all database engines support full-text search. MyISAM supports full-text indexing, InnoDB does not support full-text indexing.

PS. It is said that MySQL version 5.6 and above InnoDB support full-text indexing, and the syntax format is similar to MyISAM's full-text index.

Understanding Full-text Searching

Wildcard wildcards regular expressions are powerful, but they have several serious drawbacks:

    • Performance: Wildcard wildcards regular the expression is always matched to each row in the table, so the two matching methods are time-consuming when the number of rows increases.
    • Precise control: The wildcards regular expression is difficult to precisely control what does not match what.
    • Smart results: For example, if there is a match or multiple matches in the field content, the wildcard wildcards regular returns the row equally, if the field content does not contain a match, but includes a related (similar) word that does not return the row.

All of these shortcomings can be solved by full-text indexing. When using full-text search, MySQL does not need to match each row individually, but instead creates the index of the word (in the specified column) so that MySQL can quickly and efficiently determine which words match, which mismatches, and so on.

Can be understood as MySQL for the specified row livings into a directory, the directory labeled XXX words containing the line is 1th, 4, 8, 9 lines (assuming), with this directory, search for xxx words can quickly find the corresponding line 1, 4, 8, 9, similar to the space for time.

Using Full-text Searching
CREATE TABLEproductnotes (note_idint            not NULLauto_increment, prod_idChar(Ten) not NULL, Note_datedatetime        not NULL, Note_texttext          NULL ,  PRIMARY KEY(note_id), Fulltext (Note_text)) ENGINE=MyISAM;

PS. Full-Text Indexing multiple fields are also possible!

Once the full-text index is defined, MySQL automatically maintains the index, and when the record is added and the record is deleted, the index changes accordingly.

Don ' t use fulltext when importing Data

Updating indexes takes timenot a lot of time, but time nonetheless. If you are importing data to a new table, you should not enable fulltext indexing at that time. Rather, first import all of the data, and then modify the table to define fulltext. This makes for a much faster data import (and the total time needed to index all data would be is less than the sum of the TIM E needed to index each row individually).

Performing Full-text Searches
Mysql> SELECTNote_text -  fromproductnotes - WHEREMatch (Note_text) against ('Rabbit');+------------------------------------------------------------------------------------------------------------- ---------+|Note_text|+------------------------------------------------------------------------------------------------------------- ---------+|Customer Complaint:rabbit has been able toDetect trap, food apparently less effective now.||Quantity varies, sold byThe sackLoad. AllGuaranteed toBe bright andOrange andSuitable for  Use  asRabbit bait.|+------------------------------------------------------------------------------------------------------------- ---------+2Rowsinch Set(0.00Sec

Match (Colunm_name must be consistent with the column defined for Fulltext) and against (' xxx '), remember these two functions! In addition, the search is case insensitive!

In fact, the use of wildcards can also be very convenient to achieve the above effect, but the performance and results will be different:

Mysql> SELECTNote_text -  fromproductnotes - WHERENote_text like '%rabbit%';+------------------------------------------------------------------------------------------------------------- ---------+|Note_text|+------------------------------------------------------------------------------------------------------------- ---------+|Quantity varies, sold byThe sackLoad. AllGuaranteed toBe bright andOrange andSuitable for  Use  asRabbit bait.||Customer Complaint:rabbit has been able toDetect trap, food apparently less effective now.|+------------------------------------------------------------------------------------------------------------- ---------+2Rowsinch Set(0.00Sec

We were shocked to find that although the number of rows is correct, but the order is not consistent ...

This is because full-text search has one important feature: result ranking (the ranking of results), ranking high priority return. To view the rank of a column:

SELECT Note_text,       Match (Note_text) against ('rabbit' as rank from Productnotes;

Rank the higher the ranking, this is related to whether the keywords appear or not, appear in the front or behind factors.

Using Query Expansion

When query expansion is used, MySQL makes-passes through the data and indexes to perform your search:

    • First, a basic Full-text search is performed to find all rows that match the search criteria.

    • Next, MySQL examines those matched rows and selects all useful words (we'll explain how MySQL figures out what's useful a nd what are not shortly).

    • Then, MySQL performs the full-text search again, this time using not just the original criteria, but also all of the Usefu L words.

Using Query Expansion You can therefore find results this might be relevant, even if they don ' t contain the exact words fo R which you were looking.

Mysql> SELECTNote_text -  fromproductnotes - WHEREMatch (Note_text) against ('Rabbit'  withQUERY EXPANSION);+------------------------------------------------------------------------------------------------------------ ----------------------------------------------+|Note_text|+------------------------------------------------------------------------------------------------------------- ---------------------------------------------+|Quantity varies, sold byThe sackLoad. AllGuaranteed toBe bright andOrange andSuitable for  Use  asRabbitBait.||CustomerComplaint: Rabbit has been able toDetect trap, food apparently less effective now.||  Customer complaint:circular holeinchSafe FloorCan apparently be easily cut withHandsaw.|| Customer complaint:sticks notIndividually wrapped, too easy toMistakenly detonate AllAtonce.                         Recommend individual wrapping. || Customer complaint: notHeavy Enough toGenerate flying stars around head ofVictim.IfBeing purchased forDropping, recommend ANV02orANV03 instead.||Multiple customerreturns, anvils failing to DropFast enoughorFalling backwards onPurchaser. Recommend that customer considers using heavier anvils.|+------------------------------------------------------------------------------------------------------------- ---------------------------------------------+6Rowsinch Set(0.00Sec
Boolean Text Searches
SELECT Note_text  from productnotes WHERE Match (Note_text) against ('+rabbit +bait "' in BOOLEAN MODE);

For more information, refer to the 18th chapter of the book, using various symbols to indicate priority of keywords, what words to include, what words to exclude.

Full-text Search Usage Notes

Important NOTE:

    1. MySQL does not index short words , and short words are defined by default to be within three letters.
    2. MySQL has a built-in stop word list , it's the same as the short word, of course, this can also be manually modified, specific methods can search for information, check the document.
    3. Many words appear so frequently that searching for them will be useless (returning too many results). As a result, MySQL grants 50% the rule that a word appears in 50% or more rows, which is treated as a deactivated word and is effectively ignored. (The 50% rule is not used in the in Boolean mode).
    4. If the rows in the table are less than three rows (that is, two or more rows, because each occurrence of a word is always at least 50% of the number of rows), the full-text search returns no results.
    5. single quotation marks within a word are ignored, for example, the index of don ' t is dont
    6. Languages that do not have a word delimiter, including Japanese and Chinese, will not return full-text results correctly.

So how to achieve Chinese word segmentation & Full-text index it? Pending update

MySQL Crash Course #12 # Chapter Full-text searching

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.