MYSQL 3: Full Text Search

Source: Internet
Author: User


1. Full-text search

1. MyISAM supports full-text search; InnoDB supports it only from MySQL 5.6 onward.
2. With full-text search, MySQL does not need to examine each row separately or analyze and process each word separately. It builds an index of the words in the specified columns and searches against that index. This lets MySQL determine quickly and efficiently which words match, which do not, how often they match, and so on.

2. Using full-text search

1. To perform full-text searches, the columns to be searched must be indexed and re-indexed as the data changes. Once the table columns are defined appropriately, MySQL performs all indexing and re-indexing automatically. After indexing, SELECT can be used with Match() and Against() to perform the actual search.
2. Enable full-text search when creating the table.

[SQL]
create table productnotes
(
  note_id   int  not null auto_increment,
  note_text text null,
  primary key (note_id),
  fulltext (note_text)
) engine = MyISAM;

Once the index is defined, MySQL maintains it automatically: it is updated whenever a row is added, updated, or deleted.

3. Do not enable FULLTEXT while importing data into a new table, because updating the index for every inserted row takes time. Import the data first, then add the FULLTEXT index.
4. Perform a full-text search. Match() specifies the columns to search, and Against() specifies the search expression to use.

[SQL]
mysql> select * from productnotes
    -> where Match(note_text) Against('designed');
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       6 | LimsLink is designed to interface output from chromatography data systems (CDSs) and to LIMS.                             |
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
+---------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.03 sec)

5. The value passed to Match() must be the same as in the FULLTEXT() definition.
   If multiple columns are specified, all of them must be listed (and in the correct order).
6. Full-text searches are case-insensitive unless BINARY is used.

[SQL]
mysql> select * from productnotes
    -> where BINARY Match(note_text) Against('line');
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
+---------+---------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.05 sec)

7. Ranking results is an important part of full-text search. Rows with a higher rank are returned first. MySQL calculates the rank from the number of words in the row, the number of unique words, the total number of words in the entire index, and the number of rows that contain the word. A row in which the word appears near the start of the text ranks higher than a row in which it appears later.

[SQL]
mysql> select note_id, Match(note_text) Against('this line') as `rank`, note_text
    -> from productnotes
    -> where Match(note_text) Against('this line');
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | rank             | note_text                                                                                                                 |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | 0.81339610830754 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | 0.76517958501676 | specificities include both alpha-beta and beta-beta. this line from chromatography data systems (CDSs) and to LIMS.       |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

8. Query expansion. When query expansion is used, MySQL scans the data and the index twice to complete the search. First, it performs a basic full-text search to find all rows that match the search condition.
   Second, MySQL examines the matching rows and selects all useful words. It then performs the full-text search again, this time using not only the original criteria but also all of the useful words. Query expansion can therefore find relevant rows even if they do not contain the exact words searched for. The more rows the table has, the better the results that query expansion returns. Query expansion was introduced in MySQL 4.1.1.

[SQL]
mysql> select note_id, Match(note_text) Against('this line') as `rank`, note_text
    -> from productnotes
    -> where Match(note_text) Against('this line' with query expansion);
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | rank             | note_text                                                                                                                 |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | 0.81339610830754 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | 0.76517958501676 | specificities include both alpha-beta and beta-beta. this line from chromatography data systems (CDSs) and to LIMS.       |
|       3 |                0 | Human S-100 monoclonal and polyclonal specificities include both alpha-beta and beta-beta isoforms.                       |
|       6 |                0 | LimsLink is designed to interface output from chromatography data systems (CDSs) and to LIMS.                             |
|       1 |                0 | PepTool allows users to store, manage, analyze, and visualize protein data.                                               |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

9. Boolean mode. Boolean mode lets you specify: the words to match; the words to exclude; ranking hints (marking some words as more important than others); expression grouping; and more.
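Ranking hints and expression grouping are listed above but not demonstrated in this article. The following is only a hedged sketch of how they might be combined, assuming the productnotes table defined earlier. The operators `>` and `<` (raise or lower a word's contribution to the rank) and parentheses (grouping) come from MySQL's Boolean search syntax rather than from the original text, and the alias `relevance` is a made-up name for illustration.

```sql
-- Sketch only: rows must contain "line" and at least one of the grouped words.
-- ">proprietary" raises the rank of rows containing "proprietary";
-- "<systems" lowers the rank of rows containing "systems".
SELECT note_id,
       MATCH(note_text) AGAINST('+line +(>proprietary <systems)' IN BOOLEAN MODE) AS relevance,
       note_text
FROM productnotes
WHERE MATCH(note_text) AGAINST('+line +(>proprietary <systems)' IN BOOLEAN MODE)
ORDER BY relevance DESC;
```

The simpler operators (`+`, `-`, `*`, and quoted phrases) appear in the examples that follow.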
[SQL]
mysql> select note_id, note_text
    -> from productnotes
    -> where Match(note_text) Against('line' in boolean mode);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | specificities include both alpha-beta and beta-beta. this line from chromatography data systems (CDSs) and to LIMS.       |
+---------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

Boolean mode can be used even if there is no FULLTEXT index, but the search is then very slow.

[SQL]
mysql> select note_id, note_text /* matches line and does not contain systems */
    -> from productnotes
    -> where Match(note_text) Against('line -systems*' in boolean mode);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
+---------+---------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select note_id, note_text /* matches line and systems */
    -> from productnotes
    -> where Match(note_text) Against('+line +systems' in boolean mode);
+---------+---------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                           |
+---------+---------------------------------------------------------------------------------------------------------------------+
|       7 | specificities include both alpha-beta and beta-beta. this line from chromatography data systems (CDSs) and to LIMS. |
+---------+---------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select note_id, note_text /* matches line or systems */
    -> from productnotes
    -> where Match(note_text) Against('line systems' in boolean mode);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       6 | LimsLink is designed to interface output from chromatography data systems (CDSs) and to LIMS.                             |
|       7 | specificities include both alpha-beta and beta-beta. this line from chromatography data systems (CDSs) and to LIMS.       |
+---------+---------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> select note_id, note_text /* matches the phrase */
    -> from productnotes
    -> where Match(note_text) Against('"This line"' in boolean mode);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | specificities include both alpha-beta and beta-beta. this line from chromatography data systems (CDSs) and to LIMS.       |
+---------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

10. Usage notes

- When full-text data is indexed, short words are ignored and excluded from the index. Short words are defined as words of three or fewer characters (this number can be changed if needed).
- MySQL has a built-in stopword list; these words are always ignored when full-text data is indexed. The list can be overridden if necessary.
- MySQL applies a 50% rule.
  If a word appears in more than 50% of the rows, it is ignored as a non-word. The 50% rule does not apply in Boolean mode.
- If the table has fewer than three rows, a full-text search returns no results (because every word either does not appear at all or appears in at least 50% of the rows).
- Single quotes within words are ignored. For example, don't is indexed as dont.
- Languages that have no word delimiters cannot return full-text search results properly.
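The short-word threshold and the stopword list mentioned in the notes above are server settings. As a hedged sketch, assuming MyISAM tables and the productnotes table from earlier, they might be inspected as shown below. The variable names `ft_min_word_len` and `ft_stopword_file` are MySQL system variables that the original text does not mention, so check them against the documentation for the server version in use.

```sql
-- Sketch only: inspect the MyISAM full-text settings behind the notes above.
SHOW VARIABLES LIKE 'ft_min_word_len';   -- minimum length of an indexed word
SHOW VARIABLES LIKE 'ft_stopword_file';  -- where the stopword list comes from

-- After either setting is changed in the server configuration and the server
-- is restarted, the existing FULLTEXT index must be rebuilt to pick it up.
REPAIR TABLE productnotes QUICK;
```

Both settings affect how the index is built, which is why existing indexes need rebuilding after a change.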
