[Repost] MySQL Beginner's Notes, Part 3: Full-Text Search

Source: Internet
Author: User

Reprint Address: http://www.2cto.com/database/201212/173873.html

I. Understanding full-text search

1. MyISAM supports full-text search. InnoDB did not support it in the MySQL versions this article covers (InnoDB gained FULLTEXT indexes in MySQL 5.6).

2. With full-text search, MySQL does not need to look at each row separately, and does not need to parse and process each word separately. MySQL builds an index of the words in the specified columns, and searches run against those words. This lets MySQL determine quickly and efficiently which words match, which do not, how often they match, and so on.

II. Using full-text search

1. To use full-text search, the columns being searched must be indexed, and the index must be rebuilt as the data changes. Once the table columns are defined appropriately, MySQL does all of the indexing and re-indexing automatically. After indexing, SELECT can be used with MATCH() and AGAINST() to perform the search.

2. Full-text search is generally enabled when the table is created.

[sql]
CREATE TABLE productnotes
(
  note_id   INT  NOT NULL AUTO_INCREMENT,
  note_text TEXT NULL,
  PRIMARY KEY (note_id),
  FULLTEXT (note_text)
) ENGINE=MyISAM;

Once the index is defined, MySQL maintains it automatically. The index is updated whenever rows are added, updated, or deleted.

3. Do not use FULLTEXT when importing data. Updating the index takes time for every row, so when loading data into a new table, import all of the data first and add the FULLTEXT index afterwards; the index is then built once instead of being updated row by row.

4. Performing a full-text search. MATCH() specifies the columns to search, and AGAINST() specifies the search expression to use.

[sql]
mysql> SELECT * FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('designed');
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       6 | LimsLink is designed to interface output from chromatography data systems (CDSs) to LIMS.                                 |
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
+---------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.03 sec)

5. The value passed to MATCH() must be the same as in the FULLTEXT() definition. If more than one column is specified, all of them must be listed (and in the correct order).

6. Full-text search is not case-sensitive unless binary mode is used. MySQL's documentation describes case sensitivity as a property of the indexed column's collation, so a case-sensitive or binary collation on the column is the dependable way to get a case-sensitive search.

[sql]
mysql> SELECT * FROM productnotes
    -> WHERE BINARY MATCH(note_text) AGAINST('Line');
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
+---------+---------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.05 sec)

7. An important part of full-text search is ranking the results. Rows with a higher rank are returned first.

The rank is calculated by MySQL from the number of words in the row, the number of unique words, the total number of words in the entire index, and the number of rows that contain the word. Rows in which the word appears earlier in the text rank higher than rows in which it appears later.
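To make the rank values described in item 7 visible for every row, one option is to select the MATCH() expression as a column without filtering on it. This is only a minimal sketch, assuming the productnotes table defined above; the alias relevance is an illustrative choice (RANK became a reserved word in MySQL 8.0.2, so an unquoted alias named rank fails on newer servers).

```sql
-- Sketch: list every row together with its relevance for the search 'this line'.
-- "relevance" is a hypothetical alias, not a name used by the original article.
SELECT note_id,
       MATCH(note_text) AGAINST('this line') AS relevance
FROM productnotes
ORDER BY relevance DESC;  -- explicit sort, because MATCH() is not in the WHERE clause
```

Rows that do not match receive a relevance of 0. When MATCH() appears in the WHERE clause of a natural-language search on a single table, MySQL normally returns the matching rows in descending order of relevance, so no explicit ORDER BY is needed there.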
[sql]
mysql> SELECT note_id, MATCH(note_text) AGAINST('this line') AS `rank`, note_text
    -> FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('this line');
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | rank             | note_text                                                                                                                 |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | 0.81339610830754 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | 0.76517958501676 | Specificities include both alpha–beta and beta–beta. This line from chromatography data systems (CDSs) to LIMS.           |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

8. Query expansion

With query expansion, MySQL scans the data and the index twice to complete the search:

- First, it performs a basic full-text search to find all rows that match the search criteria.
- Second, it examines those matching rows and selects all of the useful words.
- Third, it performs the full-text search again, this time using not only the original criteria but also all of the useful words.

Query expansion finds results that may be relevant even if they do not contain the exact words being searched for. The more rows the table has, the better the results returned by query expansion.

Query expansion was introduced in MySQL 4.1.1.
[sql]
mysql> SELECT note_id, MATCH(note_text) AGAINST('this line') AS `rank`, note_text
    -> FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('this line' WITH QUERY EXPANSION);
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | rank             | note_text                                                                                                                 |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | 0.81339610830754 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | 0.76517958501676 | Specificities include both alpha–beta and beta–beta. This line from chromatography data systems (CDSs) to LIMS.           |
|       3 |                0 | Human S-100 monoclonal and polyclonal specificities include both alpha–beta and beta–beta isoforms.                       |
|       6 |                0 | LimsLink is designed to interface output from chromatography data systems (CDSs) to LIMS.                                 |
|       1 |                0 | PepTool allows users to store, manage, analyze, and visualize protein data.                                               |
+---------+------------------+---------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

9. Boolean text search (Boolean mode)

In Boolean mode you can specify:

- words to match;
- words to exclude;
- ranking hints (which words are more important than others);
- expression grouping;
- and more.
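The list above mentions ranking hints and expression grouping. As a hedged sketch of how those might look, the queries below use Boolean-mode operators documented by MySQL (+, ~, >, parentheses, and the trailing * wildcard) against the productnotes table; the search words and the relevance alias are illustrative assumptions, not taken from the original article.

```sql
-- Rows must contain "line" and at least one of "genomics" or "systems".
-- ">" raises a word's contribution to the rank ("<" lowers it);
-- the parentheses group the two alternatives into one required subexpression.
SELECT note_id,
       MATCH(note_text) AGAINST('+line +(>genomics systems)' IN BOOLEAN MODE) AS relevance
FROM productnotes
WHERE MATCH(note_text) AGAINST('+line +(>genomics systems)' IN BOOLEAN MODE)
ORDER BY relevance DESC;  -- Boolean-mode results are not sorted by relevance automatically

-- "~" keeps rows that contain the word but makes it count against their rank,
-- a softer alternative to excluding them outright with "-".
SELECT note_id, note_text
FROM productnotes
WHERE MATCH(note_text) AGAINST('+line ~systems' IN BOOLEAN MODE);

-- A trailing "*" matches any word that begins with the given prefix.
SELECT note_id, note_text
FROM productnotes
WHERE MATCH(note_text) AGAINST('chromato*' IN BOOLEAN MODE);
```

Repeating the MATCH() expression in the select list is what makes the effect of a ranking hint observable, since the WHERE clause alone only decides which rows qualify.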
[sql]
mysql> SELECT note_id, note_text
    -> FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('line' IN BOOLEAN MODE);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | Specificities include both alpha–beta and beta–beta. This line from chromatography data systems (CDSs) to LIMS.           |
+---------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

Boolean text search can be used even without a FULLTEXT index, but it is then very slow.

mysql> SELECT note_id, note_text  /* match line, exclude words beginning with systems */
    -> FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('line -systems*' IN BOOLEAN MODE);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
+---------+---------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT note_id, note_text  /* match line and match systems */
    -> FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('+line +systems' IN BOOLEAN MODE);
+---------+-----------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                       |
+---------+-----------------------------------------------------------------------------------------------------------------+
|       7 | Specificities include both alpha–beta and beta–beta. This line from chromatography data systems (CDSs) to LIMS. |
+---------+-----------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT note_id, note_text  /* match line or match systems */
    -> FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('line systems' IN BOOLEAN MODE);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       6 | LimsLink is designed to interface output from chromatography data systems (CDSs) to LIMS.                                 |
|       7 | Specificities include both alpha–beta and beta–beta. This line from chromatography data systems (CDSs) to LIMS.           |
+---------+---------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> SELECT note_id, note_text  /* match the phrase */
    -> FROM productnotes
    -> WHERE MATCH(note_text) AGAINST('"this line"' IN BOOLEAN MODE);
+---------+---------------------------------------------------------------------------------------------------------------------------+
| note_id | note_text                                                                                                                 |
+---------+---------------------------------------------------------------------------------------------------------------------------+
|       5 | This line of proprietary reagents, containers, and automation tools is designed for genomics and drug discovery research. |
|       7 | Specificities include both alpha–beta and beta–beta. This line from chromatography data systems (CDSs) to LIMS.           |
+---------+---------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

10. Usage notes

- When full-text data is indexed, short words are ignored and excluded from the index. Short words are defined as words of 3 or fewer characters (this number can be changed if needed).
- MySQL comes with a built-in list of stopwords that are always ignored when full-text data is indexed. This list can be overridden if necessary.
- MySQL applies a 50% rule: if a word appears in 50% or more of the rows, it is treated as a stopword and ignored. The 50% rule is not used in BOOLEAN MODE.
- If the table has fewer than 3 rows, full-text search returns no results (because every word either does not appear at all or appears in at least 50% of the rows).
- Apostrophes within words are ignored. For example, don't is indexed as dont.
- Languages that do not have word delimiters do not return full-text search results properly.
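The usage notes say that the minimum word length and the stopword list can be changed. A minimal sketch of how one might inspect those settings and rebuild a MyISAM full-text index after changing them, assuming the productnotes table from this article and that the settings themselves are edited in the server configuration followed by a restart:

```sql
-- Show the full-text settings of the running server, including
-- ft_min_word_len (minimum indexed word length) and ft_stopword_file.
SHOW VARIABLES LIKE 'ft%';

-- After ft_min_word_len or ft_stopword_file has been changed and the server
-- restarted, existing MyISAM FULLTEXT indexes still reflect the old settings.
-- Rebuilding them with a quick repair makes the new settings take effect.
REPAIR TABLE productnotes QUICK;
```

These variables apply to MyISAM full-text indexes, which is the engine used throughout this article.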