MySQL full-text index mysqlcft plug-in (1/5)

Source: Internet
Author: User

This article describes how to enable Chinese full-text indexing in MySQL with the mysqlcft plug-in. MySQL does not currently provide satisfactory full-text support for Chinese, so the sections below show how the mysqlcft plug-in adds Chinese full-text indexes to your MySQL server.

I. Features of the MySQL full-text index plug-in mysqlcft

1. Advantages:
① High accuracy: the "three-byte crossover splitting algorithm" splits Chinese text without a Chinese word-segmentation dictionary. Search accuracy is far higher than with Chinese word-segmentation algorithms and is comparable to that of LIKE '%...%';
② Fast queries: searches run 3 to 50 times faster than LIKE '%...%'. Test results are given at the end of the article;
③ Standard plug-in: developed as a standard MySQL 5.1 full-text parser plug-in, it requires no changes to the MySQL source code and does not affect other MySQL features, so it can quickly keep pace with new MySQL versions;
④ Multiple versions supported: all MySQL 5.1 Release Candidate versions are supported, from MySQL 5.1.22 RC through the latest MySQL 5.1.25 RC;
⑤ Multiple character sets supported: MySQL character sets including GBK, GB2312, UTF-8, Latin1, and BIG5 are supported (other character sets have not been tested);
⑥ Good system compatibility: i386 and x86_64 builds are available, supporting 32-bit (i386) and 64-bit (x86_64) CPUs and Linux systems;
⑦ Well suited to distributed deployments: it fits the MySQL Slave distributed architecture well, with no dictionary maintenance cost and no dictionary synchronization problem.

2. Disadvantages:
① mysqlcft full-text indexes apply only to MyISAM tables, because MySQL 5.1 supports FULLTEXT indexes only on MyISAM tables;
② MySQL must not be compiled and installed statically; otherwise the mysqlcft plug-in cannot be installed;
③ An index file built with the "three-byte crossover splitting algorithm" is somewhat larger than one built with a "Chinese word segmentation algorithm" (for example, ft-hightman). In my tests, the .MYI index file of a mysqlcft full-text index is 2 to 5 times the size of the .MYD data file.


II. Core idea of mysqlcft: the "three-byte crossover splitting algorithm"

  

Note: For illustration, each of the digits 0 to 7 in this article stands for one byte position holding an English letter, a digit, or half of a Chinese character (that is, assuming a double-byte encoding such as GBK).
1. Split the Chinese text into overlapping three-byte pieces to build the full-text index:
For example, the eight-byte phrase "全文索引" ("full-text index") or "1台X光机" ("1 X-ray machine") is split into six pieces, on which an inverted index is built:
012 123 234 345 456 567
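
The split above can be reproduced with a short Python sketch. It is only an illustration of the idea, not the plug-in's actual code: the function names are hypothetical, and GBK is assumed so that each Chinese character occupies two bytes.

```python
def split_three_bytes(text: str, encoding: str = "gbk") -> list[bytes]:
    """Return every overlapping 3-byte window of the encoded text.

    Assumption: a double-byte encoding (GBK by default), so that one
    Chinese character contributes two byte positions.
    """
    data = text.encode(encoding)
    return [data[i:i + 3] for i in range(len(data) - 2)]


def as_positions(text: str, encoding: str = "gbk") -> list[str]:
    """Label each window by its byte offsets, in the article's 012 123 ... notation."""
    n = len(text.encode(encoding))
    return ["".join(str(j) for j in range(i, i + 3)) for i in range(n - 2)]


# "全文索引" is 8 bytes in GBK, so it yields 6 overlapping windows.
print(" ".join(as_positions("全文索引")))  # 012 123 234 345 456 567
```

An input of n bytes yields n - 2 windows, which is consistent with the index being larger than a dictionary-based one.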

2. Split the search keywords into overlapping three-byte pieces and look them up in the full-text index:
Example 1: The search keyword "文索" (the middle two characters of "全文索引", "full-text index") occupies positions 2 to 5, so it is split into:
234 345
These pieces are then matched against the full-text index.

Example 2: The search keyword "X光机" ("X-ray machine") occupies positions 3 to 7, so it is split into:
345 456 567
These pieces are matched against the full-text index in the same way.

Example 3: The search keywords "1台 光机" (two keywords separated by a space) occupy positions 0 to 2 and 4 to 7, so they are split into:
012 456 567
In this way, a multi-keyword search is also matched against the full-text index.
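
The three lookups can be sketched end to end in Python. This is a toy model under stated assumptions, not mysqlcft's implementation: the helper names are hypothetical, GBK encoding is assumed, keywords are separated by whitespace, and a document matches only if it contains every three-byte piece of every keyword (how the plug-in actually combines or ranks matches is not described here).

```python
from collections import defaultdict


def windows(text: str, encoding: str = "gbk") -> list[bytes]:
    """All overlapping 3-byte windows of the encoded text (GBK assumed)."""
    data = text.encode(encoding)
    return [data[i:i + 3] for i in range(len(data) - 2)]


def build_index(docs: dict[int, str]) -> dict[bytes, set[int]]:
    """Inverted index: 3-byte window -> ids of the documents containing it."""
    index: dict[bytes, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for w in windows(text):
            index[w].add(doc_id)
    return index


def search(index: dict[bytes, set[int]], query: str) -> set[int]:
    """Ids of documents containing every window of every keyword.

    Assumption: AND semantics across all windows; a query that yields
    no windows (all keywords shorter than 3 bytes) matches nothing.
    """
    wanted = [w for keyword in query.split() for w in windows(keyword)]
    if not wanted:
        return set()
    result = set(index.get(wanted[0], set()))
    for w in wanted[1:]:
        result &= index.get(w, set())
    return result


docs = {1: "全文索引", 2: "1台X光机"}
index = build_index(docs)
print(search(index, "文索"))      # {1}
print(search(index, "X光机"))     # {2}
print(search(index, "1台 光机"))  # {2}
```

Because this sketch only checks that each piece occurs somewhere in a document, without checking adjacency, it could in principle return a document that contains all the pieces but not the exact phrase.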
