Implementing fuzzy queries in MySQL (REGEXP, LIKE): two approaches

Source: Internet
Author: User
Tags: mysql

The first is LIKE / NOT LIKE.
The second is REGEXP / NOT REGEXP (or RLIKE / NOT RLIKE, which are synonyms).

First: standard SQL pattern matching.

It has two wildcard characters, "_" and "%". "_" matches any single character, and "%" matches any number of characters (including zero).
For example:

SELECT * FROM table_name WHERE column_name LIKE 'm%';  # all records where the column starts with m or M
SELECT * FROM table_name WHERE column_name LIKE '%m%'; # all records where the column contains m or M
SELECT * FROM table_name WHERE column_name LIKE '%m';  # all records where the column ends with m or M
SELECT * FROM table_name WHERE column_name LIKE '_m_'; # all records where the column is 3 characters long with m or M in the middle
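The four wildcard patterns can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in: SQLite's LIKE uses the same "_" and "%" wildcards and, like MySQL under its default case-insensitive collations, ignores ASCII letter case. The table name demo and its sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO demo (name) VALUES (?)",
    [("mango",), ("Mint",), ("lemon",), ("drum",), ("amp",), ("plum",)],
)


def names_like(pattern):
    """Return the names matching a LIKE pattern, sorted for stable output."""
    rows = conn.execute("SELECT name FROM demo WHERE name LIKE ?", (pattern,))
    return sorted(row[0] for row in rows)


starts_with_m = names_like("m%")   # begins with m or M
contains_m = names_like("%m%")     # m or M anywhere
ends_with_m = names_like("%m")     # ends with m or M
m_in_middle = names_like("_m_")    # exactly 3 characters, m in the middle

print(starts_with_m, contains_m, ends_with_m, m_in_middle, sep="\n")
```

On a real MySQL server the outcome for "m or M" depends on the column's collation, so a binary or case-sensitive collation would match only the lowercase m.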

What if we want to search for a string that itself contains a wildcard character, for example 50% or _get?
The answer is to escape it. You can use the default escape character, the backslash, or define your own escape character with the ESCAPE clause. In both cases the escape character affects only the character that immediately follows it.

For example:

SELECT * FROM table_name WHERE column_name LIKE '%50\%%'; /* the second % is escaped; finds all records where the column contains 50% */
SELECT * FROM table_name WHERE column_name LIKE '%50/%%' ESCAPE '/'; # the second % is escaped
SELECT * FROM table_name WHERE column_name LIKE '%/_get%' ESCAPE '/'; /* the "_" is escaped; finds all records where the column contains _get */
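To see what the escape character changes, here is a small sketch, again using Python's sqlite3 module as a stand-in for MySQL. SQLite supports the same ESCAPE clause but, unlike MySQL, has no default escape character, so only the ESCAPE '/' form is shown. The table demo and its rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (note TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO demo (note) VALUES (?)",
    [("50% off",), ("5000 units",), ("call _get first",), ("budget",)],
)


def notes_like(pattern, use_escape=False):
    """Run a LIKE query, optionally declaring '/' as the escape character."""
    sql = "SELECT note FROM demo WHERE note LIKE ?"
    if use_escape:
        sql += " ESCAPE '/'"
    return sorted(row[0] for row in conn.execute(sql, (pattern,)))


# Without escaping, every % and _ is still a wildcard.
percent_unescaped = notes_like("%50%%")
underscore_unescaped = notes_like("%_get%")

# With escaping, the character after '/' is matched literally.
percent_escaped = notes_like("%50/%%", use_escape=True)
underscore_escaped = notes_like("%/_get%", use_escape=True)

print(percent_unescaped, percent_escaped, sep="\n")
print(underscore_unescaped, underscore_escaped, sep="\n")
```

The unescaped patterns also pick up "5000 units" and "budget", which is exactly the false match that escaping is meant to prevent.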


Second: pattern matching with extended regular expressions.

Here is what some of the extended regular expression characters mean:
".": matches any single character.
"?": matches the preceding subexpression 0 or 1 times.
"+": matches the preceding subexpression 1 or more times.
"*": matches the preceding subexpression 0 or more times. x* matches 0 or more x characters; [0-9]* matches any number of digits.
"^": matches the start of the string.
"$": matches the end of the string.
"[]": a character set. [hi] matches h or i; [a-d] matches any one of a, b, c, or d.
"{}": a repetition count. 8{5} matches exactly five 8s, that is, 88888; [0-9]{5,11} matches 5 to 11 digits.
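The metacharacters listed above behave the same way in most regular expression engines, so they can be explored quickly with Python's re module. This is only a sketch: Python's engine is not the one MySQL uses, and the helper name regexp is made up here. It uses re.search because REGEXP, unlike LIKE, succeeds when the pattern matches anywhere in the value.

```python
import re


def regexp(pattern, value):
    """Rough stand-in for MySQL's `value REGEXP pattern` (match anywhere)."""
    return re.search(pattern, value) is not None


checks = {
    "dot": regexp("^a.c$", "abc"),                  # . stands for the b
    "optional": regexp("^colou?r$", "color"),       # u occurs 0 times
    "plus": regexp("^go+d$", "gd"),                 # + needs at least one o
    "star": regexp("^[0-9]*$", ""),                 # * accepts zero digits
    "set": regexp("^[hi]$", "i"),                   # i is in the set
    "range": regexp("^[a-d]$", "e"),                # e is outside a-d
    "count": regexp("^8{5}$", "88888"),             # exactly five 8s
    "count_range": regexp("^[0-9]{5,11}$", "1234"), # only 4 digits
}

for name, result in checks.items():
    print(name, result)
```

Each pattern is anchored with ^ and $ so that the result depends on the whole value rather than on a matching substring.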

Let's look at an example:

SELECT * FROM table_name WHERE column_name REGEXP '^50%{1,3}';

/* Finds all records where the column starts with 50%, 50%%, or 50%%% */
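One detail of this pattern is easy to miss: it is anchored at the start but not at the end, so anything may follow the matched prefix. The sketch below checks this with Python's re module, which is used only for illustration and is not MySQL's regex engine. The sample values are invented.

```python
import re

values = ["50%", "50%%", "50%%% sale", "50", "150%", "50%%%%"]

# Same pattern as in the query: anchored at the start only.
start_only = re.compile(r"^50%{1,3}")
matched = [v for v in values if start_only.search(v)]

# Adding $ requires the value to end right after one to three % signs.
whole_value = re.compile(r"^50%{1,3}$")
matched_exact = [v for v in values if whole_value.search(v)]

print(matched)
print(matched_exact)
```

Because the first pattern has no end anchor, a value such as 50%%%% still matches: its first three characters already satisfy the pattern. Add $ when the whole value has to fit.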

Third: for more advanced needs, use FULLTEXT full-text search.

We can walk through full-text search step by step with examples.

First, we create a table and load some initial data:

CREATE TABLE IF NOT EXISTS `category` (
  `id` int NOT NULL AUTO_INCREMENT,  -- display width was garbled in the source and is omitted here
  `fid` int NOT NULL,                -- display width was garbled in the source and is omitted here
  `catname` char(255) NOT NULL,
  `addtime` char(10) NOT NULL,       -- length was garbled in the source; 10 is an assumption that fits the sample values
  PRIMARY KEY (`id`),
  FULLTEXT KEY `catname` (`catname`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=5;


INSERT INTO `category` (`id`, `fid`, `catname`, `addtime`) VALUES
(1, 0, 'Welcome to you!', '1263363380'),
(2, 0, 'Hello phpjs,you are welcome', '1263363416'),
(3, 0, 'This are the fan site of you', '1263363673');



Before the specific examples, let's look at the syntax of MySQL full-text search. The MATCH() function performs a natural language search for a string against a text collection (a set of one or more columns included in a FULLTEXT index). The search string is given as the argument to AGAINST(). The search ignores letter case. Put simply, MATCH names the columns to search (they must have a FULLTEXT index) and AGAINST gives the string to search for; MySQL automatically splits the text into words at spaces and punctuation.


1. Search for a keyword that appears in one row:

SELECT * FROM `category` WHERE MATCH (catname) AGAINST ('phpjs');

Result:

id  fid  catname                      addtime
2   0    Hello phpjs,you are welcome  1263363416

The row that contains the keyword phpjs is matched.


2. Search for a word that appears only in the third row:

SELECT * FROM `category` WHERE MATCH (catname) AGAINST ('this');



By the same reasoning, the third row contains "this", so it should be matched. Strangely, the result is empty. Why?

MySQL enforces a minimum word length for full-text search. The default is 4, and words shorter than that are ignored. You can check the current value with SHOW VARIABLES LIKE 'ft_min_word_len', and you can change it in the MySQL configuration file my.ini by adding a line such as ft_min_word_len = 2. Restart MySQL after the change and rebuild the existing FULLTEXT indexes (for example with REPAIR TABLE category QUICK) so that the new value takes effect. Note that "this" is exactly 4 characters long, so the length limit alone does not explain the empty result here: MySQL also skips words on its built-in stopword list, and the default MyISAM list includes common English words such as "this".
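The effect of these filters can be pictured with a toy model in Python. This is a deliberately simplified sketch, not MySQL's actual tokenizer: the function name searchable_words is hypothetical, the word-splitting rule is an approximation, and SAMPLE_STOPWORDS is a tiny hand-picked subset of words that I believe are on the default MyISAM stopword list (check the list for your own server version).

```python
import re

# Tiny illustrative subset; the real built-in stopword list is much longer.
SAMPLE_STOPWORDS = {"this", "you", "are", "the", "of", "to"}


def searchable_words(text, min_len=4, stopwords=SAMPLE_STOPWORDS):
    """Toy model: keep words that are long enough and are not stopwords."""
    words = re.findall(r"[a-z0-9_]+", text.lower())
    return [w for w in words if len(w) >= min_len and w not in stopwords]


row3 = "This are the fan site of you"

default_words = searchable_words(row3)           # models ft_min_word_len = 4
short_words = searchable_words(row3, min_len=2)  # models ft_min_word_len = 2

print(default_words)
print(short_words)
```

In this model "this" is dropped in both cases, because lowering the minimum length lets "fan" through but does nothing about stopwords.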


3. Suppose we set the minimum word length to 2. All three records contain "you", so we might expect that matching "you" returns every row:

SELECT * FROM `category` WHERE MATCH (catname) AGAINST ('you');


The result is still empty, which is a surprise. Why?

For every word in the collection and in the query, MySQL first computes a weight. A word that appears in many documents gets a lower weight (possibly even zero), because it carries less semantic value in that particular collection. Conversely, a rarer word gets a higher weight. In a natural language search, MySQL's default threshold is 50%: "you" appears in every document here, that is 100%, and only words that occur in fewer than 50% of the rows show up in the result set.
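The 50% rule can be checked by hand against the three sample rows. The Python sketch below is a simplified model of that rule only; it is not how MySQL computes weights internally, the word-splitting rule is an approximation, and the function name too_common is made up for the example.

```python
import re

# The catname values of the three sample rows.
rows = [
    "Welcome to you!",
    "Hello phpjs,you are welcome",
    "This are the fan site of you",
]


def too_common(rows, threshold=0.5):
    """Return the words that occur in at least `threshold` of the rows."""
    counts = {}
    for text in rows:
        # set() so that a word is counted once per row, not once per use
        for word in set(re.findall(r"[a-z0-9_]+", text.lower())):
            counts[word] = counts.get(word, 0) + 1
    return sorted(w for w, n in counts.items() if n / len(rows) >= threshold)


common = too_common(rows)
print(common)
```

With only three rows, any word that occurs in two of them is already above the threshold, which is why small test tables so often return empty full-text results.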

4. Some readers will say: I don't care about the weight, I just want every matching row returned. How can that be done?

Since MySQL 4.0.1, you can use the IN BOOLEAN MODE modifier to run a boolean full-text search, which does not apply the 50% threshold:

SELECT * FROM `category` WHERE MATCH (catname) AGAINST ('you' IN BOOLEAN MODE);


Summary:

1. Pay attention to the minimum word length.

2. Pay attention to keyword weight.
