Sharing _mysql in the Mysql field using the comma delimiter method

Source: Internet
Author: User
Tags mysql manual

The segmented fields must be finite and small in number, and we cannot store infinitely many characters in a string
The table that this field belongs to has a one-to-many relationship with the table associated with this field.
For example, the following table structure represents the content and tag these two objects

Copy Code code as follows:

Mysql> SELECT * from content;
+----+------+| ID | tags | +----+------+| 1 | 1,2 | | 2 | 2,3 | +----+------+
2 rows in Set (0.01 sec)
Mysql> SELECT * from tag;
+----+-------+| ID | name | +----+-------+| 1 | php | | 2 | MySQL | | 3 | java | +----+-------+
3 Rows in Set (0.00 sec)

These principles, I believe that we have been very familiar with the development process. But when you use this method to deal with the actual problem, there must be a little bit disturbed in the heart, because this method is more or less a bit like wild path. In the thick database textbook, there is no mention of this design method, the standard approach seems to be to use a relational mapping table between the two tables, although this will use inefficient connection queries.

Every developer has struggled with standards and efficiency, but I think our efforts will make the use of this approach seem more standard. Note that the methods discussed below are limited to MySQL, but other databases should be portable.

Relevance retrieval
Many developers are still using the old like method to implement relevance retrieval, such as the above database structure, the content table two records have 2 this tag, then how can I take out the record 1 o'clock, and its tag related records also show it. In fact, this is a CMS need to face a basic problem, that is, the relevant content of the query.

If you are a rookie, you may only think of like methods, such as the first record 1 out, and then the tags field by commas, and finally do a loop with like to retrieve all tags in the content table contains 2 of records, similar to
Copy Code code as follows:

SELECT * from content WHERE tag like '%2% ' and ID <> 1

But this method is too slow, the number of queries do not say, like query is a relatively slow method. And you have to deal with the comma before and after the problem, in short the trouble is a lot of.

So let's take a quiet look at the MySQL manual and see if there's any surprises. This time, a function called Find_in_set, will flash the golden light into your eyes. Let's take a look at the definition of this function

Copy Code code as follows:

Find_in_set (Str,strlist)
Returns a value in the range of 1 to n if the string str was in the string list strlist consisting of n substrings. A string list is a string composed of substrings separated by "," characters. If The argument is a constant string and the second is a column of type SET, the Find_in_set () function is optimized To the use bit arithmetic. Returns 0 If Str is isn't in strlist or if Strlist is the empty string. Returns null if either argument is null. This function is does not work properly if the argument contains a comma (",") character.

Oh, perfect!. The simple thing is to find out whether a string exists in another comma-delimited string, which is simply tailor-made for us. Then our SQL becomes
Copy Code code as follows:

SELECT * from content WHERE find_in_set (' 2 ', tags) and ID <> 1

In the process of flipping through these functions, you should have a deep sense of the MySQL designer's affirmation of the comma-separated storage field method, because there are many ways to design this problem.

It looks so much better, everything seems perfect, isn't it? In fact, if you have more tags, you need to create more than one SQL statement, and some records related to the tag more, some relatively few, how can be ranked according to the relevance.

At this point, you can focus on MySQL's Full-text search capabilities. This word you must have seen countless times, but the use of certainly very little, let us look directly at the statement
Copy Code code as follows:

SELECT * from content WHERE MATCH (tags) against (' 1,2 ') and ID <> 1

The advantage of this statement is obvious, you do not need to do the tags field to split again. So what is the principle of this query, a little understanding of the use of the match against know that the default delimiter for Full-text retrieval is punctuation and stopwords, which is what we need. Full-Text search divides the strings in match and against by commas and then matches them.

It should be noted that the above SQL is just an example, and if you do this directly, you can't get any results. Reasons in the following

    1. You need to set the Tags field fulltext index (if only the test, you can not do, build index just improve performance, no effect on the results)
    2. Each punctuation word length must be more than 3 characters , this is the key, our tag ID is too short, will be automatically ignored, this time you can consider to get the ID from a larger value from the start, such as 1000, so it is long enough.
    3. You hit a stopwords, such as your tags field is such ' hello,nobody ', nobody is a default stop words MySQL, it will be automatically ignored. Stop words are some meaningless words in English, they are not needed when searching, similar to the auxiliary word in Chinese and so on. But it's obviously not used in our search, so you can disable it in the my.cnf file, plus ft_stopword_file= ' .

With the development of web technology, the related search to go SQL less and more, often only need to use a search engine on it. But the purpose of this article is not only to discuss this method, but to embody the process of realizing this result.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.