Using comma-delimited values in a MySQL field

Source: Internet
Author: User
Tags: mysql, manual

Most developers have at some point stored a comma-separated string in a MySQL field. Whether the separated values are IDs or tags, such a field should share a few characteristics.

    • The number of separated values must be finite and small, because a string cannot hold an unbounded number of characters.
    • The table that owns this field must be in a one-to-many relationship with the values stored in the field.

For example, the following table structure represents two kinds of objects, content and tags:

mysql> SELECT * FROM content;
+----+------+
| id | tags |
+----+------+
|  1 | 1,2  |
|  2 | 2,3  |
+----+------+
2 rows in set (0.01 sec)

mysql> SELECT * FROM tag;
+----+-------+
| id | name  |
+----+-------+
|  1 | php   |
|  2 | mysql |
|  3 | java  |
+----+-------+
3 rows in set (0.00 sec)
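To follow along, the two sample tables can be recreated with something like the statements below. This is only a sketch: the article does not show the table definitions, so the column types and lengths are assumptions.

```sql
-- Column types and sizes are assumed; the article only shows the query output.
CREATE TABLE tag (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

CREATE TABLE content (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    tags VARCHAR(255) NOT NULL DEFAULT ''   -- comma-separated tag IDs
);

-- Sample rows matching the output shown above.
INSERT INTO tag (id, name) VALUES (1, 'php'), (2, 'mysql'), (3, 'java');
INSERT INTO content (id, tags) VALUES (1, '1,2'), (2, '2,3');
```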

You are probably already familiar with these principles from your own development work. Still, when you apply this method to a real problem you may feel a little uneasy, because it looks more or less like an unorthodox shortcut. The thick database textbooks never mention this design. The standard approach seems to be a relational mapping table placed between the two tables, even though that requires less efficient join queries.
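For comparison, the textbook design mentioned above might look roughly like this. The table name content_tag and its columns are hypothetical; they do not appear in the article.

```sql
-- Hypothetical mapping table: one row per (content, tag) pair.
CREATE TABLE content_tag (
    content_id INT NOT NULL,
    tag_id     INT NOT NULL,
    PRIMARY KEY (content_id, tag_id)
);

-- The same associations that the tags column stores as '1,2' and '2,3'.
INSERT INTO content_tag (content_id, tag_id)
VALUES (1, 1), (1, 2), (2, 2), (2, 3);

-- Tag names of content record 1, resolved through a join.
SELECT t.name
FROM content_tag AS ct
JOIN tag AS t ON t.id = ct.tag_id
WHERE ct.content_id = 1;
```

The join is the cost the article refers to; the comma-separated column avoids it by keeping the tag IDs inside the content row.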

Every developer worries about standards and efficiency, but I think a little effort can make this approach look more standard. Note that the discussion below applies only to MySQL, although the ideas should carry over to other databases.

Relevance Search

Many developers still use the old LIKE method to implement related-content retrieval. In the database structure above, both records in the content table carry tag 2. So when record 1 is fetched, how do we also show the other records that share its tags? This is a basic problem that any CMS has to face: querying related content.

If you are a beginner, LIKE may be the only method that comes to mind: first fetch record 1, then split its tags field on commas, and finally loop over the tags, using LIKE to retrieve every record in the content table whose tags field contains 2, similar to

SELECT * FROM content WHERE tags LIKE '%2%' AND id <> 1

But this method is too slow. Leaving aside the number of queries, LIKE with a leading wildcard is a relatively slow kind of query. You also have to deal with the commas yourself, since '%2%' matches 12 and 20 as well as 2. In short, it is a lot of trouble.
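The comma problem can be seen with a few literal strings. The sketch below also shows one common workaround, wrapping the list in commas before matching; the article does not use this workaround, it is included only to illustrate the issue.

```sql
-- A bare '%2%' matches any list in which the digit 2 appears anywhere.
SELECT '12,20' LIKE '%2%';                       -- 1 (a false positive)

-- Wrapping the list in commas restricts the match to whole items.
SELECT CONCAT(',', '12,20', ',') LIKE '%,2,%';   -- 0
SELECT CONCAT(',', '1,2', ',') LIKE '%,2,%';     -- 1

-- Applied to the content table from the example.
SELECT * FROM content
WHERE CONCAT(',', tags, ',') LIKE '%,2,%' AND id <> 1;
```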

So let's turn to the MySQL manual and see whether it holds any surprises. At this point, a function called FIND_IN_SET catches the eye like a flash of gold. Let's take a look at its definition

FIND_IN_SET(str,strlist) Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by "," characters. If the first argument is a constant string and the second is a column of type SET, the FIND_IN_SET() function is optimized to use bit arithmetic. Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL. This function does not work properly if the first argument contains a comma (",") character.
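A few literal calls illustrate the definition. The first is the example from the MySQL manual; the others are added here for illustration.

```sql
SELECT FIND_IN_SET('b', 'a,b,c,d');   -- 2 (position of the matching item)
SELECT FIND_IN_SET('2', '12,20,2');   -- 3 (only whole items match, not 12 or 20)
SELECT FIND_IN_SET('2', '12,20');     -- 0 (not found)
SELECT FIND_IN_SET('2', '');          -- 0 (empty list)
SELECT FIND_IN_SET('2', NULL);        -- NULL
```

Because any position from 1 to N counts as true in a WHERE clause and 0 counts as false, the function can be used directly as a condition.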

Perfect! It checks whether a string exists in another comma-separated string, which is tailor-made for us. Our SQL then becomes

SELECT * FROM content WHERE FIND_IN_SET('2', tags) AND id <> 1

While browsing through these functions, you may have come to feel that the MySQL designers endorse storing comma-separated values in a field, because MySQL offers several functions for handling them.

It looks much better and everything seems perfect, doesn't it? In fact, if a record has several tags, you need to issue several SQL statements. And since some records share more tags and others fewer, how can the results be ranked by relevance?

At this point, you can turn to MySQL's full-text search feature. You have surely seen the term countless times but have probably rarely used it. Let's look directly at the statement

SELECT * FROM content WHERE MATCH(tags) AGAINST('1,2') AND id <> 1

The advantage of this statement is obvious: you do not need to split the tags field yourself. So how does the query work? Anyone with a little experience of MATCH ... AGAINST knows that full-text indexing splits text on punctuation by default and ignores stopwords, and the former is the feature we need. The full-text index splits the strings in MATCH and AGAINST on commas and then matches the resulting pieces.
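MATCH ... AGAINST also returns a relevance value when it appears in the select list, which is one way to approach the ranking question raised earlier. The sketch below assumes a FULLTEXT index already exists on content.tags and uses hypothetical four-digit tag IDs (1000 and 1001), for the reasons given in the list that follows.

```sql
-- Assumes a FULLTEXT index on content.tags; the tag IDs 1000 and 1001 are hypothetical.
SELECT id, tags,
       MATCH(tags) AGAINST('1000,1001') AS score   -- relevance value, 0 when nothing matches
FROM content
WHERE MATCH(tags) AGAINST('1000,1001')
  AND id <> 1
ORDER BY score DESC;
```

On a very small MyISAM table this may still return nothing, because natural-language mode ignores words that occur in half or more of the rows.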

Note that the SQL above is only an example. If you run it as is, you will not get any results, for the following reasons

    1. You need to build a FULLTEXT index on the tags field. (Only boolean-mode searches on MyISAM tables can run without one, and then slowly; the natural-language query above requires the index.)
    2. Each word between delimiters must be longer than 3 characters by default (the MyISAM setting ft_min_word_len defaults to 4). This is the key point: our tag IDs are too short and are silently ignored. One option is to start the auto-increment ID from a relatively large value, such as 1000, so that every ID is long enough.
    3. You may hit a stopword. For example, if your tags field contains 'hello,nobody', nobody is one of MySQL's default stopwords and is ignored automatically. Stopwords are common English words that carry little meaning and are not needed when searching, much like auxiliary particles in Chinese. Since we are clearly not using the index for text search here, you can disable the list by adding ft_stopword_file='' to the my.cnf file.
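The three points above can be sketched as the following setup steps. The index name ft_tags is hypothetical, and the statements assume the tables are MyISAM (or InnoDB on MySQL 5.6 or later, where the equivalent settings have different names).

```sql
-- 1. Build the full-text index (the index name ft_tags is an assumption).
ALTER TABLE content ADD FULLTEXT INDEX ft_tags (tags);

-- 2. Make newly assigned tag IDs at least four characters long.
--    Existing short IDs such as 1, 2, 3 are not changed by this.
ALTER TABLE tag AUTO_INCREMENT = 1000;

-- 3. Inspect the current MyISAM full-text settings.
SHOW VARIABLES LIKE 'ft_min_word_len';
SHOW VARIABLES LIKE 'ft_stopword_file';

-- ft_stopword_file can only be set at server startup, for example
-- under [mysqld] in my.cnf. After changing it and restarting,
-- rebuild existing MyISAM full-text indexes:
REPAIR TABLE content QUICK;
```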

With the development of web technology, related-content search through SQL is used less and less, and often a search engine is all you need. But the purpose of this article is not only to discuss this method, but also to show the process of arriving at the result.