The Nineth chapter searches with regular expressions
This chapter will learn how to use regular expressions within the MySQL WHERE clause to better control data filtering.
9.1 Introduction to Regular expressions
The filtering examples in the first two chapters allow for data to be found using matching, comparison, and wildcard operators. For basic filtering (or even some less basic filtering), this is enough. However, as the complexity of filtering conditions increases, the complexity of the WHERE clause itself is also necessary to increase. This is where the regular expression becomes useful. Regular expressions are special strings (character sets) that are used to match text. If you want to extract a phone number from a text file, you can use a regular expression. If you need to find all the files with numbers in the middle of the name, you can use a regular expression. If you want to find all the duplicated words in a block of text, you can use a regular expression. If you want to replace all URLs in a page with the actual HTML links for those URLs, you can also use a regular expression (for the last example, or two regular expressions). Regular expressions are supported for all kinds of programming languages, text editors, operating systems, and so on. Knowledgeable programmers and network administrators have been focusing on regular expressions that are important content of their technical tools for a long time. Regular expressions are established using regular expression language, which is a special language used to accomplish all the work that has just been discussed and more work. As with any language, regular expressions have special syntax and instructions that you must learn.
Learn more content full coverage of regular expressions is beyond the scope of this book. Although the basics are described here, a more thorough introduction to regular expressions may also need to refer to the author's "Regular Expressions must know"
9.2 using the MySQL regular expression
So, what does regular expression have to do with MySQL? As has been said, the function of a regular expression is to match text, comparing a pattern (regular expression) with a text string. MySQL provides preliminary support for regular expressions with a WHERE clause, allowing you to specify a regular expression to filter the data retrieved by select.
Only a subset of the regular expression language if you are familiar with regular expressions, be aware that MySQL supports only a small subset of the implementations of most regular expressions. This chapter describes most of the content that MySQL supports. Let's give a few examples to more clearly describe the concept of regular expressions.
9.2.1 Basic character Matching
Let's start with a very simple example. The following statement retrieves a column prod_name all lines that contain text 1000:
In addition to the keyword like being replaced by the regexp, this statement looks very similar to the one using the likes statement (8th chapter). It tells Mysql:regexp what to do with a regular expression (a regular expression that matches the text body 1000).
Why bother using regular expressions? In the previous example, the regular expression did not bring much benefit (and might degrade performance), however, consider the following example:
The regular expression is used here. 000: A special character in the regular expression language. It means matching any one character, so 1000 and 2000 match and return. Of course, this particular example can also be done with like and wildcard characters (see Chapter 8th).
There is an important difference between like and regexp in the similar and regexp. Take a look at the following two statements:
If you execute the preceding two statements, you will find that the first statement does not return data, and the second statement returns one row. Why? As described in chapter 8th, like matches the entire column. If the matched text appears in the column values, like will not find it, and the corresponding line will not be returned (unless wildcard characters are used). The regexp matches within the column values, and if the matched text appears in the column values, RegExp will find it and the corresponding row is returned. This is a very important difference. So, can regexp be used to match the entire column value (and thus the same effect as like)? The answer is yes, using the ^ and $ locator (anchor), described later in this chapter.
Matching case-insensitive regular expression matching in MySQL is case insensitive (that is, uppercase and lowercase matches). For case sensitivity, you can use the binary keyword, such as where Prod_name REGEXP BINARY ' JetPack. 000 '.
9.2.2 for or matching
To search for one of two strings (either for this string, or for another string), use | as follows:
The regular expression 1000|2000 is used in the statement. | is the OR operator of the regular expression. It represents one of the matches, so both 1000 and 2000 match and return. Use | Functionally similar to using an OR statement in a SELECT statement, multiple or conditions can be incorporated into a single regular expression. More than two or conditions can give more than two or conditions. For example, ' 1000 | 2000 | 3000 ' will match 1000 or 2000 or 3000.
9.2.3 matches one of several characters
matches any single character. But what if you just want to match a specific character? You can do this by specifying a set of characters that are enclosed in [and], as follows:
Here, the regular expression [123] Ton is used. [123] Defines a set of characters, which means matching 1 or 2 or 3, so that 1 ton and 2 ton all match and return (No 3 ton). As you can see, [] is another form of an or statement. In fact, the regular expression [123]ton is the abbreviation for [1|2|3]ton], and the latter can be used. However, you need to use [] to define what the or statement looks for. To better understand this, take a look at the following example:
This is not the desired output. The two required rows were retrieved, but another 3 rows were retrieved. This is because MySQL assumes that you mean ' 1 ' or ' 2 ' or ' 3 ton '. Unless you enclose the character | In a collection, it is applied to the entire string. Character sets can also be negated, that is, they will match anything except the specified character. To negate a character set, place a ^ at the beginning of the collection. So, although [123] matches the character 1, 2, or 3, but 123 matches anything except these characters.
9.2.4 Matching Range
The collection can be used to define one or more characters to match. For example, the following collection will match the number 0 to 9:[0123456789]
To simplify this type of collection, you can use-to define a range. The following function is equivalent to the above list of numbers: [0-9]
The range is not limited to the complete set, [1-3] and [6-9] are also valid ranges. In addition, the range is not necessarily numeric, and [A-z] matches any alphabetic character. To give an example:
This uses the regular expression [1-5] Ton. [1-5] defines a range, the expression meaning matches 1 to 5, so returns 3 matching rows. Because 5 ton matches, so return. 5 ton.
9.2.5 Match Special characters
The regular expression language consists of special characters with specific meanings. We have seen., [], | and-and so on, there are some other characters. Excuse me, if you need to match these characters, what should you do? For example, if you want to find the value that contains the. character, how do you search? Take a look at the following example:
This is not the desired output, which matches any character, so each row is retrieved. In order to match Special characters, you must use \ \ as the leading. \\-means find-, \ \. Indicates lookup:
This is the desired output. \ \. Matches, so only one row is retrieved. This processing is called escaping (escaping), and all characters with special meaning within a regular expression must be escaped in this way. This includes., |, [] and other special characters that have been used so far. \ \ is also used to refer to metacharacters (characters with special meanings), as listed in table 9-1.
Match \ In order to match the backslash (\ \) character itself, you need to use \\\.
\ or \ \? Most regular expression implementations use a single backslash to escape special characters so that the characters themselves can be used. But MySQL requires two backslashes (MySQL itself interprets one, and the regular expression library interprets the other).
9.2.6 matching character class
There are matches that identify the numbers you use frequently, all alphabetic characters, or all alphanumeric characters. For easier work, a predefined character set, called the character class (character class), can be used. Table 9-2 lists the character classes and their meanings.
9.2.7 Matching Multiple instances
All regular expressions used so far have tried to match a single occurrence. If there is a match, the row is retrieved and if it does not exist, no rows are retrieved. However, there are times when you need to have more control over the number of matches. For example, you may need to look for all the numbers, no matter how many numbers are included, or you may want to look for a word and also adapt to a trailing s (if present), and so on. This can be done by repeating the meta-characters in the regular expressions listed in table 9-3.
Here are a few examples.
The regular expression \ \ ([0-9] sticks?\\) needs to be explained. \ \ (match), [0-9] matches any number (in this case 1 and 5), sticks? Match stick and sticks (s) to make s optional, because it matches 0 or 1 occurrences of any character preceding it), \ \). No, matching stick and sticks can be very difficult.
Here is another example. This time we're going to match the 4-digit numbers that are linked together:
As mentioned earlier, [:d Igit:] matches any number, thus it is a set of numbers. {4} exactly requires that the preceding character (any number) appear 4 times, so [[:d igit:]]{4} matches any 4-bit number that is connected. It is important to note that there is almost always more than one way to write a particular expression when using regular expressions. The above example can also be written as follows:
9.2.8 Locator
All the examples so far are text that matches any position in a string. To match the text entered in 7760 specific locations, you need to use the locators listed in table 9-4.
For example, what if you want to find all the products that start with a number (including the number starting with the decimal point)? Simple search [0-9\\.] (or [[:d igit:]\\.]) No, because it will find a match anywhere within the text. The workaround is to use the ^ locator as follows:
^ matches the start of the string. Therefore, ^[0-9\\.] Match them only in. or any number for the first word in the string characters. Without ^, more than 4 additional rows (those with numbers in the middle) are retrieved.
^ 's dual use ^ There are two ways to use it. In the collection (defined by [and]), it is used to negate the collection, otherwise it is used to refer to the beginning of the string.
To make a regexp similar to the role of like this chapter says that unlike the regexp, the type matches the entire string and the RegExp matches the substring. Using a locator, by starting each expression with a ^ and ending each expression with $, you can make the RegExp function the same as like.
A simple regular expression test can test a regular expression with a select without using a database table. The RegExp check always returns 0 (no match) or 1 (match). You can test your expressions with text strings of regexp and experiment with them. The corresponding syntax is as follows:
This example will obviously return 0 (because there is no number in the text hello).
9.3 Summary
This chapter introduces the basics of regular expressions and learns how to use them with the RegExp keyword in MySQL SELECT statements.
MySQL must know-9th chapter-Search with regular expressions