SQL advanced (1) -- regular functions, SQL -- Functions
When we compare and process strings, the like clause may be used at most. % Represents one or more characters, and _ represents one character. However, when processing complex strings, we find that this is far from what we need. So this low-end, not beautiful statement. Of course, we will not abandon them. After all, oracle did not give up on them. After Oracle10g, several functions and operations related to regular expressions are provided. Using these functions can greatly improve our ability and level to process strings.
The 10G version has four regular functions: regexp_like, regexp_insrt, regexp_replace, and regexp_substr.
Regexp_count is added for the 11g version.
Let's take a look at this table first. All transformations are in this table. This table is the regular heart method.
Metacharacters |
Meaning |
Example |
\ |
It indicates that the character to be matched is a special character, constant, or referenced by the latter. (Repeat the previous match later) |
\ N match the linefeed \ Match \ \ (MATCH ( \) Match) |
^ |
Match the start position of a string |
^ A matches arwen but does not match barwen. |
$ |
Match the end of a string |
En $ matches arwen. But does not match arwenb. |
* |
Match the previous character 0 or multiple times |
A * rwen can match rwen or aaarwen. |
+ |
Match the previous character once or multiple times |
A + rwen can match arwen or aarwen, but cannot match rwen. |
? |
Match the first character 0 or 1 time |
A? Rwen can match arwen or rwen, but cannot match aarwen. |
{N} |
Match the previous character EXACTLY n times, where n is an integer |
Ar {2} wen can match arrwen, but cannot match arwen or arrrwen. |
{N, m} |
Match the previous character at least n times, at most m times. {N,} indicates at least n matching times. There is no upper limit. |
Ar {1, 2} wen can match arwen, arrwen. But it does not match awen or arrrwen. |
. |
Point number, matching any single character except null and line feed |
Arw. n. can match arwen, arwin. but cannot match arween or arwn. |
(Pattern) |
In parentheses, pattern is a subregular expression that matches a specified pattern. |
In fact, parentheses are like parentheses in general language expressions. Sometimes adding some parentheses can enhance readability. For more information, see \ n. |
X | y |
Match "or" |
X | y can match x or y. |
[Abc] |
It can match any single character in abc. |
Hello [abc] can match helloa, hellob, helloc |
[A-z] |
Match any single character in the specified range |
Hell [a-z] can match hello or hellz |
[:] |
Specifies a character class that can match any character in the class |
[: Alphanum:] can match 0-9, A-Z, a-z [: Alpha:] matching characters A-Z and a-z [: Blank:] can match spaces or the tab key [: Digit:] numbers 0-9 can be matched [: Graph:] can match non-null characters [: Lower:] can match lowercase letters a-z [: Print:] is similar to [: graph:]. The difference is that [: print:] contains space characters. [: Punct:] can match punctuation marks., "", etc. [: Space:] can match all null characters [: Upper:] can match uppercase letters A-Z [: Xdigit:] It can match hexadecimal numbers 0-9, A-F, a-f |
\ N |
This is a post reference for the previous match hit, where n is a positive integer |
Arw (en) \ 1 can match arwenen. Note that \ 1 must be preceded by a subexpression with parentheses. |
First, we provide a case table regexp_li0924 for our exercises:
CREATE TABLE regexp_li0924( text VARCHAR(20));INSERT INTO regexp_li0924 VALUES ('123');INSERT INTO regexp_li0924 VALUES ('123a');INSERT INTO regexp_li0924 VALUES ('123123?');INSERT INTO regexp_li0924 VALUES ('afefwef');INSERT INTO regexp_li0924 VALUES ('?');INSERT INTO regexp_li0924 VALUES ('AFWEWE');INSERT INTO regexp_li0924 VALUES ('1_2_3');INSERT INTO regexp_li0924 VALUES ('1fe_24563');
1. REGEXP_LIKE Function
Regexp_like (srcstr, pattern [, match_option]): Check whether srcstr matches pattern and returns a Boolean value. This function also provides an optional parameter match_option string to indicate the default matching options. Match_option is selected as follows:
'C' indicates that the matching time zone is case sensitive (default );
'I' indicates that the matching is case insensitive;
'N' (.) indicates all single characters, including line breaks (I don't know where to use line breaks. I only know that chr (10) indicates line breaks in SQL.
When a 'M' string contains a line break, it is processed as multiple lines. In this way, $ can match the end of each line. Otherwise, $ will only match the last position of the string.
Example:
-- SELECT * FROM regexp_li0924 WHERE regexp_like (text, '[: digit:]') that contains numbers in the query; -- query contains numbers? SELECT * FROM regexp_li0924 WHERE regexp_like (text ,'\? '); -- Query contains 'A' SELECT * FROM regexp_li0924 WHERE regexp_like (text, 'A '); -- Query 'A' or 'A' SELECT * FROM regexp_li0924 WHERE regexp_like (text, 'A', 'I'); -- you may think of this. In fact, regular expressions are changeable. So there is a second method. SELECT * FROM regexp_li0924 WHERE regexp_like (text, '(a | )');
2. REGEXP_INSTR Function
REGEXP_INSTR (srcstr, pattern [, start [, occurrence [, return_option [, match_option]): This function has many parameters; is an extension of the basic SQL function INSTR. The Instr function returns an integer that indicates the first occurrence of the position number matching the string in a string.
For parameters, refer to the INSTR function.
I have never used this function.
Select regexp_instr (text, '[^ [: digit:]'), text from regexp_li0924; -- This is also a method for querying non-numeric characters. Starting from the string 'f2345ef4a56' with 2nd characters, find the location of the 3rd non-numeric characters in the zip file: select regexp_instr ('f2345ef456 ', '[^ [: digit:]', 2nd) from dual -- returns 7 -- starts from the string 'f2345ef4a56' with characters, find the next 2 Characters of the 3rd non-numeric characters in the zip code, select regexp_instr ('f2345ef456', '[^ [: digit:]', 1) from dual;
3. REGEXP_REPLACE Function
REGEXP_REPLACE (srcstr, pattern [, replace_string [, start [, occurrence [, match_option]); this function is also equivalent to the extension of the replace function. Returns the replaced string.
The parameter is similar to the above
-- Intercept the 123 SELECT regexp_replace ('192 _ 111111 ','(. *)_(. *) ',' \ 1') FROM dual; SELECT regexp_replace ('000000', '_ [: digit:] +', '') FROM dual; -- replace the string '123456fewfwefe34165' with the string '0' SELECT regexp_replace ('123456fewfwefe34165 ',' [^ [: digit:] ', '0') FROM dual; -- Replace the last name and the name with each other (because the English name is different from the Chinese name .) SELECT regexp_replace ('mei Han', '(. *) (. *)', '\ 2 \ 1') FROM dual;
4. REGEXP_SUBSTR Function
REGEXP_SUBSTR (x, pattern [, start [, occurrence [, match_option]) is used to find pattern in x and return it. I like this function. because it is easy to use. this is very useful in string segmentation. you can check my -- Character Segmentation in SQL.
-- Take the string MMS-2501-MO to 2501. Of course regexp_replace can be done. But I still like this SELECT regexp_substr ('mms _ 2501_mo', '[^ _] +', 1, 2) FROM dual;
5. REGEXP_COUNT Function
REGEXP_COUNT (srcstr, pattern [, match_option]) is used to return the number of matched results. oracle11g is supported.
-- For example, how many zeros does a string contain? Of course, you can count them. But do you want to count them? Of course, you do not need a regular function. If you use this function, will it be greater than SELECT regexp_substr ('20170', '0') FROM dual;
Author: li0924
Time: 2014-11-04
The copyright of this article belongs to the author. You are welcome to repost this article, but you must keep this statement without the author's consent and provide the original article connection clearly on the article page.
How to call a function in SQL
You can specify a string for function parameters.
For example
SELECT [dbo]. [w_GetProdColor] ('test ')
You can also use fields as parameters. In fact, they are used in the same way as system functions.
SELECT [dbo]. [w_GetProdColor] (column name)
FROM table name
Implement Regular Expressions in SQL
No regular expression LIKE
Determines whether the specified string matches the specified pattern. The mode can contain regular and wildcard characters. During pattern matching, regular characters must match exactly the characters specified in the string. However, any part of a string can be used to match a wildcard. And use = and! = Compared to string comparison operators, using wildcards makes the LIKE operator more flexible. If no parameter belongs to the string data type®SQL Server™It is converted to the string data type (if possible ).
Syntax
Match_expression [NOT] LIKE pattern [ESCAPE escape_character]
Parameters
Match_expression a valid SQL Server expression of any string data type.
The search mode in patternmatch_expression can contain the following valid SQL Server wildcards
1% any string containing zero or more characters
Example: WHERE title LIKE '% computer %' searches for all titles containing the word "computer" in any location of the title.
2 _ (underline) any single character
Example: WHERE au_fname LIKE '_ ean' searches for the names of all four letters ending with ean (such as Dean and Sean ).
3 [] any single character in the specified range
Example: WHERE au_lname LIKE '[C-P] arsen' will look for the author's surname that ends with arsen and starts with any single character between C and P, for example, Carsen, Larsen, Karsen, etc.
4 [^] does not belong to any single character in the specified range. It is opposite []
Example: WHERE au_lname LIKE 'de [^ l] %' searches for the last names of all authors whose names start with de and whose later letters are not l.
1. Use the like pattern matching:
We recommend that you use like.
LIKE supports ASCII and Unicode matching. When all parameters, including match_expression, pattern, and escape_character (if any) are of the ASCII character data type, an ASCII pattern match is executed. If any of these parameters belong to the Unicode data type, all parameters are converted to Unicode and matched in the Unicode mode. When LIKE is used for Unicode data (nchar or nvarchar data type), trailing spaces are meaningful. But for non-Unicode data, trailing spaces are meaningless. Unicode LIKE is compatible with SQL-92 standards. Ascii like is compatible with earlier versions of SQL Server
2. Use the % wildcard
For example, this query will show the full text in the database...>