2014-11-30 Baoxinjian
I. Summary
There are four main functions in Oracle that support regular expressions:
1 Regexp_like: Similar to the function of like
2 Regexp_instr: Similar to the function of INSTR
3 Regexp_substr: Similar to the function of SUBSTR
4 Regexp_replace: Similar to the function of REPLACE
They are used in the same way as Oracle SQL functions like, INSTR, SUBSTR, and replace usages.
Second, call format
1. Format:
regexp_instr (source_string, pattern [, Position [, Occurrence [, return_option ] ] ] ] )
2. Parameter Description:
(1). source_string: Input string
(2). Pattern: Regular Expression
(3). Position: Identifies a regular expression match starting from the first few characters. (Default is 1)
(4). Occurrence: Identifies the first few matching groups. (Default is 1)
(5). return_option:0--returns the position where the first character appears. 1:pattern the starting position of the next character.
(6). Match_parameter: Range of values
- I: case is not sensitive;
- C: Case sensitive;
- N: Point number. Do not match line break symbols;
- M: multi-line mode;
- X: Extended mode, ignoring white-space characters in regular expressions.
Three, meta characters and character clusters
1. Special characters
' ^ ' matches the starting position of the input string, which is used in a square bracket expression, which indicates that the character set is not accepted.
' $ ' matches the end position of the input string. If the Multiline property of the RegExp object is set, then $ also matches ' \ n ' or ' \ R '.
'. ' matches any single character except for the newline character \ n.
'? ' matches the preceding subexpression 0 or one time.
' * ' matches the preceding subexpression 0 or more times.
' + ' matches the preceding sub-expression one or more times.
' () ' marks the start and end position of a subexpression.
' [] ' marks a bracket expression.
' {m,n} ' an exact occurrence of the frequency range,m=< occurrences <=n, ' {m} ' indicates the presence of M times, ' {m,} ' indicates at least m times.
' | ' indicates a choice between the two items. Example ' ^ ([a-z]+|[ 0-9]+) $ ' represents all lowercase letters or numbers combined into a string.
\num matches num, where num is a positive integer. A reference to the obtained match.
The ' \ ' escape character.
2. Character clusters
[[: Alpha:]] any letter.
[[:d Igit:]] any number.
[[: Alnum:]] Any letters and numbers.
[[: Space:]] any whitespace character.
[[: Upper:]] any uppercase letters.
[[: Lower:]] any lowercase letter.
[[UNCT:]] any punctuation.
[[: Xdigit:]] Any 16 binary number, equivalent to [0-9a-fa-f].
3. Operation priority for various operators
\ escape Character
(), (?, (? =), [] parentheses and square brackets
*, +,?, {n}, {n,}, {n,m} qualifier
^, $, \anymetacharacter position and order
| "or" action
Iv. Cases
1. Create test data
CREATE TABLETest (MCVARCHAR2( -));BEGIN INSERT intoTestVALUES('112233445566778899'); INSERT intoTestVALUES('22113344 5566778899'); INSERT intoTestVALUES('33112244 5566778899'); INSERT intoTestVALUES('44112233 5566 778899'); INSERT intoTestVALUES('5511 2233 4466778899'); INSERT intoTestVALUES('661122334455778899'); INSERT intoTestVALUES('771122334455668899'); INSERT intoTestVALUES('881122334455667799'); INSERT intoTestVALUES('991122334455667788'); INSERT intoTestVALUES('Aabbccddee'); INSERT intoTestVALUES('Bbaaaccddee'); INSERT intoTestVALUES('Ccabbddee'); INSERT intoTestVALUES('Ddaabbccee'); INSERT intoTestVALUES('EEAABBCCDD'); INSERT intoTestVALUES('ab123'); INSERT intoTestVALUES('123xy'); INSERT intoTestVALUES('007ab'); INSERT intoTestVALUES('Abcxy'); INSERT intoTestVALUES('The final test is an is what to find duplicate words.'); COMMIT;END;
2. Case Regexp_like
{} means a couple of consecutive matches above the SQL meaning that it starts with 1 to 3 consecutive a
There are 3 consecutive a
The character ending with a begins with a. Because if it's written ^a*e$, it's a word that ends with an e in the middle of a, Vlasov * means matching the characters in front of it.
^ab*e$ can you find out the characters that end with E at the beginning of a?
The answer is no, because * although it can match the previous b0 or multiple times, it's just a match.
With the beginning of a the middle can have B and both are B, or in the middle there are no characters, with E end characters.
' ^[[:alpha:]]+$ ' means a character that contains letters from the beginning to the end of a letter.
Characters beginning with lowercase letters or numbers
Characters beginning with lowercase letters
"^" is a multi-meaning character element, mainly to see the semantic environment if "^" is the first character of the word columns, it means that the string is reversed, therefore, [^[:d Igit:]] is to find a non-numeric pattern is not always a number of characters
Characters that begin with a non-digit
3. Case Regexp_instr
4. Case Regexp_substr
5. Case Regexp_replace
Reference: http://blog.sina.com.cn/s/blog_56cacf83010005aw.html
Thanks and regards
Plsql_ Basic Series 1_ Regular Expression Regexp_like/substr/instr/replace