[Repost] Oracle Regular Expressions
1. Brief introduction to Oracle regular expressions
Regular expressions are now widely used in many software environments, including operating systems such as *nix (Linux, Unix, etc.) and HP, and development environments such as PHP, C#, and Java.
Oracle 10g regular expressions make SQL more flexible. They help with data validation, duplicate-word detection, finding extraneous whitespace, and breaking up strings made of multiple regular-expression parts.
Oracle 10g adds four regular expression functions: REGEXP_LIKE, REGEXP_INSTR, REGEXP_SUBSTR, and REGEXP_REPLACE.
They use POSIX regular expressions instead of the old percent (%) and underscore (_) wildcard characters.
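As a quick orientation, the following sketch calls each of the four functions once on the same string literal. The literal 'abc123' and the column aliases are illustrative choices, not part of the original article.

```sql
-- One call per function, all against the literal 'abc123' (illustrative value).
SELECT CASE WHEN REGEXP_LIKE('abc123', '[0-9]+')
            THEN 'match' ELSE 'no match' END      AS like_result,  -- match
       REGEXP_INSTR('abc123', '[0-9]+')           AS digit_pos,    -- 4
       REGEXP_SUBSTR('abc123', '[0-9]+')          AS digits,       -- 123
       REGEXP_REPLACE('abc123', '[0-9]+', '#')    AS replaced      -- abc#
FROM dual;
```

REGEXP_LIKE is a condition rather than a value-returning function, which is why it is wrapped in CASE here.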
2. Oracle regular expression special characters
'^' Matches the start of the input string. At the start of a bracket expression, it negates the set: the listed characters are not accepted.
'$' Matches the end of the input string. In multiline mode (match parameter 'm'), $ also matches at the end of each line.
'.' Matches any single character except the line break \n.
'?' Matches the preceding subexpression zero or one time.
'*' Matches the preceding subexpression zero or more times.
'+' Matches the preceding subexpression one or more times.
'()' Marks the start and end of a subexpression.
'[]' Marks a bracket expression.
'{m,n}' Specifies a number of occurrences: m <= number of occurrences <= n. '{m}' means exactly m occurrences, and '{m,}' means at least m occurrences.
'|' Chooses between two alternatives. For example, '^([a-z]+|[0-9]+)$' matches a string made up entirely of lowercase letters or entirely of digits.
'\num' Matches num, where num is a positive integer. It is a reference back to a captured match.
A useful feature of regular expressions is that a matched subexpression can be saved and reused later. This is called backreferencing. It allows complex replacements, such as moving a pattern to a new position or marking the position of a replaced character or word.
The matched subexpressions are stored in temporary buffers. The buffers are numbered from left to right and accessed with the \number notation.
The following example shows how to change the name aa bb cc to cc, bb, aa.
SELECT REGEXP_REPLACE('aa bb cc', '(.*) (.*) (.*)', '\3, \2, \1') FROM dual;
cc, bb, aa
'\' Escape character.
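Backreferences can also be used inside the pattern itself, which is one way to approach the duplicate-word detection mentioned in the introduction. The sketch below is a minimal illustration; the input string is an assumption chosen for the example.

```sql
-- \1 inside the pattern must repeat exactly what group 1 captured.
-- Note: without extra boundary checks this can also match inside longer
-- words (for example the "is is" in "this is"), so treat it as a sketch.
SELECT REGEXP_SUBSTR('it is is a test', '([[:alpha:]]+) \1') AS repeated  -- is is
FROM dual;
```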
3. Oracle regular expression character classes
[[:alpha:]] Any letter.
[[:digit:]] Any digit.
[[:alnum:]] Any letter or digit.
[[:space:]] Any whitespace character.
[[:upper:]] Any uppercase letter.
[[:lower:]] Any lowercase letter.
[[:punct:]] Any punctuation character.
[[:xdigit:]] Any hexadecimal digit, equivalent to [0-9a-fA-F].
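A short sketch of the character classes in use follows. The sample string 'Order #A7-42!' is invented for illustration.

```sql
-- Character classes are written inside a bracket expression: [[:class:]].
SELECT REGEXP_REPLACE('Order #A7-42!', '[[:punct:]]', '')         AS no_punct,   -- Order A742
       REGEXP_SUBSTR('Order #A7-42!', '[[:digit:]]+')             AS first_num,  -- 7
       REGEXP_SUBSTR('Order #A7-42!', '[[:upper:]][[:lower:]]+')  AS word        -- Order
FROM dual;
```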
4. Operator precedence (from highest to lowest)
\ Escape character
(), (?:), (?=), [] Parentheses and square brackets (Oracle's POSIX-based syntax does not support the (?:) and (?=) forms)
*, +, ?, {n}, {n,}, {n,m} Quantifiers
^, $, \anymetacharacter Positions and sequences
| "Or" operation
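Because alternation has the lowest priority, anchors bind to the individual alternatives unless parentheses group them. The following sketch, using an invented test string, shows the difference.

```sql
-- '^ab|cd$' is read as (^ab) OR (cd$); the grouped form anchors both ends.
SELECT CASE WHEN REGEXP_LIKE('abxyz', '^ab|cd$')
            THEN 'match' ELSE 'no match' END AS without_group,  -- match
       CASE WHEN REGEXP_LIKE('abxyz', '^(ab|cd)$')
            THEN 'match' ELSE 'no match' END AS with_group      -- no match
FROM dual;
```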
5. Test examples
-- Test data
CREATE TABLE test (mc VARCHAR2(60));
INSERT INTO test VALUES ('20140901');
INSERT INTO test VALUES ('2017 22113344');
INSERT INTO test VALUES ('2017 33112244');
INSERT INTO test VALUES ('2014 44112233 5566 778899');
INSERT INTO test VALUES ('2014 5511 2233 4466778899');
INSERT INTO test VALUES ('20140901');
INSERT INTO test VALUES ('20140901');
INSERT INTO test VALUES ('20140901');
INSERT INTO test VALUES ('20140901');
INSERT INTO test VALUES ('aabbccddee');
INSERT INTO test VALUES ('bbaaaccddee');
INSERT INTO test VALUES ('ccabbddee');
INSERT INTO test VALUES ('ddaabbccee');
INSERT INTO test VALUES ('eeaabbccdd');
INSERT INTO test VALUES ('ab123');
INSERT INTO test VALUES ('123xy');
INSERT INTO test VALUES ('007ab');
INSERT INTO test VALUES ('abcxy');
INSERT INTO test VALUES ('the final test is how to find duplicate words.');
COMMIT;
A. REGEXP_LIKE
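SELECT * FROM test WHERE REGEXP_LIKE(mc, '^a{1,3}');                  -- starts with one to three a's
SELECT * FROM test WHERE REGEXP_LIKE(mc, 'a{1,3}');                   -- contains one to three a's anywhere
SELECT * FROM test WHERE REGEXP_LIKE(mc, '^a.*e$');                   -- starts with a and ends with e
SELECT * FROM test WHERE REGEXP_LIKE(mc, '^[[:lower:]]|[[:digit:]]'); -- starts with a lowercase letter, or contains a digit
SELECT * FROM test WHERE REGEXP_LIKE(mc, '^[[:lower:]]');             -- starts with a lowercase letter
SELECT mc FROM test WHERE REGEXP_LIKE(mc, '[^[:digit:]]');            -- contains a non-digit character
SELECT mc FROM test WHERE REGEXP_LIKE(mc, '^[^[:digit:]]');           -- starts with a non-digit character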
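To see how several of these patterns behave side by side, the sketch below evaluates them against a few inline values instead of the test table, so it can run on its own. The WITH clause, the name sample, and the column aliases are illustrative assumptions.

```sql
-- Inline sample rows (hypothetical), so the query does not depend on table test.
WITH sample AS (
  SELECT 'aabbccddee' AS mc FROM dual UNION ALL
  SELECT 'ab123'            FROM dual UNION ALL
  SELECT '123xy'            FROM dual UNION ALL
  SELECT '20140901'         FROM dual
)
SELECT mc,
       CASE WHEN REGEXP_LIKE(mc, '^a.*e$')       THEN 'Y' ELSE 'N' END AS a_to_e,
       CASE WHEN REGEXP_LIKE(mc, '^[[:lower:]]') THEN 'Y' ELSE 'N' END AS starts_lower,
       CASE WHEN REGEXP_LIKE(mc, '[^[:digit:]]') THEN 'Y' ELSE 'N' END AS has_non_digit
FROM sample;
-- Expected flags per row (a_to_e, starts_lower, has_non_digit):
--   aabbccddee : Y Y Y
--   ab123      : N Y Y
--   123xy      : N N Y
--   20140901   : N N N
```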
B. REGEXP_INSTR
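SELECT REGEXP_INSTR(mc, '[[:digit:]]$') FROM test;                         -- position of a trailing digit
SELECT REGEXP_INSTR(mc, '[[:digit:]]+$') FROM test;                        -- position where a trailing run of digits starts
SELECT REGEXP_INSTR('The price is $400.', '\$[[:digit:]]+') FROM dual;     -- \$ matches a literal dollar sign
SELECT REGEXP_INSTR('onetwothree', '[^[:lower:]]') FROM dual;              -- first character that is not a lowercase letter
SELECT REGEXP_INSTR(',', '[^,]*') FROM dual;                               -- zero or more non-comma characters
SELECT REGEXP_INSTR(',', '[^,]') FROM dual;                                -- exactly one non-comma character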
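REGEXP_INSTR also accepts optional position, occurrence, and return-option arguments. The sketch below reuses the price string from the examples above to show the start position, the position just after the match, and the 0 returned when nothing matches.

```sql
-- Arguments: source, pattern, start position, occurrence, return option.
-- Return option 0 gives the start of the match, 1 the position after it.
SELECT REGEXP_INSTR('The price is $400.', '\$[[:digit:]]+')           AS match_start,  -- 14
       REGEXP_INSTR('The price is $400.', '\$[[:digit:]]+', 1, 1, 1)  AS match_end,    -- 18
       REGEXP_INSTR('onetwothree', '[^[:lower:]]')                    AS no_match      -- 0
FROM dual;
```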
C. REGEXP_SUBSTR
SELECT REGEXP_SUBSTR(mc, '[a-z]+') FROM test;          -- first run of lowercase letters
SELECT REGEXP_SUBSTR(mc, '[0-9]+') FROM test;          -- first run of digits
SELECT REGEXP_SUBSTR('ababcde', '^a.*b') FROM dual;    -- from the leading a to the last b
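Two details are worth seeing with concrete values: quantifiers are greedy, and an optional occurrence argument selects a later match. The sketch below uses literals only; the second string is an illustrative value.

```sql
-- Arguments: source, pattern, start position, occurrence.
SELECT REGEXP_SUBSTR('ababcde', '^a.*b')                 AS greedy,      -- abab
       REGEXP_SUBSTR('2014 5511 2233', '[0-9]+', 1, 2)   AS second_num   -- 5511
FROM dual;
```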
D. REGEXP_REPLACE
SELECT REGEXP_REPLACE('Joe   Smith', '( ){2,}', ',') AS RX_REPLACE FROM dual;          -- replace a run of two or more spaces with a comma
SELECT REGEXP_REPLACE('aa bb cc', '(.*) (.*) (.*)', '\3, \2, \1') FROM dual;           -- reverse three space-separated parts
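The sketch below shows the expected results for two replacements of this kind: collapsing a run of spaces, and reordering captured groups with backreferences. The name 'Smith, Joe' is an invented value.

```sql
SELECT REGEXP_REPLACE('Joe   Smith', '( ){2,}', ',')         AS collapsed,  -- Joe,Smith
       REGEXP_REPLACE('Smith, Joe', '(.*), (.*)', '\2 \1')   AS swapped     -- Joe Smith
FROM dual;
```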
6. Oracle regular expression functions
REGEXP_LIKE, REGEXP_INSTR, REGEXP_SUBSTR, and REGEXP_REPLACE.
They are used like the Oracle SQL functions LIKE, INSTR, SUBSTR, and REPLACE, but they use POSIX regular expressions instead of the old percent sign (%) and underscore (_) wildcard characters.
REGEXP_LIKE is similar to the LIKE operator. If the first parameter matches the regular expression, it evaluates to true. For example, WHERE REGEXP_LIKE(ename, '^j[ao]', 'i') returns a row when ename starts with ja or jo. The 'i' parameter makes the match case-insensitive. You can also use REGEXP_LIKE in check constraints and function-based indexes.
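The sketch below contrasts the 'i' (case-insensitive) and 'c' (case-sensitive) match parameters and shows one possible check constraint. The table contacts, the column zip, and the constraint name are hypothetical names used only for illustration.

```sql
-- 'i' ignores case; 'c' forces case-sensitive matching.
SELECT CASE WHEN REGEXP_LIKE('JONES', '^j[ao]', 'i')
            THEN 'match' ELSE 'no match' END AS case_insensitive,  -- match
       CASE WHEN REGEXP_LIKE('JONES', '^j[ao]', 'c')
            THEN 'match' ELSE 'no match' END AS case_sensitive     -- no match
FROM dual;

-- Hypothetical table: the constraint accepts only five-digit values.
CREATE TABLE contacts (
  zip VARCHAR2(5)
      CONSTRAINT zip_five_digits CHECK (REGEXP_LIKE(zip, '^[[:digit:]]{5}$'))
);
```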
^ indicates the start of the string, and $ indicates the end of the string.
. represents any single character.
A range of characters is written in brackets. For example, [a-z] denotes any ASCII lowercase letter, which is equivalent to the character class [[:lower:]].
? allows the preceding element to match zero or one time, + allows one or more times, and * allows zero or more times.
You can use "{m,n}" to specify an exact range, meaning "from m to n times"; "{m}" means "exactly m times", and "{m,}" means "at least m times". You can also combine parentheses with "|" (vertical bar) to express alternation. For example, the pattern ^([a-z]+|[0-9]+)$ matches any string made up entirely of lowercase letters or entirely of digits.
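The interval and alternation rules can be checked with a few literals, as in this sketch. The test strings are illustrative.

```sql
SELECT CASE WHEN REGEXP_LIKE('aaa',  '^a{1,3}$')
            THEN 'match' ELSE 'no match' END AS three_a,      -- match
       CASE WHEN REGEXP_LIKE('aaaa', '^a{1,3}$')
            THEN 'match' ELSE 'no match' END AS four_a,       -- no match
       CASE WHEN REGEXP_LIKE('abc',  '^([a-z]+|[0-9]+)$')
            THEN 'match' ELSE 'no match' END AS all_letters,  -- match
       CASE WHEN REGEXP_LIKE('abc1', '^([a-z]+|[0-9]+)$')
            THEN 'match' ELSE 'no match' END AS mixed         -- no match
FROM dual;
```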