PLSQL _ basic Series 1 _ Regular Expression REGEXP_LIKE/SUBSTR/INSTR/REPLACE, plsqlregexp_like
2014-11-30 BaoXinjian
I. Summary
There are four functions that support regular expressions in ORACLE:
1 REGEXP_LIKE: similar to LIKE
2 REGEXP_INSTR: similar to INSTR
3 REGEXP_SUBSTR: similar to SUBSTR
4 REGEXP_REPLACE: similar to REPLACE
They are used in the same way as Oracle SQL functions LIKE, INSTR, SUBSTR, and REPLACE.
Ii. Call format
1. format:
REGEXP_INSTR (source_string, pattern [, position [, occurrence [, return_option [, match_parameter ] ] ] ] )
2. parameter description:
(1). source_string: input string
(2). pattern: Regular Expression
(3). position: Identify the regular expression matching starting from the character. (1 by default)
(4). occurrence: identifies the matching groups. (1 by default)
(5). return_option: 0 -- returns the position where the first character appears. 1: Start position of the next character in pattern.
(6). match_parameter: Value Range
- I: case insensitive;
- C: case sensitive;
- N: The dot. The line break symbol does not match;
- M: multiline mode;
- X: Extended Mode. Ignore the blank characters in the regular expression.
2. REGEXP_LIKE/REGEXP_REPLACE/REGEXP_SUBSTR/REGEXP_COUNT format
3. metacharacters and character Clusters
1. special characters
'^' Matches the start position of the input string and is used in the square brackets expression. In this case, this character set is not accepted.
'$' Matches the end position of the input string. If the Multiline attribute of the RegExp object is set, $ also matches '\ n' or' \ R '.
'.' Matches any single character except line break \ n.
'? 'Match the previous subexpression zero or once.
'*' Matches the previous subexpression zero or multiple times.
'+' Matches the previous subexpression once or multiple times.
'()' Indicates the start and end positions of a subexpression.
'[]' Indicates a bracket expression.
'{M, n}' indicates the exact number of occurrences. m = <number of occurrences <= n, '{m}' indicates m occurrences, '{m ,} 'indicates that at least m occurs.
'|' Indicates an option between the two items. Example '^ ([a-z] + | [0-9] +) $' indicates a string composed of all lowercase letters or numbers.
\ Num matches num, where num is a positive integer. References to the obtained matching.
'\' Escape character.
2. Character Clusters
[[: Alpha:] any letter.
[[: Digit:] any number.
[[: Alnum:] Any letter or number.
[[: Space:] any white characters.
[[: Upper:] Any uppercase letter.
[[: Lower:] Any lowercase letter.
[[Unct:] Any punctuation.
[[: Xdigit:] Any hexadecimal number, which is equivalent to [0-9a-fA-F].
3. Operation priority of various operators
\ Escape Character
(),(?, (? =), [] Parentheses and square brackets
*, + ,?, {N}, {n ,}, {n, m} qualifier
^, $, \ Anymetacharacter location and Sequence
| "Or" Operation
Iv. Simple cases
1. Create Test Data
CREATE TABLE test (mc VARCHAR2 (60));BEGIN INSERT INTO test VALUES ('112233445566778899'); INSERT INTO test VALUES ('22113344 5566778899'); INSERT INTO test VALUES ('33112244 5566778899'); INSERT INTO test VALUES ('44112233 5566 778899'); INSERT INTO test VALUES ('5511 2233 4466778899'); INSERT INTO test VALUES ('661122334455778899'); INSERT INTO test VALUES ('771122334455668899'); INSERT INTO test VALUES ('881122334455667799'); INSERT INTO test VALUES ('991122334455667788'); INSERT INTO test VALUES ('aabbccddee'); INSERT INTO test VALUES ('bbaaaccddee'); INSERT INTO test VALUES ('ccabbddee'); INSERT INTO test VALUES ('ddaabbccee'); INSERT INTO test VALUES ('eeaabbccdd'); INSERT INTO test VALUES ('ab123'); INSERT INTO test VALUES ('123xy'); INSERT INTO test VALUES ('007ab'); INSERT INTO test VALUES ('abcxy'); INSERT INTO test VALUES ('The final test is is is how to find duplicate words.'); COMMIT;END;
2. Case study: REGEXP_LIKE
(1 ).
(2). There are three consecutive
(3 ). the reason why the characters starting with a and ending with e are. it is because if it is written as ^ a * e $, it turns into a starting with a. In the middle, it is a character ending with e. Pulling * means matching the character before it.
^ AB * e $ can I find the characters starting with a and ending with e?
The answer is no, because * although it can match the previous b0 times or multiple times, it only matches
It can start with a and have B in the middle, or have no characters in the middle. It can end with e.
'^ [[: Alpha:] + $' indicates a letter that contains letters from the beginning to the end.
(4) characters starting with lowercase letters or numbers
(5) characters starting with lowercase letters
(6 ). "^" is a metacharacter with multiple meanings. It mainly depends on the semantic environment. If "^" is the first character in the character column, it indicates that the character string is reversed. Therefore, [^ [: digit:] indicates a non-numeric search mode. That is, the characters are not all numbers.
(7). A character starting with a non-Number
3. Case study: REGEXP_INSTR
4. Case study: REGEXP_SUBSTR
5. Case study: REGEXP_REPLACE
V. Complex Cases
1. REGEXP_SUBSTR matches a part of the specified string with the regular expression.
Example 1:
SQL> select regexp_substr('The zip code 80831 is for falcon, co','[[:digit:]]{5}' ) REGEXP_SUBSTR from dual;
REGEXP_SUBSTR-------------80831
Example 2:
SQL> select regexp_substr('The zip code 80831 is for falcon, co', '[[:alpha:]]{3,}', 1, 3) REGEXP_SUBSTR from dual;REGEXP_SUBSTR-------------code
Example 3
SQL> select regexp_substr('comments or questions - email feedback@plsqlbook.com', '[[:alnum:]](([_\.\-\+]?[[:alnum:]]+)*)@'
|| '([[:alnum:]]+)(([\.-]?[[:alnum:]]+)*)\.([[:alpha:]]{2,})') REGEXP_SUBSTR from dual;REGEXP_SUBSTR----------------------feedback@plsqlbook.com
2. REGEXP_INSTR returns the character matching the regular expression and the position of the string.
SQL> select regexp_instr('The zip code 80831 is for falcon, co', '[[:digit:]]{5}') REGEXP_INSTR from dual;REGEXP_INSTR------------14
3. The REGEXP_REPLACE function is similar to the REPLACE function. It provides a method to modify the string matching the given regular expression. This includes correcting spelling mistakes and formatting input and output text.
For example, the phone number format is 719-111-1111. The returned value of REGEX_REPLACER is:
SQL> select regexp_replace('Reformat the phone number 719-111-1111 ...', '[1]?[-.]?(\(?[[:digit:]]{3}\)?)+[- .]?'
|| '([[:digit:]]{3})[- .]?([[:digit:]]{4})', ' (\1) \2-\3') regexp_replace from dual;REGEXP_REPLACE---------------------------------------------Reformat the phone number (719) 111-1111 ...
4. The REGEXP_LIKE operator is similar to the LIKE operator, but it is more powerful, because it supports matching with this regular expression and text.
SQL> select ename, job from emp where regexp_like(job, '(clerk|analyst)', 'i');ENAME JOB---------- ---------SMITH CLERKSCOTT ANALYSTADAMS CLERKJAMES CLERKFORD ANALYSTMILLER CLERK
Thanks and Regards
Reference: http://blog.sina.com.cn/s/blog_56cacf83010005aw.html