REGEXP_SUBSTR extends the functionality of the SUBSTR function, allowing you to search for a regular expression pattern string.
This is also similar to REGEXP_INSTR, but returns the position of the substring, which returns the substring itself.
Grammar
The syntax for the REGEXP_SUBSTR function in an Oracle database is:
REGEXP_SUBSTR(source_char, pattern [, position [, occurrence [, match_parameter ]]])
Parameters
Source_char
The search string. Can be any data type Char,varchar2,nchar,clob,nclob type
pattern
Regular expressions
Value |
Description |
^ |
Matches the beginning of a string. If used with a match_parameter of ' m ', it matches the start of a line anywhere within expression. |
$ |
Matches the end of a string. If used with a match_parameter of ' m ', it matches the end of a line anywhere within expression. |
* |
Match 0 or more. |
+ |
Match one or more occurrences. |
? |
Match 0 or one occurrence. |
. |
Matches any character, except NULL. |
| |
Used like a "OR" to specify more than one alternative. |
[ ] |
Used to specify a matching list where is trying to match any one of the characters in the list. |
[^ ] |
Used to specify a nonmatching list where is trying to match any character except for the ones in the list. |
( ) |
Used to group expressions as a subexpression. |
{m} |
Matches m times. |
{m,} |
Matches at least m times. |
{M,n} |
Matches at least m times, but no more than n times. |
\ n |
N is a number between 1 and 9. Matches the nth subexpression found within () before encountering \ n. |
[..] |
Matches one collation element that can is more than one character. |
[::] |
Matches character classes. |
[==] |
Matches equivalence classes. |
\d |
Matches a numeric character. |
\d |
Matches a non-numeric character. |
\w |
Matches any word character that includes an underscore. |
\w |
Matches any non-word character. |
\s |
Matches any whitespace character, including spaces, tabs, page breaks, and so on. |
\s |
Matches any non-whitespace character. |
\a |
Matches the beginning of a string or Matches at the end of a string before a newline character. |
\z |
Matches at the end of a string. |
*? |
Matches the preceding pattern zero or more occurrences. |
+? |
Matches the preceding pattern one or more occurrences. |
?? |
Matches the preceding pattern zero or one occurrence. |
{n}? |
Matches the preceding pattern n times. |
{N,}? |
Matches the preceding pattern at least n times. |
{n,m}? |
Matches the preceding pattern at least n times, and not more than m times. |
position
Optional. The start position of the search in the string. If omitted, the default is 1, which is the first position of a string.
Occurrence
Optional. It is the nth matching position in a pattern string. If omitted, the default is 1.
Match_parameter
Optional. It allows you to modify the behavior of regexp_substr function matching. It can be a combination of the following:
Value |
Description |
C |
Case-sensitive matching. |
I |
Case-insensitive matching. |
N |
Allows the period character (.) to match the newline character. By default, the period is a wildcard. |
' m ' |
expression is assumed to has multiple lines, where ^ is the start of a line and $ are the end of a line, Regardle SS of the position of those characters in expression. By default, expression is assumed-be-a single line. |
' X ' |
whitespace characters is ignored. By default, whitespace characters is matched like any other character. |
Examples
Find the numbers that match
SELECT REGEXP_SUBSTR (‘hello my phone is 520 ‘, ‘[0-9]+‘) FROM dual;--520
The following example returns the specified third occurrence of the character
SELECT REGEXP_SUBSTR (‘i like beijing tiananmen‘, ‘(\S*)(\s)‘, 1, 3)FROM dual; --beijing
Itmyhome
Simple usage of Oracle's REGEXP_SUBSTR function