Address: http://www.cnblogs.com/Azhu/archive/2012/04/03/2431127.html
Copy from the Oracle Database 10gsql Development Guide.
Regular Expressions:
This section describes regular expressions and related Oracle database functions. Use these functions to search for character patterns in a string. For example, suppose you have the following years:
1965
1968
1971
1970
If you want to get the year between 1965 and 1968 (including 1965 and 1968), you can use the following regular expression to implement this functionality:
^196[5-8]$
The regular expression contains many metacharacters (metacharacter). In the above example, ^, [5-8], and $ are metacharacters. ^ can match the beginning of a string; [5-8] can match a number between 5~8; $ can match the end of a string. Therefore, ^196 can match strings that begin with 196, and [5-8]$ can match strings that end with 5, 6, 7, or 8. and ^196[5-8]$ can match 1965, 1966, 1967, and 1968, which is the desired result.
In the following example, suppose you have the following string, which refers to a line in Shakespeare's Romeo and Juliet:
But, soft! What light through yonder window breaks?
If you want to find the substring light, you can apply the following regular expression to the referenced string:
L[[:ALPHA:]]{4}
In this example, [[: Alpha:]] and {4} are metacharacters. [[: Alpha:]] can match a A-Z or a-Z character; {4} indicates that the preceding match pattern can be repeated 4 times. When l, [[: Alpha:]] and {4} are used together, you can match a sequence of 5 letters starting with L. Therefore, when the regular expression l[[:alpha:]]{4} is applied to the string, the substring light can be matched.
Table 4-7 lists the partial metacharacters that can be used in regular expressions, as well as the meanings of these metacharacters, and a simple example of using these meta characters.
Table 4-7 meta-characters in regular expressions
Metacharacters |
Mean |
Example |
\ |
Indicates that the character to match is a special character, constant, or inverse reference. (The reverse reference repeats the last match.) ) |
\ nthe match line break \ \ match \ \ (Match ( \) match) |
^ |
Match the starting position of the string |
If A is the first character in a string, ^a matches a |
$ |
Match the end position of the string |
If B is the last character in the string, $B match b |
* |
Matches the preceding character 0 or more times. |
Ba*rk can match brk, bark, baark, etc. |
+ |
Matches the preceding character 1 or more times. |
Ba+rk can match bark, baark, and so on, but cannot match BRK |
? |
Match the preceding character 0 or 1 times |
Ba?rk can only match brk and bark |
N |
Matches a character exactly n times, where n is an integer |
Hob{2}it can match Hobbit |
(Continuation of table)
Metacharacters |
Mean |
Example |
{N,m} |
Matches a character at least n times, up to M times, where N and M are integers |
Hob{2,3}it can only match Hobbit and Hobbbit |
. |
Match any single character except NULL |
Hob.it can match hobait, Hobbit, etc. |
(pattern) |
Matches a sub-expression of the specified pattern. You can use sub-expressions to form complex regular expressions. In this seed expression, you can access a single match, called Capture |
Anatom (y|ies) can match anatomy and anatomies |
X|y |
Match x or Y, where x and Y are one or more characters |
War|peace can match war or peace |
[ABC] |
Match any one of the characters within the brackets |
[AB]BC can match ABC and BBC |
[A-z] |
Matches any one character in the specified range |
[A-C]BC can match ABC, BBC and CBC |
[: :] |
Specifies a character class that can match any character in the class |
[: Alphanum:] can match characters 0-9, A-Z and a-Z [: Alpha:] can match characters A-Z and a-Z [: Blank:] can match space or TAB key [:d Igit:] can match the number 0-9 [: Graph:] can match non-null characters [: Lower:] can match lowercase letter A-Z [:p rint:] Similar to [: graph:], except that [:p rint:] include space characters [:p UNCT:] can match punctuation., "" and so on. [: space:] can match all whitespace characters [: Upper:] can match all uppercase letters A~Z [: Xdigit:] can match hexadecimal digits 0~9, a~f, and A~f |
[..] |
Matches a combined element, such as a multi-character element |
No |
[==] |
Specifying equivalence classes |
No |
\ n |
This is a reverse reference to the previous capture, where n is a positive integer |
(.) The \1 can match two consecutive identical characters. (.) You can match any single character except NULL, and \1 repeats the last match, matching the same character again, so you can match two consecutive identical characters |
|
|
|
|
Oracle Database10grelease 2 Adds a lot of Perl-like metacharacters, as shown in table 4-8.
Table 4-8 meta-characters similar to Perl
Metacharacters |
Meaning |
\d |
numeric characters |
\d |
Non-numeric characters |
\w |
Alphabetic characters |
\w |
Non-alphabetic characters |
\s |
White space characters |
\s |
Non-whitespace characters |
(Continuation of table)
Metacharacters |
Meaning |
\a |
Matches only the beginning of a string |
\z |
Matches the position of the end of the string only before the line break at the end of the string |
*? |
Matches the preceding pattern element 0 or more times |
+? |
Matches the preceding pattern element 1 or more times |
?? |
Matches the preceding pattern element 0 or 1 times |
N |
Matches the preceding pattern element exactly n times |
{N,} |
Matches the preceding pattern element at least n times |
{N,m} |
Matches the preceding pattern element at least n times, but not more than m times |
Table 4-9 lists the regular expression functions. The regular expression functions are newly added in Oracle database10g, and additional entries are added in Oracle database11g, as shown in the following table.
Table 4-9 Regular Expression functions
function |
Description |
Regexp_like (x, pattern [, Match_option]) |
Searches for regular expressions defined in the pattern parameter from X. You can use Match_option to modify the default matching options, which can be set to: • ' C ', indicating case sensitivity when matching (default option) • ' I ', which indicates that the case is not case-sensitive when matching • ' n ', allows the use of operators that can match any character • ' m ', use X as a string containing multiple lines |
REGEXP_INSTR (x, pattern [, Start [, Occurrence [, Return_option [, Match_option [, Subexp_option]]]) |
Finds pattern in X and returns the position where the pattern is located. You can specify the following optional parameters: start the location to start looking for. The default value is 1, which refers to the first character of X. The occurrence description should return the position of the first occurrence of pattern. The default value is 1, which means that the function returns the position of the first occurrence of pattern in X. return_option indicates what integer should be returned. If this argument is 0, the integer to be returned is the position of the first character in X, and if the argument is a non-0 integer, the integer to be returned is the position of the character that appears after pattern in X match_option modifies the default matching settings, which work the same way as specified in Regexp_likek (). Subexp_option is a new addition to Oracle Database 11g that works as follows: For patterns with sub-expressions, Subexp_option is a non-negative number between 0~9, indicating which sub-expression in pattern is the target of the function. For example, consider an expression 0123 (((ABC) (DE) f) ghi) 45 (678), which has 5 sub-expressions: "Abcdefghi", "abcdef", "abc", "De", and "678". If Subexp_option is 0, the position of the pattern is returned. If pattern does not have the correct subexpression number, the function returns 0. Subexp_option null value returns NULL. The default value for Subexp_option is 0 |
(Continuation of table)
function |
Description |
Regexp_replace (x, pattern [, Replace_string [, Start [, Occurrence [, Match_option]]]) |
Look for pattern in X and replace it with replace_string. The other options mean exactly the same as the parameters of the REGEXP_INSTR () function |
REGEXP_SUBSTR (x, pattern [, Start [, Occurrence [, Match_option [, Subexp_option]]]) |
Returns a substring in X that can match pattern, whose starting position is specified by start. The other options mean exactly the same as the parameters of the REGEXP_INSTR () function. Oracle database11g New added subexp_option works the same way as in the REGEXP_INSTR () function |
Regexp_count (x, pattern [, Start [, Match_option]]) |
This is a newly added function of Oracle database11g. Finds pattern in X and returns the number of occurrences of pattern in X. The following two optional parameters are available: start the location to start looking for. The default value is 1, which refers to the first character of X. match_option modifies the default matching settings, which works the same way as in Regexp_likek () |
The next few sections will cover more knowledge about regular expression functions.
1. Regexp_like ()
Regexp_like (x, pattern [, match_option]) is used to find the regular expression defined in the pattern parameter in X, which can also provide an optional parameter match_option, which can be set to one of the following characters:
- ' C ', indicating case sensitivity when matching (default option)
- ' I ', description is not case-sensitive when matching
- ' n ', allows the use of operators that can match any character
- ' m ', use X as a string containing multiple lines
The following query uses the Regexp_like function to retrieve birthdays between 1965 and 1968 customers:
SELECT customer_id, first_name, last_name, DOB
From customers
WHERE regexp_like (To_char (DOB, ' YYYY '), ' ^196[5-8]$ ');
customer_id first_name last_name DOB
----------- ---------- ---------- ---------
1 John Brown 01-jan-65
2 Cynthia Green 05-feb-68
The following query retrieves a customer whose name begins with J or J. Note that the regular expression passed to Regexp_like () is ^j, and the match option is I, which means case insensitive, so in this case, ^j can match J or J:
SELECT customer_id, first_name, last_name, DOB
From customers
WHERE regexp_like (first_name, ' ^j ', ' I ');
customer_id first_name last_name DOB
----------- ---------- ---------- ---------
1 John Brown 01-jan-65
2. REGEXP_INSTR ()
REGEXP_INSTR (x, pattern [, start [, occurrence [, return_option [, Match_option]]]) is used to find Pattern;regexp_instr () in X Returns the position where pattern appears. The match position starts at 1.
The following query uses the REGEXP_INSTR function to return a location that matches the regular expression l[[:alpha:]]{4}:
SELECT
Regexp_instr (' But, soft! What light through yonder window breaks? ',
' L[[:alpha:]]{4} ') as result
from dual;
RESULT
----------
17
Note that the return value is 17, which is the position of L in light.
The following query returns the position of the second match for the regular expression s[[:alpha:]]{3}, where the matching position starts at 1:
SELECT
Regexp_instr (' But, soft! What light through yonder window softly breaks? ',
' S[[:alpha:]]{3} ', 1, 2) as result
from dual;
RESULT
----------
45
The following query uses the REGEXP_INSTR function to return the position of the second matching letter O, where the match position starts at 10:
SELECT
Regexp_instr (' But, soft! What light through yonder window breaks? ',
' O ', ten, 2) as result
from dual;
RESULT
----------
32
3. Regexp_replace ()
Regexp_replace (x, pattern [, replace_string [, start [, occurrence[, Match_option]]) is used to find pattern in X and replace it with Replace_ String
The following query uses the Regexp_replace function to replace substrings that match the regular expression l[[:alpha:]]{4} with the string sound:
SELECT
Regexp_replace (' But, soft! What light through yonder window breaks? ',
' L[[:alpha:]]{4} ', ' sound ') as result
from dual;
RESULT
---------------------------------------------------
But, soft! What is sound through yonder window breaks?
Note that light has been replaced with sound.
4. REGEXP_SUBSTR ()
REGEXP_SUBSTR (x, pattern[, start [, occurrence[, Match_option]]) is used to find substrings in X that match pattern, where start is specified by start.
The following query uses the REGEXP_SUBSTR function to return substrings that match the regular expression l[[:alpha:]]{4}:
SELECT
Regexp_substr (' But, soft! What light through yonder window breaks? ',
' L[[:alpha:]]{4} ') as result
from dual;
Resul
-----
Light
5. Regexp_count ()
Regexp_count () is a newly added function of Oracle database11g. Regexp_count (x, pattern[, start [, Match_option]]) is used to find pattern in X and returns the number of occurrences of pattern in X. You can provide an optional parameter, start, that indicates the character to look for in the pattern from X, or you can provide an optional match_option string that indicates the matching option.
The following query uses the Regexp_count function to return the number of occurrences of the regular expression s[[:alpha:]]{3}:
SELECT
Regexp_count (' But, soft! What light through yonder window softly breaks? ',
' S[[:alpha:]]{3} ') as result
from dual;
RESULT
----------
2
Note that the return result is 2, which indicates that the regular expression has two matches in the supplied string.
[Go] Regular expression Oracle