Oracle Regular Expression

Source: Internet
Author: User

Requirements:

Match the mobile phone number. The first digit can be "+", but not "+". The following digits are all numbers, for example:

+ 861359415665

8613659558555

1356856455

Are valid.

 

+ Aa156945555

Aa1359556666

Aaddssdfdfsd

Are invalid.

Regular Expression:

SQL> select * from dual where regexp_like ('+ 333333', '^ [\ +] * [: digit:] +'); -- This + escape or not escape, the result is the same DUMMY-----X

 

SQL> SELECT * FROM DUAL WHERE regexp_like('aa333333' ,'^[+]*[[:digit:]]+');DUMMY-----

 

Explanation:

1. ^ indicates the start, * indicates the occurrence of 0 or multiple times, + indicates the occurrence of 1 or multiple times, and [: digit:] indicates the pure number of 0-9 (and $ indicates the end of what, if it is [[: digit:] + $, it indicates ending with a number ). The regular expression indicates:

It starts with + 0 or multiple times, followed by a number that appears once or multiple times (that is, a number must exist ).

2. In the dual table, there will always be only one row of records. The query results show that there are records in dual, which proves that the where condition is true, and vice versa.

An incorrect regular expression was previously written:

[\+]*[[:digit:]]+

Note that only one ^ representing the start symbol is missing. If this symbol is missing, it indicates that the regular expression means:

+ Appears 0 or multiple times (that is, + can appear, but can not appear !!), The following number appears once or multiple times. The preceding code can appear 0 times and prove that the Code does not have a plus sign (+ aa111a, aass11111 ......), This regular expression is always established, with serious errors !!

Oracle Regular Expression application by Wenzhou -- ranking

In Oracle, regular expressions are available in four functions: regexp_like, regexp_substr, regexp_instr, and regexp_replace. Here we have flexible application in Oracle 10 Gb.

First, let's briefly introduce the content of regular expressions. Regular expressions are used as text content for fast query. There are many applications in Linux. First, the start and end of a row "^" indicates that only the content at the beginning of the row is searched. The "$" character is used only to find the content at the end of the row. "^" Can also be used as an exclusion character. It is clear to use the example for a demonstration.

Here I use the regexp_like function to do this, so that we can usually use more.

Select * From test_table

Where regexp_like (field_1, '^ 1234 ')

This indicates whether there is a matching string starting with 1234. Here, the like method is the same.

 

 

Select * From test_table

Where regexp_like (field_1, '^ [12] 234 ')

Here there is an additional [] here as an independent character, which indicates that it starts with 1 or 2, and then the characters in 234 will be matched.

 

 

Select * From test_table

Where regexp_like (field_1, '^ (Ouyang | Li) ')

Here we can express that a query string named Ouyang or Li is called xiao'er. Here there is one more (), which is written as a string and exactly corresponds.

There is also a "|" to represent or.

 

 

Select * From test_table

Where regexp_like (field_1, '^ Li [small] * 2 ')

Here, we can query both Li xiao'er and Li xiao'er. here we need to talk about [] followed by a *, which indicates 0 ~ Infinitely Many characters are matched. For this [], we can also add a "+" to represent 1 ~ It can be more accurate to match infinite characters. {1, 3} after [] indicates the matching of 1 to 3 identical characters. There is another "?" 1 or 0.

 

 

Select * From test_table

Where regexp_like (field_1, 'Lee [^ small] 2 ')

Here we can query the name of Li, but the second word is not "small.

 

 

Select * From test_table

Where regexp_like (field_1, '[0-9]')

Here is a string that contains 0-9 numbers.

 

 

Select * From test_table

Where regexp_like (field_1, '[A-Z]')

Here is the query string containing lowercase letters of A-Z.

 

 

Select * From test_table

Where regexp_like (field_1, '[A-Z]')

Here is the query string that contains all letters of A-Z.

 

 

Select * From test_table

Where regexp_like (name, '[[: Alpha:]')

It indicates that the query matches any letter, including Chinese characters.

 

 

Select * From test_table

Where regexp_like (name, '[[: alnum:]')

The query matches any letter or number.

 

 

Select * From test_table

Where regexp_like (name, '[[: digit:]')

The query matches any number.

 

 

Select * From test_table

Where regexp_like (name, 'of', 'I ')

Here is of case insensitive

 

 

Select * From test_table

Where regexp_like (name, '^ [0-9] {1, 3 }. [0-9] {1, 3 }. [0-9] {1, 3 }. [0-9] {1, 3} $ ')

In this way, we can query whether the IP Format is correct.

 

 

 

Next, we will introduce regexp_substr.

This is also a very practical function.

 

The regexp_substr function is the same as the substr function. The truncated substring is returned.

Regexp_substr (srcstr, pattern [, position [, occurrence [, match_option])

Note:

Srcstr source string

Pattern regular expression style

Position starts to match the character position

Occurrence match times

Match_option matching option (case sensitive)

 

 

Select regexp_substr ('1psn/231_3253/abc', '[[: alnum:] +') from dual;

Output: 1psn

[[: Alnum:] + indicates that one or more letters or numbers are matched.

 

 

Select regexp_substr ('1psn/231_3253/abc', '[[: alnum:] +', 1, 2) from dual;

Output: 231

Compared with the preceding example, two more parameters are provided.

1 indicates searching for matching from the first character of the source string

2 indicates the string that is matched for 2nd times (the default value is "1", as shown in the preceding example)

 

 

Select regexp_substr ('@/231_3253/abc',' @ * [[: alnum:] + ') from dual;

Output: 231

@ * Indicates matching 0 or multiple @

[[: Alnum:] + indicates that one or more letters or numbers are matched.

Note: The difference between "+" and "*" must be different.

 

 

Select regexp_substr ('1 @/231_3253/abc', '@ + [[: alnum:] *') from dual;

Output :@

@ + Indicates matching one or more @

[[: Alnum:] * Indicates matching 0 or multiple letters or numbers

 

 

Select regexp_substr ('1 @/231_3253/abc', '@ + [[: alnum:] +') from dual;

Output: NULL

@ + Indicates matching one or more @

[[: Alnum:] + indicates that one or more letters or numbers are matched.

 

 

Select regexp_substr ('@ 1psn/231_3253/abc125', '[: digit:] + $') from dual;

Output: 125

[[: Digit:] + $ indicates matching the characters ending with one or more digits

 

 

Select regexp_substr ('1 @/231_3253/abc', '@ + [[: alnum:] +') from dual;

Output: NULL

@ + Indicates matching one or more @

[[: Alnum:] + indicates that one or more letters or numbers are matched.

 

 

Select regexp_substr ('@ 1psn/231_3253/abc125', '[: digit:] + $') from dual;

Output: 125

[[: Digit:] + $ indicates matching the characters ending with one or more digits

 

 

Select regexp_substr ('@ 1psn/231_3253/abc',' [^ [: digit:] + $ ') from dual;

Output:/ABC

[^ [: Digit:] + $ indicates matching one or more characters not ending with a number.

 

 

Select regexp_substr ('Tom _ Kyte@oracle.com ',' [^ @] + ') from dual;

Output: tom_kyte

[^ @] + Indicates matching one or more characters that are not "@"

 

 

Select regexp_substr ('1psn/231_3253/abc', '[[: alnum:] *', 1, 2)

From dual;

Output: NULL

[[: Alnum:] * Indicates matching 0 or multiple letters or numbers

Note: Because there are 0 or more matches, the 2nd match here is "/" (0 matches) instead of "231", so the result is "null"

 

 

Here we sometimes query the asdfafd <main> dafda in the string. here we need to retrieve the <main> string.

Select regexp_substr ('asdfafd <main> dafda ',' <[^>] +> ') from dual

Output: <main>

Here we go to the <> center with a ^> so that after the match <, we can make sure that there is no longer before the match in the backward query, otherwise there will be errors.

 

 

Select regexp_substr ('asdfafd <main> da> FDA ',' <[^ <] +> ') from dual

Output: <main> da>

In this example, we are still in Da After <main>. In this case, if we do not add ^>, the regular expression will continue to match backward, until the last one is> to, there will be deviations

 

 

This is usually used to implement the column passing of strings

Select regexp_substr ('2017; 123; 234; 345; 456; 567; 678; 789 ',' [^;] + ', 1, rownum) from dual

Connect by rownum <= length ('192; 123; 234; 345; 456; 567; 678 ')-length (replace ('192; 789; 123; 234; 345; 456; 678; 789 ','; ') + 1

Here, the length operation is to get the number of ";" first, and then use the connect by rownum method to perform a multi-row operation. After the operation is changed to multiple rows, you can use regexp_substr to retrieve strings.

 

 

Next, let's look at the previous example.

A, B, C, D, E, D, F, A, n is a string. We need to remove some duplicates in the string. In this case, the result is a, B, c, D, E, F, N removes the two strings D and.

Select wm_concat (new_row) from (

Select distinct regexp_substr ('a, B, C, D, E, D, F, A, n', '[^,] +', 1, rownum) new_row from dual

Connect by rownum <= length ('a, B, C, D, E, D, F, A, n')-length (replace ('a, B, c, d, E, D, F, A, n ',',')))

By converting to multiple rows, and then removing duplicates using distinct, We can merge the strings through wm_concat.

 

 

Let's look at another example of IP format conversion. The general IP Format is 12.19.168.27. The first three digits must be supplemented with 0, and the result is 012.019.168.027.

Select wm_concat (new_value) from (

Select

Lpad (regexp_substr ('12. 19.168.27 ',' [^.] + ', 1, rownum), 3, '0') new_value, rownum

From dual

Connect by rownum <5

Order by rownum)

 

 

To verify that the IP address is a number

Select count (*) from (

Select

Lpad (regexp_substr ('12. 19.168.27 ',' [^.] + ', 1, rownum), 3, '0') new_value, rownum

From dual

Connect by rownum <5)

Where new_value> = 0 and new_value <256

Having count (*) = 4

 

 

To convert an IP string format to a digital IP address.

Select sum (new_value * power (256, 4-rm) from (

Select regexp_substr ('12. 19.168.27 ',' [^.] + ', 1, rownum) new_value, rownum RM from dual

Connect by rownum <= 4

)

 

 

Next we will introduce a regexp_instr function.

 

The regexp_instr function uses a regular expression to return the start and end points of the search mode. The regexp_instr syntax is as follows. Regexp_instr returns an integer indicating the start or end position of the search mode. If no matching value is found, 0 is returned.

 

Syntax:

2. The regexp_instr function is the same as the instr function and returns the string position.

Regexp_instr (srcstr, pattern [, position [, occurrence [, return_option [, match_option])

Like regexp_substr, it also has the pattern, position (start position), occurrence, and match_parameter variables. Here we mainly introduce the role of the new parameter return_option, which allows users to tell Oracle, what content will be returned when the mode appears.

 

Select regexp_instr ('asdfafd <main> da> FDA ', 'sd') from dual

Output: 2

Query the SD location here, which is the same as instr

 

 

Select regexp_instr ('asdfafd <main> da> FDA ', 'da', 1, 2) from dual

This is where the second occurrence of DA is queried.

 

In addition, we often encounter a situation where to query a field, if it is equal to "Shanghai" or "Beijing" or our Wenzhou city, it is written as a big city, and other cities are written as small cities, we generally consider using decode.

 

Select decode ('shanghai', 'shanghai', 'Big City ', 'beijing', 'Big City', 'wenzhou ', 'Big City', 'small City ') from dual

There are only two sqls that we may think are not very lengthy. If there are four or five sqls, it will be a little long. Here we can use regexp_instr to perform a lot of operations.

 

Select decode (regexp_instr ('beijing', '^ (Shanghai | Beijing | Wenzhou)'), 0, 'small city', 'Big City') from dual

You can use the condition 0 when regexp_instr does not match.

 

 

 

Last function regexp_replace

The regexp_replace function uses another value to replace a value in the string. For example, you can use a matching number to replace each occurrence of a letter. The format of regexp_replace is as follows:

 

Syntax:

4. The regexp_replace and replace functions are the same. Replace the character content in the original string.

Regexp_replace (srcstr, pattern [, replacestr [, position [, occurrence [, match_option])

 

 

This replacement function is very useful.

For example, if we have a string adfadfa (main) Next, we need to replace () with <>. Here we may want to replace it, but what we do now is (must be available later). We will replace <>.

Select regexp_replace ('adfadfa (main) next', '(\ () ([^ \)] *) (\)', '<\ 2>') from dual

Output: adfadfa <main> next

Here is an escape character.

 

Let's look at another example of IP format conversion. The general IP Format is 12.19.168.27. The first three digits must be supplemented with 0, and the result is 012.019.168.027.

Select regexp_replace (

Regexp_replace ('12. 19.168.27 ',' ([0-9] {1, 3 }). ([0-9] {1, 3 }). ([0-9] {1, 3 }). ([0-9] {1, 3 })',

'00 \ 1.00 \ 2.00 \ 3.00 \ 4 '),

'([0-9] *) ([0-9] {3 }\.) ([0-9] *) ([0-9] {3 }\.) ([0-9] *) ([0-9] {3 }\.) ([0-9] *) ([0-9] {3} $) ',' \ 2 \ 4 \ 6 \ 8 ')

From dual

Output: 012.019.168.027

Here I will divide the operation into two steps: regexp_replace ('12. 19.168.27 ',' ([0-9] {1, 3 }). ([0-9] {1, 3 }). ([0-9] {1, 3 }). ([0-9] {1, 3 })',

'00 \ 1.00 \ 2.00 \ 3.00 \ 4') First, add 0 to each small string so that each string will be greater than 3.

'([0-9] *) ([0-9] {3 }\.) ([0-9] *) ([0-9] {3 }\.) ([0-9] *) ([0-9] {3 }\.) ([0-9] *) ([0-9] {3} $) ',' \ 2 \ 4 \ 6 \ 8 ')

This entire string is divided into eight segments, so that we only need the four segments 2, 4, 6, and 8.

 

 

In the following example, a space character is inserted between every two characters.

Select regexp_replace ('yahoo ',' (.) ',' \ 1') as output from dual;

Output: y a H O

This is a good way to operate in a loop.

 

 

Select regexp_replace (

Regexp_replace ('12. 19.168.27 ',' ([^.] + )'

, '00 \ 1 ')

, '([^.] *) ([^.] {3})', '\ 2 ')

From dual

Then, we can replace the replace loop.

 

2010-7-5

Reposted from: Wenzhou -- ranking

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.