Regexp_replace function ____ function

Source: Internet
Author: User
Tags character classes control characters posix printable characters

Regexp_replace function

Let's first look at the traditional replace SQL function, which replaces a string with another string. Suppose your data has unnecessary spaces in the body, and you want to replace them with a single space. With the Replace function, you need to accurately list how many spaces you want to replace. However, the number of extra spaces may not be the same throughout the body. The following example has three spaces between Joe and Smith. The parameters of the Replace function specify that you want to replace two spaces with a single space. In this case, the result leaves an extra space between Joe and Smith in the original string.

SELECT REPLACE (' Joe Smith ', ', ', ')
As replace
From dual
REPLACE
---------
Joe Smith

The Regexp_replace function advances the substitution function a step further, and its syntax is listed in Table 9. The following query replaces any two or more spaces with a single space. () A subexpression contains a single space that can be repeated two or more times as instructed by {2.}

Table 9:regexp_replace function

Grammar

Description

Regexp_replace (source_string, pattern
[, replace_string [, Position
[, occurrence, [match_parameter]]]

The function replaces the matching pattern with a specified replace_string, allowing for complex "search and replace" operations.


SELECT regexp_replace (' Joe Smith ',
' () {2,} ', ')
As Rx_replace
From dual
Rx_replace
----------
Joe Smith

Back reference

A useful feature of regular expressions is the ability to store the subexpression for later reuse, which is also referred to as a back reference (outlined in table 10). It allows for complex substitution functions, such as swapping patterns in new locations or displaying repeated occurrences of words or letters. The matching part of the subexpression is saved in the temporary buffer. The buffer is numbered from left to right and accessed using the \digit symbol, where digit is a number between 1 and 9 that matches the digit subexpression, which is displayed with a set of parentheses.

The following example shows the conversion of the name Ellen Hildi Smith to Smith, Ellen Hildi by reference to each subexpression by number.

SELECT Regexp_replace (
' Ellen hildi Smith ',
' (. *) (. *) (. *) ', ' \3, \1 \2 ')
From dual
Regexp_replace (' EL
------------------
Smith, Ellen Hildi

The SQL statement shows the three separate subexpression enclosed in parentheses. Each individual subexpression contains a matching meta character (.) and follows the * metacharacters, indicating that any character (other than a newline character) must match 0 or more times. A space separates each subexpression, and a space must also match. Parentheses create a subexpression that gets the value, and can be referenced with \digit. The first subexpression is assigned to \1, the second \2, and so on. These back references are used in the last parameter of the function (\3, \1 \2), which effectively returns the replacement substring and arranges them in the desired format (including commas and spaces). Table 11 details the various components of the regular expression.

A back reference is useful for replacing, formatting, and substituting values, and you can use them to find adjacent values. The following example shows the use of the REGEP_SUBSTR function to find any repeated alphanumeric values that are separated by spaces. The displayed results give a substring that identifies the repeated occurrence of the word is.

SELECT Regexp_substr (
' The final test is the implementation ',
([[: alnum:]]+) ([[: space:]]+) \1 ') as substr
From dual
SUBSTR
------
IS is

Matching parameter options

You may have noticed that the regular expression operators and functions contain an optional matching parameter. This parameter controls whether case sensitive, line break matching, and retention of multiple lines of input.

The practical application of regular expressions

Not only can you use regular expressions in queues, you can also use regular expressions anywhere you use SQL operators or functions, such as in Pl/sql languages. You can write triggers that take advantage of regular expression functionality to validate, generate, or extract values.

The next example shows how you can apply the Regexp_like operator to data validation in a column check constraint. It verifies the correct social Security number format when inserting or updating. Social Security numbers in formats such as 123-45-6789 and 123456789 for this column constraint is an acceptable value. Valid data must start with three digits, followed by a hyphen, plus two digits and a hyphen, and finally four digits. The other expression allows only 9 consecutive digits. The vertical bar symbol (|) separates the options.

ALTER TABLE Students

ADD CONSTRAINT Stud_ssn_ck CHECK

(Regexp_like (SSN,

' ^ ([[:d igit:]]{3}-[[:d igit:]]{2}-[[:d igit:]]{4}| [[:d Igit:]] {9}) $'))

The characters at the beginning or end of the ^ and $ indicate are not acceptable. Make sure that your regular expression is not divided into multiple lines or contains any unnecessary spaces unless you want the format to be so and match accordingly. Table 12 illustrates the various components of the regular expression sample.

Compare regular expressions to existing functionality

Regular expressions have several advantages over common like operators and INSTR, SUBSTR, and REPLACE functions. These traditional SQL functions are not convenient for pattern matching. Only the like operator matches by using the% and _ characters but like does not support the repetition of expressions, complex substitutions, character ranges, word characters and POSIX character classes, and so on. In addition, the new regular expression function allows you to detect repeated occurrences of word and pattern swapping. The examples here give you an overview of the regular expression areas and how you can use them in your applications.

To actually enrich your toolkit

Because regular expressions help solve complex problems, they are very powerful. Some of the features of regular expressions are difficult to emulate with traditional SQL functions. When you understand the underlying building blocks of this slightly cryptic language, regular expressions will become an integral part of your toolkit (not only in SQL but also in other programming locales). In order for your patterns to be correct, while attempts and errors are sometimes necessary, the simplicity and strength of regular expressions are unquestionable.

Alice Rischert (ar280@yahoo.com) is the chairman of the database application development and design direction of the computer Technology and application Department of Columbia University. She wrote the Oracle SQL Interactive manual version 2nd (Prentice hall,2002) and the upcoming Oracle SQL sample (Prentice hall,2003). Rischert has over 15 years of experience as a database designer, DBA, and project director in Fortune 100 companies, and she has been using Oracle products since Oracle version 5.

Table 1: Locating meta characters

Metacharacters

Description

^

To position an expression at the beginning of a line

$

Position an expression to the end of a line

Table 2: quantifiers or repeat Operators

Quantifiers

Description

*

Match 0 or more times

?

Match 0 times or 1 times

+

Match 1 or more times

{m}

Exactly match m times

{m,}

Match at least m times

{m, n}

Match at least m times but not more than n times

Table 3: Predefined POSIX character classes

Character class

Description

[: Alpha:]

Alphabetic characters

[: Lower:]

Lowercase alphabetic characters

[: Upper:]

Uppercase characters

[:d Igit:]

Digital

[: Alnum:]

Alpha-Numeric characters

[: Space:]

White-space characters (prohibit printing), such as carriage returns, line breaks, vertical tabs, and page breaks

[:p UNCT:]

Punctuation characters

[: Cntrl:]

control characters (prohibit printing)

[:p rint:]

printable characters

Table 4: Substitution matching and grouping of expressions

Metacharacters

Description

|

Replace

Separate substitution options, usually used with the group operator ()

( )

Group

Group the subexpression into a substitution unit, quantifier unit, or back reference unit (see the "Back Reference" section)

[Char]

Character List

Represents a character characters; most of the metacharacters in a list of characters (except character classes, ^ and-metacharacters) are understood as text

Table 5:regexp_like operator

Grammar

Description

Regexp_like (source_string, pattern
[, Match_parameter])

Source_string supports character data types (char, VARCHAR2, CLOB, NCHAR, NVARCHAR2, and NCLOB, but does not include LONG). The pattern parameter is another name for the regular expression. Match_parameter allows optional parameters such as handling line breaks, retaining multiline formatting, and providing control over case sensitivity.

Table 6:regexp_instr function

Grammar

Description

Regexp_instr (source_string, pattern
[, Start_position
[, Occurrence
[, Return_option
[, Match_parameter]]]

The function looks for pattern and returns the first position of the schema. You can optionally specify the start_position you want to start your search. The occurrence parameter defaults to 1 unless you specify that you want to find a pattern that appears next. The default value of Return_option is 0, which returns the starting position of the pattern, and a value of 1 returns the starting position of the next character that matches the criteria.

Table 7: Description of 5 digit plus 4-bit zip code expression

Grammar

Description

Blanks that must match

[:d Igit:]

POSIX Digital Classes

]

End of character list

{5}

Word characters just repeat 5 times

(

The beginning of a child expression

-

A literal hyphen, because it is not a character list within the range metacharacters

[

The beginning of a character list

[:d Igit:]

POSIX [:d igit:] Class

[

The beginning of a character list

]

End of character list

{4}

Word characters just repeat 4 times

)

End parenthesis, ending subexpression

?

? Quantifiers match the grouped subexpression 0 or 1 times to make the 4-bit code optional

$

Locates Metacharacters, indicating end of line

Table 8:regexp_substr function

Grammar

Description

Regexp_substr (source_string, pattern
[, position [, occurrence
[, Match_parameter]]]

regexp_substr function returns a substring of the matching pattern.

Table 9:regexp_replace function

Grammar

Description

Regexp_replace (source_string, pattern
[, replace_string [, Position
[, occurrence, [match_parameter]]]

The function replaces the matching pattern with a specified replace_string, allowing for complex "search and replace" operations.

Table 10: Back reference meta characters

Metacharacters

Description

\digit

Back slash

followed by a number between 1 and 9, the backslash matches the first digit subexpression enclosed in parentheses.
(Note: A backslash has another meaning in a regular expression that, depending on the context, may also represent an Escape character.)

Table 11: Description of pattern Exchange regular expressions

Regular expression Items

Description

(

The beginning of the first subexpression

.

Match any single character except for line breaks

*

Repeat the operator that matches the previous. Metacharacters 0 to N times

)

The end of the first subexpression; match results in \1
In this example, the result is Ellen. )

Whitespace that must exist

(

The beginning of the second subexpression

.

Match any single character except the line feed

*

Repeat the operator that matches the previous. Metacharacters 0 to N times

)

The end of the second subexpression; match results in \2
In this example, the result is hildi. )

Blank

(

The beginning of the third subexpression tr>

.

Match any single character except for line breaks

*

Repeat the operator that matches the previous. Metacharacters 0 to N times

)

The end of the third subexpression; match results in \3
(In this case, the result is Smith.) )

Table 12: Social Security numbers description of regular expressions

Regular expression Items

Description

^

The beginning character (the regular expression cannot have any leading characters until it matches.) )

(

Start the subexpression and list the | Meta-character-separated replaceable options

[

The beginning of a character list

[:d Igit:]

POSIX Digital Classes

]

End of character list

{3}

Word characters just repeat 3 times

-

Hyphens

[

The beginning of a character list

[:d Igit:]

POSIX Digital Classes

]

End of character list

{2}

Word characters just repeat 2 times

-

Another hyphen

[

The beginning of a character list

[:d Igit:]

POSIX Digital Classes

]

End of character list

{4}

Word characters just repeat 4 times

|

Replaces the metacharacters; ends the first option and begins the next alternate expression

[

The beginning of a character list

[:d Igit:]

POSIX Digital Classes

]

End of character list

{9}

Word characters just repeat 9 times

)

Ends the parentheses, ending the child expression group for substitution

$

Locates metacharacters, indicating line endings; no extra characters to fit the pattern

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.