Regular Expression Engine
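A pattern combines special characters, which describe how to match, with ordinary characters, which match themselves.
POSIX defines two flavors: BRE (basic regular expressions) and ERE (extended regular expressions).
gawk supports ERE; sed does not (it uses BRE by default, although GNU sed accepts ERE with the -E option).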
1) Regular expression patterns:
- Patterns are case sensitive.
- Spaces are treated as ordinary characters.
- Most tools and languages write a pattern between slashes: /pattern/
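The two rules above can be checked with a short sketch. The sample lines and variable names are made up for illustration, and sed is assumed to be available.

```shell
# Patterns are case sensitive: /this/ does not match "This".
case_hits=$(printf '%s\n' 'This is a test' 'this is a test' | sed -n '/this/p')

# A space in the pattern is an ordinary character, so it must appear in the text.
space_hits=$(printf '%s\n' 'ace of spades' 'aceofspades' | sed -n '/ace of/p')

printf '%s\n' "$case_hits" "$space_hits"
```

Only the lowercase line and the line containing the space are printed.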
Special characters:
. * [ ] ^ $ { } \ + ? | ( )
BRE
"\": the escape character. Put it in front of a special character to match that character literally.
The forward slash must be escaped as well. It is not a special character, but / delimits the pattern.
"^": anchor, matches at the start of the line.
For example: sed -n '/^abc/p'
^ must be placed at the beginning of the regular expression; otherwise it is an ordinary character.
"$": anchor, matches at the end of the line.
For example: sed -n '/abc$/p'
E.g.: match an entire line with '/^this$/p'; delete empty lines with '/^$/d'.
".": matches any single character except a newline. A character must be present at that position.
"[]": character class
A character class matches one character out of a set, such as sed -n '/[Yy]es/p'.
A negated character class is written [^...], such as [^ch]. It matches any character that is not in the set, but a character must still be present.
A range can be defined, such as [0-9] or [a-z]; [a-cx-z] defines multiple ranges in one class.
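The following sketch tries out escaping, the two anchors, the dot, and character classes with sed. The sample text and variable names are hypothetical. The comments describe what each pattern should select under the rules described above.

```shell
# Hypothetical sample text; the third line is empty.
text=$(printf '%s\n' 'abc first' 'not abc' '' 'a/b' 'Yes' 'bat' 'cat')

# \ escapes the delimiter so that / is matched literally.
slash=$(printf '%s\n' "$text" | sed -n '/a\/b/p')

# ^ anchors the match to the start of the line, $ to the end.
starts=$(printf '%s\n' "$text" | sed -n '/^abc/p')
ends=$(printf '%s\n' "$text" | sed -n '/abc$/p')

# ^$ matches an empty line; d deletes it. Count the lines that remain.
kept=$(printf '%s\n' "$text" | sed '/^$/d' | wc -l | tr -d ' ')

# . requires one character in front of "at" (a space counts).
dot=$(printf '%s\n' 'at' 'cat' 'this at' | sed -n '/.at/p')

# [Yy] accepts either case of the first letter.
yes_hits=$(printf '%s\n' 'Yes' 'yes' 'YES' | sed -n '/[Yy]es/p')

# [^ch] rejects c and h but still consumes one character.
not_ch=$(printf '%s\n' 'cat' 'hat' 'bat' 'at' | sed -n '/[^ch]at/p')

# [a-cx-z] combines two ranges in one class.
ranges=$(printf '%s\n' 'a1' 'm1' 'y1' | sed -n '/^[a-cx-z]1$/p')

printf '%s\n' "$slash" "$starts" "$ends" "$kept" "$dot" "$yes_hits" "$not_ch" "$ranges"
```

Note that "at" on its own is rejected by both /.at/ and /[^ch]at/, because each pattern needs a character in front of "at".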
BRE special character classes:
[[:alpha:]]   any letter, upper or lower case
[[:alnum:]]   any letter or digit
[[:blank:]]   space or tab
[[:digit:]]   a digit
[[:lower:]]   any lowercase letter
[[:print:]]   any printable character
[[:punct:]]   a punctuation character
[[:space:]]   any whitespace character, including vertical tab: space, tab, NL, CR, and so on
[[:upper:]]   any uppercase letter
[[:cntrl:]]   an ASCII control character
[[:graph:]]   any character that is neither a control character nor a space
[[:xdigit:]]  a hexadecimal digit
"*": the preceding character may appear zero or more times, as in /colou*r/.
".*": matches any run of characters, so it can find a word that appears anywhere on a line of text.
* can also follow a character class.
E.g.:
$ echo "bt" | sed -n '/b[ac]*t/p'
# a and c may appear between b and t in any mix and any number of times, including not at all.
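A minimal sketch of the named character classes and the asterisk, again using sed. The input words and variable names are invented for illustration.

```shell
# Named classes go inside a bracket expression: [[:digit:]], not [:digit:].
digits=$(printf '%s\n' 'abc' 'a1c' '42' | sed -n '/[[:digit:]]/p')
blank=$(printf '%s\n' 'one two' 'onetwo' | sed -n '/[[:blank:]]/p')
upper=$(printf '%s\n' 'Hello' 'hello' | sed -n '/^[[:upper:]]/p')

# u* allows zero or more u characters; "colr" fails because the o is missing.
star=$(printf '%s\n' 'color' 'colour' 'colouur' 'colr' | sed -n '/colou*r/p')

# [ac]* allows any mix of a and c, including none at all.
class_star=$(printf '%s\n' 'bt' 'bat' 'bact' 'bxt' | sed -n '/b[ac]*t/p')

# .* bridges any text between two words on the same line.
dot_star=$(printf '%s\n' 'a regular pattern expression' 'expression only' |
  sed -n '/regular.*expression/p')

printf '%s\n' "$digits" "$blank" "$upper" "$star" "$class_star" "$dot_star"
```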
ERE
"?": like *, except that the preceding character appears zero times or exactly once. It can also follow a character class.
"+": like *, except that the preceding character must appear one or more times.
"{}": interval, limits how many times the preceding character repeats.
Forms: {m} or {m,n}; in BRE the braces are escaped, \{min,max\}; {m,} means at least m times.
be{1}t   # equivalent to bet
gawk must be given the --re-interval option (recent gawk releases enable intervals by default).
E.g.: echo "bt" | gawk --re-interval '/be{1}t/{print $0}'   # prints nothing, because e must appear exactly once
With a character class: gawk --re-interval '/b[ae]{1,2}t/'
"|": logical OR, specifies alternative regular expressions: gawk '/cat|dog/'
"()": grouping; the group is treated as a single standard character.
E.g.:
gawk '/Sat(urday)?/'   # "urday" appears 0 or 1 times
gawk '/(c|b)a(b|t)/'   # equivalent to '/[cb]a[bt]/', since every alternative is a single character
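The ERE operators can be compared side by side in a short sketch. As an assumption made for portability, it uses grep -E as the ERE matcher instead of gawk; the same patterns can be placed between slashes in a gawk program. The word lists, the pairs helper, and the variable names are hypothetical.

```shell
# ? allows zero or one e; + requires at least one.
opt=$(printf '%s\n' bt bet beet | grep -E '^be?t$')
plus=$(printf '%s\n' bt bet beet | grep -E '^be+t$')

# Intervals: exactly one e, one or two of [ae], at least two e.
exact=$(printf '%s\n' bt bet beet | grep -E '^be{1}t$')
range=$(printf '%s\n' bt bat baet baaet | grep -E '^b[ae]{1,2}t$')
atleast=$(printf '%s\n' bet beet beeet | grep -E '^be{2,}t$')

# The same interval in BRE needs escaped braces.
bre=$(printf '%s\n' bet beet | sed -n '/^be\{2\}t$/p')

# | selects either alternative.
alt=$(printf '%s\n' 'The cat is asleep' 'The dog is asleep' 'The sheep is asleep' |
  grep -E 'cat|dog')

# (urday)? makes the whole group optional.
group=$(printf '%s\n' Sat Saturday Sun | grep -E '^Sat(urday)?$')

# Single-character alternatives behave like character classes.
pairs() { printf '%s\n' cab cat bab bat tab tac; }
alt_out=$(pairs | grep -E '^(c|b)a(b|t)$')
class_out=$(pairs | grep -E '^[cb]a[bt]$')

printf '%s\n' "$opt" "$plus" "$exact" "$range" "$atleast" "$bre" "$alt" "$group" "$alt_out"
```

The last two results are identical, which supports reading (c|b)a(b|t) as [cb]a[bt] when every alternative is one character long.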