Java Regular Expressions
1. Overview
A regular expression is a string of characters that describes a character sequence. It can be used to find matches in other character sequences. Java's regular-expression support rests on two classes in the java.util.regex package: Pattern and Matcher. Pattern defines a regular expression, and Matcher matches that pattern against other character sequences.
2. Create a regular expression
Creating a regular expression means writing a special string.
Compiling a regular expression: a regular expression, specified as a string, must first be compiled into an instance of the Pattern class. The resulting pattern is then used to create a Matcher object. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // Compile the regular expression once, then match an input against it.
        Pattern pat = Pattern.compile("a*b");
        Matcher mat = pat.matcher("aaaaab");
        if (mat.matches()) {
            System.out.print("match");
        }
    }
}

The code above is equivalent to:

if ("aaaaab".matches("a*b")) {
    System.out.print("match");
}
A Pattern object can be reused many times. If a regular expression is needed only once, you can call the static Pattern.matches(regex, input) method directly, or String.matches as shown above. Both compile the expression again on every call.
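The difference between a reused Pattern and a one-off call can be sketched as follows. The class name PatternReuseDemo, the helper countFullMatches, and the sample inputs are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PatternReuseDemo {
    // Compiled once and reused for every input below.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Counts how many inputs match the precompiled pattern in full.
    static long countFullMatches(List<String> inputs) {
        return inputs.stream()
                .filter(s -> A_STAR_B.matcher(s).matches())
                .count();
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("aaaaab", "b", "aaa", "abab");
        System.out.println(countFullMatches(inputs));          // 2
        // One-off check: the pattern is compiled inside the call.
        System.out.println(Pattern.matches("a*b", "aaaaab"));  // true
    }
}
```

When the same expression is applied to many inputs, keeping the compiled Pattern in a constant avoids recompiling it for each input.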
Regular-expression constructs:

Construct | Matches

Characters
x | The character x
\\ | The backslash character
\0n | The character with octal value 0n (0 <= n <= 7)
\0nn | The character with octal value 0nn (0 <= n <= 7)
\0mnn | The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh | The character with hexadecimal value 0xhh
\uhhhh | The character with hexadecimal value 0xhhhh
\t | The tab character ('\u0009')
\n | The newline (line feed) character ('\u000A')
\r | The carriage-return character ('\u000D')
\f | The form-feed character ('\u000C')
\a | The alert (bell) character ('\u0007')
\e | The escape character ('\u001B')
\cx | The control character corresponding to x

Character classes
[abc] | a, b, or c (simple class)
[^abc] | Any character except a, b, or c (negation)
[a-zA-Z] | a through z or A through Z, inclusive (range)
[a-d[m-p]] | a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]] | d, e, or f (intersection)
[a-z&&[^bc]] | a through z, except b and c: [ad-z] (subtraction)
[a-z&&[^m-p]] | a through z, but not m through p: [a-lq-z] (subtraction)

Predefined character classes
. | Any character (may or may not match line terminators)
\d | A digit: [0-9]
\D | A non-digit: [^0-9]
\s | A whitespace character: [ \t\n\x0B\f\r]
\S | A non-whitespace character: [^\s]
\w | A word character: [a-zA-Z_0-9]
\W | A non-word character: [^\w]

Boundary matchers
^ | The beginning of a line
$ | The end of a line
\b | A word boundary
\B | A non-word boundary
\A | The beginning of the input
\G | The end of the previous match
\Z | The end of the input, except for the final line terminator (if any)
\z | The end of the input

Greedy quantifiers
X? | X, once or not at all
X* | X, zero or more times
X+ | X, one or more times
X{n} | X, exactly n times
X{n,} | X, at least n times
X{n,m} | X, at least n but not more than m times

Reluctant quantifiers
X?? | X, once or not at all
X*? | X, zero or more times
X+? | X, one or more times
X{n}? | X, exactly n times
X{n,}? | X, at least n times
X{n,m}? | X, at least n but not more than m times

Possessive quantifiers
X?+ | X, once or not at all
X*+ | X, zero or more times
X++ | X, one or more times
X{n}+ | X, exactly n times
X{n,}+ | X, at least n times
X{n,m}+ | X, at least n but not more than m times

Logical operators
XY | X followed by Y
X|Y | Either X or Y
(X) | X, as a capturing group
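A few of the constructs listed above can be tried out with a small helper. This is only a sketch: the class ConstructDemo, the helper findAll, and the sample inputs are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConstructDemo {
    // Returns every substring of input matched by regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Subtraction: a through z, except b and c.
        System.out.println(findAll("[a-z&&[^bc]]", "abcd"));        // [a, d]
        // Union: a through d, or m through p.
        System.out.println(findAll("[a-d[m-p]]+", "camp zeal"));    // [camp, a]
        // Predefined class \d combined with the word boundary \b.
        System.out.println(findAll("\\b\\d+\\b", "a1 22 333b 4"));  // [22, 4]
    }
}
```

Note that each backslash in a construct must be doubled inside a Java string literal, so \d is written as "\\d".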
Regular expressions support three kinds of quantifiers:
Greedy: the default mode. A greedy quantifier matches as many characters as possible and gives characters back only when the rest of the pattern would otherwise fail.
Reluctant: written with a question-mark suffix (?). A reluctant quantifier matches as few characters as possible.
Possessive: written with a plus-sign suffix (+). A possessive quantifier matches as many characters as possible and never gives them back. It is rarely used.
Example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExp {
    public static void main(String[] args) {
        Pattern pat1 = Pattern.compile("\\w.*ab");   // greedy
        Pattern pat2 = Pattern.compile("\\w.*?ab");  // reluctant
        Matcher mat1 = pat1.matcher("bbbbab aaab jjjjj is");
        Matcher mat2 = pat2.matcher("bbbbab aaab jjjjjj is");
        System.out.println("----- greedy mode -------");
        while (mat1.find()) {
            // Greedy mode matches as much as possible: prints "bbbbab aaab".
            System.out.println(mat1.group());
        }
        System.out.println("----- reluctant mode -------");
        while (mat2.find()) {
            // Reluctant mode matches as little as possible: prints "bbbbab", then "aaab".
            System.out.println(mat2.group());
        }
    }
}

Output:

----- greedy mode -------
bbbbab aaab
----- reluctant mode -------
bbbbab
aaab
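The example above contrasts greedy and reluctant matching. Possessive matching can be sketched in the same spirit. The class PossessiveDemo and the helper fullMatch are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // True when the whole input matches the given regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "aaab";
        // Greedy: .* first takes "aaab", then gives back one character so "b" can match.
        System.out.println(fullMatch(".*b", input));   // true
        // Possessive: .*+ takes "aaab" and gives nothing back, so "b" cannot match.
        System.out.println(fullMatch(".*+b", input));  // false
        // Possessive still succeeds when no backtracking is needed.
        System.out.println(fullMatch("a*+b", input));  // true
    }
}
```

Because a possessive quantifier never backtracks, it can fail on input that the greedy form accepts, which is one reason it is used sparingly.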
3. Use regular expressions
The Pattern class defines no public constructor. A Pattern is created by calling the static compile() method.
The Matcher class has no public constructor either. Instead, a Matcher is created by calling the matcher() method defined by Pattern. Once a Matcher has been created, its methods can be used to perform the various pattern-matching operations:
matches
Attempts to match the entire input sequence against the pattern.
lookingAt
Attempts to match the input sequence against the pattern, starting at the beginning. Unlike matches, it does not require the whole input to match.
find
Scans the input sequence for the next subsequence that matches the pattern.
Each method returns a boolean indicating success or failure. After a successful match, more information can be obtained by querying the state of the matcher.
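The three methods, and the matcher state available after a successful match, can be compared side by side. This is a minimal sketch: the class MatcherMethodsDemo, the pattern WORD_NUM, the helper describeFirst, and the sample input are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    // Lowercase letters followed by digits, each part in its own capturing group.
    static final Pattern WORD_NUM = Pattern.compile("([a-z]+)(\\d+)");

    // Describes the first match found in input, or returns "no match".
    static String describeFirst(String input) {
        Matcher m = WORD_NUM.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        // After a successful match the matcher exposes the matched text,
        // its position in the input, and the capturing groups.
        return m.group() + " [" + m.start() + "," + m.end() + ") "
                + m.group(1) + "/" + m.group(2);
    }

    public static void main(String[] args) {
        String input = "abc123 xyz";
        System.out.println(WORD_NUM.matcher(input).matches());           // false: " xyz" is left over
        System.out.println(WORD_NUM.matcher(input).lookingAt());         // true: the prefix "abc123" matches
        System.out.println(WORD_NUM.matcher("  " + input).lookingAt());  // false: input starts with spaces
        System.out.println(describeFirst("  " + input));                 // abc123 [2,8) abc/123
    }
}
```

Calling group(), start(), or end() before a successful match throws IllegalStateException, so the boolean result should be checked first.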