Java regular expression, Regular Expression

Source: Internet
Author: User

I. Overview

A regular expression is a string of characters that describes a character sequence. It can be used to find matches in other character sequences. Java's regular expression support is built on two classes in the java.util.regex package: Pattern and Matcher. Pattern defines a regular expression, and Matcher matches that pattern against other sequences.

II. Create a regular expression

Creating a regular expression means writing a specially formatted string.

Compiling a regular expression: a regular expression, specified as a string, must first be compiled into an instance of the Pattern class. The resulting pattern is then used to create a Matcher object. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.

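    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Test {
        public static void main(String[] args) {
            // Compile the regular expression once into a Pattern.
            Pattern pat = Pattern.compile("a*b");
            // Create a Matcher for one specific input sequence.
            Matcher mat = pat.matcher("aaaaab");
            // matches() succeeds only if the entire input matches the pattern.
            if (mat.matches()) {
                System.out.print("match");
            }
        }
    }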

The code above is equivalent to:

    if ("aaaaab".matches("a*b")) {
        System.out.print("match");
    }

A Pattern object can be reused many times. If a regular expression is needed only once, you can call the static matches method of the Pattern class directly. The String.matches call shown above gives the same result as Pattern.matches(regex, input).
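The two approaches can be sketched side by side. This is a minimal illustration, not code from the original article: the class name PatternReuseDemo, the helpers countMatches and matchesOnce, and the sample inputs are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class PatternReuseDemo {
    // Compiled once and shared; a Pattern is immutable and safe to reuse.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Counts the inputs that match the whole pattern, reusing the compiled Pattern.
    static int countMatches(List<String> inputs) {
        int count = 0;
        for (String input : inputs) {
            // Each call to matcher() creates a fresh Matcher holding its own state.
            if (A_STAR_B.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    // One-shot check: the regex is compiled again on every call.
    static boolean matchesOnce(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(countMatches(Arrays.asList("aaaaab", "b", "abc")));
        System.out.println(matchesOnce("a*b", "aaaaab"));
    }
}
```

Holding the compiled Pattern in a constant avoids recompiling the expression for every input, which is the usual reason to prefer it over the static call inside a loop.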

Regular expression constructs (inside a Java string literal, write each backslash twice, for example "\\d"):

Construct        Matches

Characters
x                The character x
\\               The backslash character
\0n              The character with octal value 0n (0 <= n <= 7)
\0nn             The character with octal value 0nn (0 <= n <= 7)
\0mnn            The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh             The character with hexadecimal value 0xhh
\uhhhh           The character with hexadecimal value 0xhhhh
\t               The tab character ('\u0009')
\n               The newline (line feed) character ('\u000A')
\r               The carriage-return character ('\u000D')
\f               The form-feed character ('\u000C')
\a               The alert (bell) character ('\u0007')
\e               The escape character ('\u001B')
\cx              The control character corresponding to x

Character classes
[abc]            a, b, or c (simple class)
[^abc]           Any character except a, b, or c (negation)
[a-zA-Z]         a through z or A through Z, inclusive (range)
[a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]     d, e, or f (intersection)
[a-z&&[^bc]]     a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)

Predefined character classes
.                Any character (may or may not match line terminators)
\d               A digit: [0-9]
\D               A non-digit: [^0-9]
\s               A whitespace character: [ \t\n\x0B\f\r]
\S               A non-whitespace character: [^\s]
\w               A word character: [a-zA-Z_0-9]
\W               A non-word character: [^\w]

Boundary matchers
^                The beginning of a line
$                The end of a line
\b               A word boundary
\B               A non-word boundary
\A               The beginning of the input
\G               The end of the previous match
\Z               The end of the input but for the final terminator, if any
\z               The end of the input

Greedy quantifiers
X?               X, once or not at all
X*               X, zero or more times
X+               X, one or more times
X{n}             X, exactly n times
X{n,}            X, at least n times
X{n,m}           X, at least n but not more than m times

Reluctant quantifiers
X??              X, once or not at all
X*?              X, zero or more times
X+?              X, one or more times
X{n}?            X, exactly n times
X{n,}?           X, at least n times
X{n,m}?          X, at least n but not more than m times

Possessive quantifiers
X?+              X, once or not at all
X*+              X, zero or more times
X++              X, one or more times
X{n}+            X, exactly n times
X{n,}+           X, at least n times
X{n,m}+          X, at least n but not more than m times

Logical operators
XY               X followed by Y
X|Y              Either X or Y
(X)              X, as a capturing group
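A few of the constructs listed above can be checked with a short sketch. The class name CharClassDemo, the helpers fullMatch and firstGroup, and the sample strings are hypothetical and chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharClassDemo {
    // Returns true when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Returns the text captured by group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p.
        System.out.println(fullMatch("[a-d[m-p]]", "n"));    // true
        // Intersection: only d, e, or f.
        System.out.println(fullMatch("[a-z&&[def]]", "e"));  // true
        // Subtraction: a through z, except b and c.
        System.out.println(fullMatch("[a-z&&[^bc]]", "b"));  // false
        // Subtraction: a through z, except m through p.
        System.out.println(fullMatch("[a-z&&[^m-p]]", "q")); // true
        // Capturing group: (X) remembers what X matched.
        System.out.println(firstGroup("(\\d+)-\\d+", "call 555-1234")); // 555
    }
}
```

Note that every backslash in the regular expression is doubled in the Java source, because the string literal consumes one level of escaping before the regex engine sees the text.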

Regular expressions support three kinds of quantifiers:

Greedy: the default. A greedy quantifier matches as many characters as it can, and gives characters back only when the rest of the pattern would otherwise fail.

Reluctant: marked with a question mark suffix (?). A reluctant quantifier matches as few characters as possible.

Possessive: marked with a plus sign suffix (+). A possessive quantifier matches as many characters as it can and never gives any back, even if the overall match then fails. It is rarely used.
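The difference among the three modes, and the possessive one in particular, can be seen by running the same kind of pattern in each mode over one input. This is a sketch under assumptions: the class name QuantifierModesDemo, the helper firstMatch, and the input string are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierModesDemo {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes everything, then backs off until "foo" fits at the end.
        System.out.println(firstMatch(".*foo", input));  // xfooxxxxxxfoo
        // Reluctant: .*? takes as little as possible before the first "foo".
        System.out.println(firstMatch(".*?foo", input)); // xfoo
        // Possessive: .*+ takes everything and never backs off, so "foo" cannot match.
        System.out.println(firstMatch(".*+foo", input)); // null
    }
}
```

Because a possessive quantifier refuses to backtrack, it can make a pattern fail outright, as the last call shows; in exchange it avoids the cost of retrying shorter matches.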

Example:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegExp {
        public static void main(String[] args) {
            Pattern pat1 = Pattern.compile("\\w.*ab");   // greedy
            Pattern pat2 = Pattern.compile("\\w.*?ab");  // reluctant
            Matcher mat1 = pat1.matcher("bbbbab aaab jjjjj is");
            Matcher mat2 = pat2.matcher("bbbbab aaab jjjjj is");

            System.out.println("-----greedy mode-------");
            while (mat1.find()) {
                // Greedy mode matches as much as possible: prints "bbbbab aaab".
                System.out.println(mat1.group());
            }

            System.out.println("-----reluctant mode-------");
            while (mat2.find()) {
                // Reluctant mode matches as little as possible:
                // prints "bbbbab", then "aaab" on the next line.
                System.out.println(mat2.group());
            }
        }
    }

Output:

    -----greedy mode-------
    bbbbab aaab
    -----reluctant mode-------
    bbbbab
    aaab

III. Use regular expressions

The Pattern class defines no public constructor. A Pattern is created by calling the static compile() method.

The Matcher class has no public constructor either. Instead, a Matcher is created by calling the matcher() method defined by Pattern. Once a Matcher has been created, its methods can be used to perform the various pattern-matching operations:

  • The matches method attempts to match the entire input sequence against the pattern.

  • The lookingAt method attempts to match the input sequence against the pattern, starting at the beginning; unlike matches, it does not require the whole input to match.

  • The find method scans the input sequence looking for the next subsequence that matches the pattern.

Each of these methods returns a boolean indicating success or failure. More information about a successful match can be obtained by querying the state of the matcher, for example with group(), start(), and end().
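The three methods can be compared on the same inputs. The following is a minimal sketch; the class name MatchMethodsDemo, its helper methods, and the sample inputs are assumptions introduced for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchMethodsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // matches(): the entire input must match the pattern.
    static boolean wholeInput(String s) {
        return DIGITS.matcher(s).matches();
    }

    // lookingAt(): the match must start at the beginning, but may stop early.
    static boolean prefix(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // find(): scans for the next match anywhere, then queries the matcher state.
    static String describeFirst(String s) {
        Matcher m = DIGITS.matcher(s);
        if (!m.find()) {
            return "no match";
        }
        return m.group() + "@" + m.start() + "-" + m.end();
    }

    public static void main(String[] args) {
        System.out.println(wholeInput("123abc"));    // false
        System.out.println(prefix("123abc"));        // true
        System.out.println(prefix("abc123"));        // false
        System.out.println(describeFirst("abc123")); // 123@3-6
    }
}
```

The start index is inclusive and the end index is exclusive, so "123@3-6" means the match occupies positions 3, 4, and 5 of the input.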
