Java regular expressions: Greedy, Reluctant, and Possessive quantifiers
Quantifiers
Greedy | Reluctant | Possessive | Match
X? | X?? | X?+ | X, once or not at all
X* | X*? | X*+ | X, zero or more times
X+ | X+? | X++ | X, one or more times
X{n} | X{n}? | X{n}+ | X, exactly n times
X{n,} | X{n,}? | X{n,}+ | X, at least n times
X{n,m} | X{n,m}? | X{n,m}+ | X, at least n but not more than m times
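As a quick check of the table, the sketch below applies the three forms of the bounded quantifier X{n,m} to a run of digits. The class name QuantifierForms and the helper firstMatch are hypothetical names chosen for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierForms {
    // Hypothetical helper: returns the first match of regex in input, or null if there is none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "12345";
        System.out.println(firstMatch("\\d{2,4}", input));   // greedy: prints 1234
        System.out.println(firstMatch("\\d{2,4}?", input));  // reluctant: prints 12
        System.out.println(firstMatch("\\d{2,4}+", input));  // possessive: prints 1234
    }
}
```

With nothing following the quantifier, the greedy and possessive forms give the same result; they only differ once the rest of the pattern forces the engine to backtrack.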
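Greedy quantifiers match as much as possible, reluctant quantifiers match as little as possible, and possessive quantifiers match as much as possible and never give anything back.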
Test 1:
package jichu;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MainClass {
    public static void main(String[] args) {
        Matcher m1 = Pattern.compile("\\w+").matcher("ababa");   // greedy
        Matcher m2 = Pattern.compile("\\w+?").matcher("ababa");  // reluctant
        Matcher m3 = Pattern.compile("\\w++").matcher("ababa");  // possessive
        System.out.println(piPei(m1));
        System.out.println(piPei(m2));
        System.out.println(piPei(m3));
    }

    // Collects every match found by m, with its start and end positions.
    public static String piPei(Matcher m) {
        StringBuffer s = new StringBuffer();
        int i = 0;
        while (m.find()) {
            s.append("{matched substring " + (++i) + ": " + m.group() + "; ");
            s.append("start position: " + m.start() + "; ");
            s.append("end position: " + m.end() + ";}");
        }
        if (s.length() == 0) {
            s.append("no match found!");
        }
        // Prefix the result with the pattern that produced it.
        s.insert(0, "(pattern " + m.pattern().pattern() + "): ");
        return s.toString();
    }
}

Output:
(pattern \w+): {matched substring 1: ababa; start position: 0; end position: 5;}
(pattern \w+?): {matched substring 1: a; start position: 0; end position: 1;}{matched substring 2: b; start position: 1; end position: 2;}{matched substring 3: a; start position: 2; end position: 3;}{matched substring 4: b; start position: 3; end position: 4;}{matched substring 5: a; start position: 4; end position: 5;}
(pattern \w++): {matched substring 1: ababa; start position: 0; end position: 5;}
Test 1 shows that:
1. The greedy quantifier matches all the characters in one go.
2. The reluctant quantifier matches one character at a time, from left to right.
3. The possessive quantifier, like the greedy one, matches all the characters in one go.
Test 2 (only the main method of Test 1 changes):
public static void main(String[] args) {
    Matcher m1 = Pattern.compile("\\w+b").matcher("ababa");   // greedy
    Matcher m2 = Pattern.compile("\\w+?b").matcher("ababa");  // reluctant
    Matcher m3 = Pattern.compile("\\w++b").matcher("ababa");  // possessive
    System.out.println(piPei(m1));
    System.out.println(piPei(m2));
    System.out.println(piPei(m3));
}

Output:
(pattern \w+b): {matched substring 1: abab; start position: 0; end position: 4;}
(pattern \w+?b): {matched substring 1: ab; start position: 0; end position: 2;}{matched substring 2: ab; start position: 2; end position: 4;}
(pattern \w++b): no match found!
Tests 1 and 2 show that:
1. The greedy '\w+' first consumes all five characters. The trailing 'b' then has nothing left to match, so the engine backtracks, giving back one character at a time until 'b' can match. Here '\w+' ends up holding 'aba' and 'b' matches the character at index 3.
2. The reluctant '\w+?' grows from left to right one character at a time and stops as soon as the trailing 'b' can match, so two substrings are matched.
3. The possessive '\w++' also consumes all five characters, so the trailing 'b' cannot match. Unlike the greedy quantifier, it does not backtrack, so the match fails.
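One way to observe the backtracking described above is to wrap the quantified part in a capturing group and inspect what it holds once the match is complete. The following is a minimal sketch; the class name BacktrackDemo and the helper describe are hypothetical and not part of the original tests.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BacktrackDemo {
    // Hypothetical helper: reports what group 1 captured in the first match, or "no match".
    public static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return "group 1 = " + m.group(1) + ", whole match = " + m.group();
    }

    public static void main(String[] args) {
        String input = "ababa";
        // Greedy: \w+ takes "ababa", then gives back "a" and "b" so the literal b can match.
        System.out.println(describe("(\\w+)b", input));   // group 1 = aba, whole match = abab
        // Reluctant: \w+? takes one character and stops as soon as b matches.
        System.out.println(describe("(\\w+?)b", input));  // group 1 = a, whole match = ab
        // Possessive: \w++ takes "ababa" and never gives anything back.
        System.out.println(describe("(\\w++)b", input));  // no match
    }
}
```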
Summary
1. A greedy quantifier, as the name suggests, matches as many characters as possible and backtracks when the rest of the pattern fails to match.
2. A reluctant quantifier takes only what it needs: it matches as few characters as possible.
3. A possessive quantifier matches as many characters as possible, like a greedy one, but it never backtracks.
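The three rules can be seen side by side on a different input. The sketch below is an illustration only: the class name QuantifierSummary, the helper allMatches, and the input string are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierSummary {
    // Hypothetical helper: lists every match as text@start-end.
    public static List<String> allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes the whole input, then backtracks three characters so foo can match.
        System.out.println(allMatches(".*foo", input));   // [xfooxxxxxxfoo@0-13]
        // Reluctant: .*? takes as little as possible before each foo.
        System.out.println(allMatches(".*?foo", input));  // [xfoo@0-4, xxxxxxfoo@4-13]
        // Possessive: .*+ takes the whole input and keeps it, so foo can never match.
        System.out.println(allMatches(".*+foo", input));  // []
    }
}
```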