In the previous article, "Programming Ideas: Regular Expressions," I covered how regular expressions work, how to use them, and a summary of common patterns. This article goes a step further and looks at the three quantifier strategies in Java regular expressions: greedy, reluctant, and possessive.
From the official Java documentation at http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html we can see that regular expressions offer three families of quantifiers: greedy, reluctant, and possessive. Their meanings are as follows:
Greedy quantifiers | Meaning
X? | X, once or not at all
X* | X, zero or more times
X+ | X, one or more times
X{n} | X, exactly n times
X{n,} | X, at least n times
X{n,m} | X, at least n but not more than m times

Reluctant quantifiers | Meaning
X?? | X, once or not at all
X*? | X, zero or more times
X+? | X, one or more times
X{n}? | X, exactly n times
X{n,}? | X, at least n times
X{n,m}? | X, at least n but not more than m times

Possessive quantifiers | Meaning
X?+ | X, once or not at all
X*+ | X, zero or more times
X++ | X, one or more times
X{n}+ | X, exactly n times
X{n,}+ | X, at least n times
X{n,m}+ | X, at least n but not more than m times
Differences between greedy, reluctant, and possessive: worked examples
Looking at the tables above, the three kinds of quantifiers appear to mean the same thing (for example, X?, X??, and X?+ all mean "X, once or not at all"), but they differ subtly in how they go about matching. Let's first look at an example:
1. Greedy
// Requires java.util.regex.Pattern and java.util.regex.Matcher.
public static void testGreedy() {
    Pattern p = Pattern.compile(".*foo");
    String strText = "xfooxxxxxxfoo";
    Matcher m = p.matcher(strText);
    while (m.find()) {
        // start() is inclusive, end() is exclusive.
        System.out.println("matched from " + m.start() + " to " + m.end());
    }
}
Results:
matched from 0 to 13
2. Reluctant
public static void testReluctant() {
    Pattern p = Pattern.compile(".*?foo");
    String strText = "xfooxxxxxxfoo";
    Matcher m = p.matcher(strText);
    while (m.find()) {
        // start() is inclusive, end() is exclusive.
        System.out.println("matched from " + m.start() + " to " + m.end());
    }
}
Results:
matched from 0 to 4
matched from 4 to 13
3. Possessive
public static void testPossessive() {
    Pattern p = Pattern.compile(".*+foo");
    String strText = "xfooxxxxxxfoo";
    Matcher m = p.matcher(strText);
    while (m.find()) {
        // Never reached for this input: find() returns false.
        System.out.println("matched from " + m.start() + " to " + m.end());
    }
}
Results:
no match (nothing is printed)
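To see how much of the input the quantified part itself consumed in each of the three cases, one option is to wrap it in a capturing group. The following is only a sketch: the class name QuantifierCapture and the helper captures are made up for illustration and do not come from the original article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class QuantifierCapture {

    // For every match, record what group 1 captured plus the match span.
    static List<String> captures(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add("[" + m.group(1) + "] " + m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "xfooxxxxxxfoo";
        System.out.println(captures("(.*)foo", text));   // greedy:     [[xfooxxxxxx] 0-13]
        System.out.println(captures("(.*?)foo", text));  // reluctant:  [[x] 0-4, [xxxxxx] 4-13]
        System.out.println(captures("(.*+)foo", text));  // possessive: []
    }
}
```

The greedy group holds everything up to the last foo, the reluctant group holds as little as possible before each foo, and the possessive version produces no match at all.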
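How it works
Greedy quantifiers are called "greedy" because they force the matcher to read in the entire input string on the first attempt. If that first attempt fails, the matcher backs off one character at a time and tries again, until it finds a match or there are no characters left to give back.
Pattern string: .*foo
Input string: xfooxxxxxxfoo
Result: matched from 0 to 13
The comparison proceeds as follows: .* first consumes all 13 characters, which leaves nothing for foo. The matcher then gives back one character at a time, leaving "o", then "oo", then "foo" for the rest of the pattern. At that point foo matches the last three characters, so the whole string is matched.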
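Reluctant quantifiers take the opposite approach: they start at the beginning of the input string and reluctantly read one character at a time while looking for a match. The last thing they try is the entire input string.
Pattern string: .*?foo
Input string: xfooxxxxxxfoo
Results: matched from 0 to 4
matched from 4 to 13
The comparison proceeds as follows: .*? first consumes nothing, and foo fails against "xfo". It then consumes "x", and foo matches at offset 1, giving the first match from 0 to 4. The second search starts at offset 4, where .*? grows one character at a time until it holds "xxxxxx" and foo matches at offset 10, giving the second match from 4 to 13.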
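Possessive quantifiers always read in the entire input string and try once (and only once) for a match. Unlike greedy quantifiers, possessive quantifiers never back off, even if doing so would allow the overall match to succeed.
Pattern string: .*+foo
Input string: xfooxxxxxxfoo
Result: no match
The comparison proceeds as follows: .*+ consumes all 13 characters and refuses to give any of them back, so nothing is left for foo and the attempt fails. The same thing happens at every later starting position, so no match is found.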
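The three strategies described above can be modeled by hand for the special case of .* followed by a literal. The sketch below is only an illustration under stated assumptions: the class QuantifierTrace and its methods are hypothetical, the input is assumed to contain no line terminators (which . does not match by default), and the real java.util.regex engine is far more general than this. The engineEnd helper asks the real engine for a match that must begin at a given offset, so the two can be compared.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of ".*" + literal tried at one starting offset.
class QuantifierTrace {

    // Greedy: ".*" takes everything, then gives back one character at a time.
    static int greedyEnd(String text, int from, String literal) {
        for (int k = text.length(); k >= from; k--) {
            if (text.startsWith(literal, k)) return k + literal.length();
        }
        return -1;
    }

    // Reluctant: ".*?" takes nothing, then takes one more character at a time.
    static int reluctantEnd(String text, int from, String literal) {
        for (int k = from; k <= text.length(); k++) {
            if (text.startsWith(literal, k)) return k + literal.length();
        }
        return -1;
    }

    // Possessive: ".*+" takes everything and never gives anything back.
    static int possessiveEnd(String text, int from, String literal) {
        int k = text.length();
        return text.startsWith(literal, k) ? k + literal.length() : -1;
    }

    // What the real engine reports when the match must begin at 'from'.
    static int engineEnd(String regex, String text, int from) {
        Matcher m = Pattern.compile(regex).matcher(text);
        m.region(from, text.length());
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        String text = "xfooxxxxxxfoo";
        System.out.println(greedyEnd(text, 0, "foo") + " " + engineEnd(".*foo", text, 0));      // 13 13
        System.out.println(reluctantEnd(text, 0, "foo") + " " + engineEnd(".*?foo", text, 0));  // 4 4
        System.out.println(reluctantEnd(text, 4, "foo") + " " + engineEnd(".*?foo", text, 4));  // 13 13
        System.out.println(possessiveEnd(text, 0, "foo") + " " + engineEnd(".*+foo", text, 0)); // -1 -1
    }
}
```

The only difference between the three methods is the order in which split points are tried: from the end backwards, from the start forwards, or only at the very end.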
Reference article: http://docs.oracle.com/javase/tutorial/essential/regex/quant.html
Here are a few more examples:
Pattern string: .+[0-9]
Input string: abcd5aabb6
Result: matched from 0 to 10
Pattern string: .+?[0-9]
Input string: abcd5aabb6
Results: matched from 0 to 5
matched from 5 to 10
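Pattern string: .{1,9}+[0-9]
Input string: abcd5aabb6
Result: matched from 0 to 10 (the possessive quantifier takes at most 9 characters, which leaves the final 6 for [0-9])
Pattern string: .{1,10}+[0-9]
Input string: abcd5aabb6
Result: no match (the possessive quantifier takes all 10 characters and never gives one back, so nothing is left for [0-9])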
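These four results can be checked with a small program. This is a minimal sketch: the class QuantifierExamples and the helper spans are hypothetical names chosen for illustration. The helper collects every match that find() reports as a "start-end" string, where the end offset is exclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class QuantifierExamples {

    // Collect every match as "start-end" (end is exclusive).
    static List<String> spans(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "abcd5aabb6";
        System.out.println(spans(".+[0-9]", text));        // [0-10]
        System.out.println(spans(".+?[0-9]", text));       // [0-5, 5-10]
        System.out.println(spans(".{1,9}+[0-9]", text));   // [0-10]
        System.out.println(spans(".{1,10}+[0-9]", text));  // []
    }
}
```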
If you have any questions or ideas, please leave them in the comments; your feedback is the best review. My skills are limited, so if this post contains errors or shortcomings, please bear with me and share your advice.
======================== Welcome to the Programming Ideas series ========================
Programming Ideas: Regular Expressions
Programming Ideas: Iterators
Programming Ideas: Recursion
Programming Ideas: Callbacks
The difference between greedy, reluctant, and possessive in Java regular expressions