Java Regular Expressions: The Pattern and Matcher Classes Explained

Source: Internet
Author: User

java.util.regex is a library package for matching strings against patterns specified as regular expressions. Its two central classes are Pattern and Matcher. A Pattern is the compiled representation of a regular expression. A Matcher is a state machine that performs match operations on an input string according to a Pattern. First, a regular expression, written in a syntax similar to Perl's, is compiled into a Pattern instance; then a Matcher instance created from that Pattern matches the input string under the pattern's control.

Let's take a look at these two classes:

First, the concept of capturing groups

Capturing groups are numbered by counting their opening parentheses from left to right, starting at 1. For example, in the expression ((A)(B(C))), there are four such groups:

1        ((A)(B(C)))
2        (A)
3        (B(C))
4        (C)
Group 0 always stands for the entire expression. A group that begins with (? is either a pure, non-capturing group, which captures no text and does not count toward the group total, or (since Java 7) a named capturing group of the form (?<name>X).

The captured input associated with a group is always the subsequence that the group most recently matched. If a group is evaluated a second time because of quantification, its previously captured value, if any, is retained when the second evaluation fails. For example, matching the string "aba" against the expression (a(b)?)+ leaves group 2 set to "b". All captured input is discarded at the beginning of each match.
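
The numbering rules above can be checked with a small sketch. The class name CapturingGroupDemo and the helper groups() are made up for illustration; only Pattern, Matcher, groupCount() and group(int) come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CapturingGroupDemo {
    // Returns groups 0..groupCount() if the whole input matches,
    // or an empty list if it does not. (Illustrative helper.)
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (m.matches()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Four groups, numbered by their opening parentheses.
        System.out.println(groups("((A)(B(C)))", "ABC")); // [ABC, ABC, A, BC, C]
        // Group 2 keeps "b" even though the last repetition did not match it.
        System.out.println(groups("(a(b)?)+", "aba"));    // [aba, a, b]
        // A non-capturing group (?:...) is not counted.
        System.out.println(groups("(?:a)(b)", "ab"));     // [ab, b]
    }
}
```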

Second, the Pattern and Matcher classes in detail

Java regular expressions are implemented by the Pattern and Matcher classes in the java.util.regex package. (It helps to keep the Java API documentation open while reading this article; checking each method's description there as you go makes things clearer.)
The Pattern class represents a compiled regular expression, in other words a matching pattern. Its constructor is private, so it cannot be instantiated directly; instead, create one through the static factory method Pattern.compile(String regex).
Java code example:
Pattern p = Pattern.compile("\\w+");
p.pattern(); // returns \w+

pattern() returns the regular expression in string form, that is, the regex argument that was passed to Pattern.compile(String regex).
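
As a minimal sketch of the factory method in use: compile() throws PatternSyntaxException when the expression is malformed, which makes a simple validity check possible. The class name CompileDemo and the helper isValidRegex() are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CompileDemo {
    // Illustrative helper: true if the expression compiles.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // In Java source the backslash is doubled; the regex itself is \w+
        Pattern p = Pattern.compile("\\w+");
        System.out.println(p.pattern());          // \w+
        System.out.println(isValidRegex("\\w+")); // true
        System.out.println(isValidRegex("("));    // false: unclosed group
    }
}
```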

1. Pattern.split(CharSequence input)

Pattern has a split(CharSequence input) method that splits the input around matches of the pattern and returns a String[]. According to the Java API documentation, String.split(String regex) yields the same result as Pattern.compile(regex).split(input).
Java code example:
Pattern p = Pattern.compile("\\d+");
String[] str = p.split("My QQ is: 456456 My phone is: 0532214 my mailbox is: [email protected]");

Result: str[0] = "My QQ is: ", str[1] = " My phone is: ", str[2] = " my mailbox is: [email protected]"
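
The following sketch shows split() on a made-up input and compares it with String.split. The class name SplitDemo, the helper splitOnDigits() and the sample strings are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Illustrative helper: splits the input around runs of digits.
    static String[] splitOnDigits(String input) {
        return DIGITS.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDigits("a1b22c333d"))); // [a, b, c, d]
        // Trailing empty strings are dropped from the result.
        System.out.println(Arrays.toString(splitOnDigits("a1b2")));       // [a, b]
        // String.split gives the same result for the same regex.
        System.out.println(Arrays.equals(splitOnDigits("a1b22c333d"),
                                         "a1b22c333d".split("\\d+")));    // true
    }
}
```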

2. Pattern.matches(String regex, CharSequence input) is a static method for quick matching. It suits cases where the pattern is used only once and the entire input must match.

Java code example:
Pattern.matches("\\d+", "2223");   // returns true
Pattern.matches("\\d+", "2223AA"); // returns false: the entire input must match, and AA does not
Pattern.matches("\\d+", "22bb23"); // returns false: the entire input must match, and bb does not

3. Pattern.matcher(CharSequence input)

Having covered all that, we finally arrive at the Matcher class. Pattern.matcher(CharSequence input) returns a Matcher object.
The Matcher constructor is not public either, so a Matcher cannot be created directly; the only way to obtain an instance is through Pattern.matcher(CharSequence input).
On its own, the Pattern class supports only a few simple matching operations. For more powerful and convenient matching, use Pattern and Matcher together: the Matcher class adds support for capturing groups and for finding multiple matches in one input.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.pattern(); // returns p, the Pattern object that created this Matcher
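
One practical consequence, sketched here under illustrative names (MatcherReuseDemo, matchesAfterReset): a single compiled Pattern can create any number of matchers, and an existing Matcher can be pointed at new input with reset(CharSequence) instead of being recreated.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherReuseDemo {
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Creates a matcher for 'first', then points the same matcher at
    // 'second' and reports whether 'second' matches in full.
    static boolean matchesAfterReset(String first, String second) {
        Matcher m = DIGITS.matcher(first);
        m.reset(second);
        return m.matches();
    }

    public static void main(String[] args) {
        Matcher m = DIGITS.matcher("22bb23");
        System.out.println(m.pattern() == DIGITS);               // true: same Pattern object
        System.out.println(matchesAfterReset("22bb23", "2223")); // true
    }
}
```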

4. Matcher.matches() / Matcher.lookingAt() / Matcher.find()

The Matcher class provides three matching methods. Each returns a boolean: true if the match succeeds, false otherwise.
matches() attempts to match the entire input and returns true only if the whole string matches the pattern.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.matches(); // returns false: bb cannot be matched by \d+, so the whole-string match fails
Matcher m2 = p.matcher("2223");
m2.matches(); // returns true: \d+ matches the entire string

Looking back at Pattern.matches(String regex, CharSequence input), it is equivalent to the following code:
Pattern.compile(regex).matcher(input).matches()
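
A small sketch of that equivalence follows. The class and method names (MatchesEquivalenceDemo, oneShot, precompiled) are invented for the example. Since the static method compiles the expression on every call, keeping a compiled Pattern in a constant is a reasonable choice when the same expression is used repeatedly.

```java
import java.util.regex.Pattern;

class MatchesEquivalenceDemo {
    // Compiled once and reused by precompiled().
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Compiles the expression on each call.
    static boolean oneShot(String input) {
        return Pattern.matches("\\d+", input);
    }

    // Reuses the compiled pattern; same result as oneShot().
    static boolean precompiled(String input) {
        return DIGITS.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2223", "2223AA", "22bb23", ""}) {
            System.out.println("\"" + s + "\": " + oneShot(s) + " / " + precompiled(s));
        }
    }
}
```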

lookingAt() matches against the beginning of the input; it returns true only if a prefix of the input matches the pattern.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.lookingAt(); // returns true: \d+ matches the leading 22
Matcher m2 = p.matcher("aa2223");
m2.lookingAt(); // returns false: \d+ cannot match the leading aa

find() scans the input for a substring that matches the pattern; the match can be anywhere in the input.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.find(); // returns true
Matcher m2 = p.matcher("aa2223");
m2.find(); // returns true
Matcher m3 = p.matcher("aa2223bb");
m3.find(); // returns true
Matcher m4 = p.matcher("aabb");
m4.find(); // returns false
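
To see the three methods side by side, the sketch below runs each one on a fresh matcher for the same input. The class name MatchModesDemo and the helper probe() are illustrative.

```java
import java.util.regex.Pattern;

class MatchModesDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Runs each method on its own fresh matcher so that no state is
    // shared between the three calls.
    static String probe(String input) {
        return "matches=" + DIGITS.matcher(input).matches()
             + " lookingAt=" + DIGITS.matcher(input).lookingAt()
             + " find=" + DIGITS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(probe("2223"));   // matches=true lookingAt=true find=true
        System.out.println(probe("22bb23")); // matches=false lookingAt=true find=true
        System.out.println(probe("aa2223")); // matches=false lookingAt=false find=true
        System.out.println(probe("aabb"));   // matches=false lookingAt=false find=false
    }
}
```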

5. Matcher.start() / Matcher.end() / Matcher.group()

After a successful match with matches(), lookingAt(), or find(), the following three methods give more detail about it.
start() returns the index in the input of the first character of the matched substring.
end() returns the index just past the last character of the matched substring.
group() returns the matched substring.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("aaa2223bb");
m.find();  // matches 2223
m.start(); // returns 3
m.end();   // returns 7, the index just past 2223
m.group(); // returns 2223

Matcher m2 = p.matcher("2223bb");
m2.lookingAt(); // matches 2223
m2.start();     // returns 0: lookingAt() matches only at the beginning of the input, so start() is always 0 after it succeeds
m2.end();       // returns 4
m2.group();     // returns 2223

Matcher m3 = p.matcher("2223");
m3.matches(); // returns true: \d+ matches the entire string
m3.start();   // returns 0
m3.end();     // returns 4, because matches() must match the entire string
m3.group();   // returns 2223
// For the input "2223bb", matches() returns false with \d+, and start(), end() and group() then throw IllegalStateException
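
The relationship between the three methods can be stated compactly: group() is the substring of the input from start() to end(). The sketch below checks this; MatchBoundsDemo and firstNumberBounds() are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchBoundsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns {start, end} of the first run of digits, or null if the
    // input contains none. (Illustrative helper.)
    static int[] firstNumberBounds(String input) {
        Matcher m = DIGITS.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] {m.start(), m.end()};
    }

    public static void main(String[] args) {
        String input = "aaa2223bb";
        int[] b = firstNumberBounds(input);
        System.out.println(b[0] + ".." + b[1]);            // 3..7
        // end() is exclusive, so it can be passed straight to substring().
        System.out.println(input.substring(b[0], b[1]));   // 2223
    }
}
```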

Now that the methods above are clear, let's turn to how regular-expression groups are used in Java.
start(), end(), and group() each have an overload, namely start(int i), end(int i), and group(int i), that operates on a specific group, and the Matcher class has groupCount() to return the number of capturing groups in the pattern.
Java code example:
Pattern p = Pattern.compile("([a-z]+)(\\d+)");
Matcher m = p.matcher("aaa2223bb");
m.find();       // matches aaa2223
m.groupCount(); // returns 2, because there are 2 groups
m.start(1);     // returns 0, the index of the first character matched by group 1
m.start(2);     // returns 3
m.end(1);       // returns 3, the index just past the last character matched by group 1
m.end(2);       // returns 7
m.group(1);     // returns aaa, the substring matched by group 1
m.group(2);     // returns 2223, the substring matched by group 2
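
As a hedged illustration of groups combined with repeated find() calls, the sketch below reads name/number pairs out of a string using the same expression. The class name GroupPairsDemo, the helper parse() and the sample input are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupPairsDemo {
    private static final Pattern PAIR = Pattern.compile("([a-z]+)(\\d+)");

    // For every match, group 1 (letters) becomes the key and group 2
    // (digits) the value. Insertion order is preserved.
    static Map<String, Integer> parse(String input) {
        Map<String, Integer> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            result.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("x10 y20 z3")); // {x=10, y=20, z=3}
        System.out.println(parse("aaa2223bb"));  // {aaa=2223}
    }
}
```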

Now let's try a slightly more advanced operation. Suppose a piece of text contains several numbers separated by other characters, and we want to extract all of them. With Java's regular expressions this is easy.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("My QQ is: 456456 My phone is: 0532214 my mailbox is: [email protected]");
while (m.find()) {
    System.out.println(m.group());
}

Output (the last match comes from digits in the mailbox address, which appears obscured above):
456456
0532214
123

If you replace the while loop above with the following:
while (m.find()) {
    System.out.println(m.group());
    System.out.print("Start:" + m.start());
    System.out.println(" End:" + m.end());
}
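The output then takes this form (the offsets are those given in the source; they depend on the exact input string):
456456
Start:6 End:12
0532214
Start:19 End:26
123
Start:36 End:39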
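
Because the offsets depend on the exact input, here is a self-contained sketch with a short made-up string whose offsets are easy to count by hand. The class name NumberOffsetsDemo, the helper numbersWithOffsets() and the sample input are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberOffsetsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Collects every run of digits as "text@start-end".
    static List<String> numbersWithOffsets(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            result.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return result;
    }

    public static void main(String[] args) {
        // "QQ: " occupies indices 0..3, so the first number starts at 4.
        System.out.println(numbersWithOffsets("QQ: 456456, phone: 0532214"));
        // [456456@4-10, 0532214@19-26]
    }
}
```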

As this shows, each successful match operation changes the values returned by start(), end(), and group() so that they describe the newly matched substring, and the overloads start(int i), end(int i), and group(int i) change in the same way.
Note: start(), end(), and group() can be used only after a successful match operation, that is, only after matches(), lookingAt(), or find() has returned true; otherwise they throw java.lang.IllegalStateException.
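
A minimal sketch of that rule, with illustrative names (MatchStateDemo and its helpers): calling group() before any match attempt, or after a failed one, throws IllegalStateException, so the usual approach is to guard the call with the boolean result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchStateDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Calls group() on a fresh matcher, before any match attempt.
    static boolean throwsBeforeMatch(String input) {
        Matcher m = DIGITS.matcher(input);
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Calls group() after find() has returned false.
    static boolean throwsAfterFailedFind(String input) {
        Matcher m = DIGITS.matcher(input);
        if (m.find()) {
            return false; // the match succeeded, so group() would be legal
        }
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Guarded access: returns the first number, or null if there is none.
    static String firstNumberOrNull(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(throwsBeforeMatch("2223"));      // true
        System.out.println(throwsAfterFailedFind("aabb"));  // true
        System.out.println(firstNumberOrNull("aaa2223bb")); // 2223
        System.out.println(firstNumberOrNull("aabb"));      // null
    }
}
```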
