The Pattern and Matcher classes in Java regular expressions, explained in detail

Source: Internet
Author: User

Objective

This article describes the Pattern and Matcher classes used by Java regular expressions. The first thing to be clear about is that a regular expression written as a string must be compiled into an instance of the Pattern class before it can be used. Understanding how these two classes work together is something every Java programmer should know.

Let's take a look at these two classes separately:

First, the concept of capturing groups

Capturing groups are numbered by counting their opening parentheses from left to right, starting at 1. For example, in the expression ((A)(B(C))) there are four such groups:

1  ((A)(B(C)))
2  (A)
3  (B(C))
4  (C)

Group 0 always stands for the entire expression. A group that begins with (? is a pure, non-capturing group (unless it is a named capturing group): it does not capture text and does not count towards the group total.

The captured input associated with a group is always the subsequence that the group most recently matched. If a group is evaluated a second time because of a quantifier, its previously captured value (if any) is retained when the second evaluation fails. For example, matching the string "aba" against the expression (a(b)?)+ leaves group 2 set to "b". All captured input is discarded at the beginning of each match.
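The numbering rule and the retained-value rule can be checked with a few lines of code. The following is a minimal sketch, not part of the original article; the class name CaptureGroupDemo and the helper groupsOf are hypothetical, and the quantified expression is assumed to be (a(b)?)+ as in the java.util.regex.Pattern documentation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {

    // Returns groups 0..groupCount() of a full match,
    // or an empty array if the input does not match the regex.
    static String[] groupsOf(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Groups are numbered by the position of their opening parenthesis.
        String[] g = groupsOf("((A)(B(C)))", "ABC");
        for (int i = 0; i < g.length; i++) {
            System.out.println(i + "  " + g[i]);
        }

        // On the last repetition (b)? matches nothing, so group 2
        // keeps the "b" captured by the previous repetition.
        String[] r = groupsOf("(a(b)?)+", "aba");
        System.out.println("group 1 = " + r[1] + ", group 2 = " + r[2]);
    }
}
```

Group 1 ends up as "a" because it is overwritten by the last repetition, while group 2 keeps "b" from the earlier one.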

Second, the Pattern class and the Matcher class in detail

Java regular expressions are implemented by the Pattern class and the Matcher class in the java.util.regex package. (It is recommended to keep the Java API documentation open while reading this article and to look up each method as it is introduced.)

The Pattern class represents a compiled regular expression, in other words a matching pattern. Its constructor is private, so it cannot be instantiated directly; an instance is created through the simple factory method Pattern.compile(String regex).

Java code example:

Pattern p = Pattern.compile("\\w+");

p.pattern() returns the string form of the regular expression, which is simply the regex argument that was passed to Pattern.compile(String regex).

1. Pattern.split(CharSequence input)

Pattern has a split(CharSequence input) method that splits the input around matches of the pattern and returns a String[]. String.split(String regex) is documented to give the same result as Pattern.compile(regex).split(str).

Java code example:

Pattern p = Pattern.compile("\\d+");
String[] str = p.split("My QQ is:456456My phone is:0532214My mailbox is: aaa@aaa.com");

Results: str[0] = "My QQ is:", str[1] = "My phone is:", str[2] = "My mailbox is: aaa@aaa.com"
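To see that Pattern.split and String.split agree, the two can be compared directly. This is a small sketch under assumptions: the class name SplitDemo, the helper splitOnDigits and the input string are illustrative and do not come from the original article.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {

    // Splits the input around every run of digits.
    static String[] splitOnDigits(String input) {
        return Pattern.compile("\\d+").split(input);
    }

    public static void main(String[] args) {
        // Illustrative input: three labels separated by two numbers.
        String input = "My QQ is:456456My phone is:0532214My mailbox is: aaa@aaa.com";

        String[] viaPattern = splitOnDigits(input);
        String[] viaString = input.split("\\d+");

        System.out.println(Arrays.toString(viaPattern));
        // Both approaches are expected to produce the same array.
        System.out.println(Arrays.equals(viaPattern, viaString));
    }
}
```

When the same delimiter is used many times, compiling the Pattern once and calling split on it avoids recompiling the expression on every call.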

2. Pattern.matches(String regex, CharSequence input) is a static method for quick, one-off matching. It is suited to cases where the expression is used only once, and it must match the entire string.

Java code example:

Pattern.matches("\\d+", "2223"); // returns true
Pattern.matches("\\d+", "2223aa"); // returns false: the entire string must match, and "aa" cannot be matched by \d+

3. Pattern.matcher(CharSequence input)

After all that, the Matcher class finally makes its appearance: Pattern.matcher(CharSequence input) returns a Matcher object.
The Matcher class has no public constructor either, so it cannot be instantiated directly; an instance can only be obtained through Pattern.matcher(CharSequence input).
The Pattern class on its own can only perform simple matching operations. For more powerful and convenient matching, Pattern and Matcher have to work together. The Matcher class adds support for groups and for matching a regular expression multiple times against the same input.

Java code example:

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");

4. Matcher.matches() / Matcher.lookingAt() / Matcher.find()

The Matcher class provides three matching methods. All three return a boolean: true when a match is found, false otherwise.

matches() matches against the entire string and returns true only if the whole string matches.

Java code example:

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.matches(); // returns false, because "bb" cannot be matched by \d+, so the string as a whole does not match
Matcher m2 = p.matcher("2223");
m2.matches(); // returns true, because \d+ matches the entire string

Looking back at Pattern.matches(String regex, CharSequence input), it is equivalent to the following code:
Pattern.compile(regex).matcher(input).matches()
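The equivalence matters in practice because Pattern.matches compiles the expression on every call. A minimal sketch of the two styles follows; the class name WholeMatchDemo and the method names oneOff and reusable are hypothetical.

```java
import java.util.regex.Pattern;

class WholeMatchDemo {

    // Compiled once and reused. Pattern instances are immutable
    // and can safely be shared.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // One-off check: the regex is compiled on every call.
    static boolean oneOff(String input) {
        return Pattern.matches("\\d+", input);
    }

    // The same check using the precompiled pattern.
    static boolean reusable(String input) {
        return DIGITS.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2223", "2223aa", "22bb23"}) {
            System.out.println(s + " -> " + oneOff(s) + " / " + reusable(s));
        }
    }
}
```

Both methods give the same answer for every input; the precompiled form is usually the better choice when the same expression is applied repeatedly.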

lookingAt() matches at the beginning of the string and returns true only if the start of the string matches the pattern.

Java code example:

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.lookingAt(); // returns true, because \d+ matches the leading "22"
Matcher m2 = p.matcher("aa2223");
m2.lookingAt(); // returns false, because the string starts with "aa", which \d+ cannot match

find() looks for a match anywhere in the string; the matched substring can be at any position.

Java code example:

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.find(); // returns true
Matcher m2 = p.matcher("aa2223");
m2.find(); // returns true
Matcher m3 = p.matcher("aa2223bb");
m3.find(); // returns true
Matcher m4 = p.matcher("aabb");
m4.find(); // returns false, because the string contains no digits
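Putting the three methods side by side on the same inputs makes the difference easier to see. The following is a sketch only; the class name MatchModesDemo and the helper modes are hypothetical, and each check uses a fresh Matcher so that the calls do not affect one another.

```java
import java.util.regex.Pattern;

class MatchModesDemo {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the results of matches(), lookingAt() and find()
    // for one input, formatted as "matches/lookingAt/find".
    static String modes(String input) {
        boolean whole = DIGITS.matcher(input).matches();     // entire string
        boolean prefix = DIGITS.matcher(input).lookingAt();  // start of string
        boolean anywhere = DIGITS.matcher(input).find();     // any position
        return whole + "/" + prefix + "/" + anywhere;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2223", "22bb23", "aa2223", "aabb"}) {
            System.out.println(s + " -> " + modes(s));
        }
    }
}
```

For these four inputs the results are true/true/true, false/true/true, false/false/true and false/false/false respectively.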

5. Matcher.start() / Matcher.end() / Matcher.group()

After a successful match with matches(), lookingAt() or find(), you can use the three methods above to get more detailed information.

start() returns the index in the string of the first character of the matched substring.

end() returns the index just past the last character of the matched substring.

group() returns the matched substring.

Java code example:

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("aaa2223bb");
m.find(); // matches 2223
m.start(); // returns 3
m.end(); // returns 7, the index just past the last character of 2223
m.group(); // returns 2223

Matcher m2 = p.matcher("2223bb");
m2.lookingAt(); // matches 2223
m2.start(); // returns 0; lookingAt() only matches at the beginning of the string, so start() always returns 0 after a successful lookingAt()
m2.end(); // returns 4
m2.group(); // returns 2223

Matcher m3 = p.matcher("2223"); // with "2223bb", matches() would return false and the calls below would throw IllegalStateException
m3.matches(); // matches the entire string
m3.start(); // returns 0
m3.end(); // returns 4, because matches() has to match the entire string

With the methods above covered, let's look at how regular expression groups are used in Java.
start(), end() and group() each have an overloaded form, start(int i), end(int i) and group(int i), dedicated to group operations. The Matcher class also has groupCount(), which returns the number of capturing groups.

Java code example:

Pattern p = Pattern.compile("([a-z]+)(\\d+)");
Matcher m = p.matcher("aaa2223bb");
m.find(); // matches aaa2223
m.groupCount(); // returns 2, because there are 2 groups
m.start(1); // returns 0, the index in the string of the substring matched by the first group
m.start(2); // returns 3
m.end(1); // returns 3, the index just past the last character matched by the first group
m.end(2); // returns 7
m.group(1); // returns aaa, the substring matched by the first group
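The per-group methods are often used together in a loop over 1..groupCount(). The following sketch shows one way to do that; the class name GroupPositionsDemo and the helper describe are hypothetical, and the pattern is assumed to be ([a-z]+)(\\d+) with no space between the groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupPositionsDemo {

    // Describes each capturing group of the first match as "text[start,end)",
    // or returns "no match" when the pattern is not found.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= m.groupCount(); i++) {
            if (i > 1) {
                sb.append(' ');
            }
            sb.append(m.group(i))
              .append('[').append(m.start(i))
              .append(',').append(m.end(i)).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: aaa[0,3) 2223[3,7)
        System.out.println(describe("([a-z]+)(\\d+)", "aaa2223bb"));
    }
}
```

The half-open notation [start,end) reflects that end(i) is the index just past the last matched character, so input.substring(m.start(i), m.end(i)) equals m.group(i).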

Now let's try a slightly more advanced matching operation. Suppose a piece of text contains many numbers scattered through it, and we want to extract all of them. With Java's regular expression support this is simple.

Java code example:

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("My QQ is: 456456 My phone is: 0532214 My mailbox is: aaa123@aaa.com");
while (m.find()) {
  System.out.println(m.group());
}

Output:

456456
0532214
123

If you replace the while() loop above with

while (m.find()) {
  System.out.println(m.group());
  System.out.print("start:" + m.start());
  System.out.println(" end:" + m.end());
}

The output:

456456
start:10 end:16
0532214
start:30 end:37
123
start:56 end:59

As you can see, every time a match operation is performed, the values returned by start(), end() and group() change to describe the newly matched substring, and so do the values returned by their overloaded forms.
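In real code the matches are usually collected rather than printed. The following is a minimal sketch of that idea; the class name FindAllDemo, the helper findNumbers and the "text@start-end" format are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {

    // Collects every run of digits in the text as "digits@start-end".
    static List<String> findNumbers(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("\\d+").matcher(text);
        // Each successful find() moves on to the next match, and
        // group(), start() and end() then describe that match.
        while (m.find()) {
            found.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "My QQ is: 456456 My phone is: 0532214 My mailbox is: aaa123@aaa.com";
        for (String entry : findNumbers(text)) {
            System.out.println(entry);
        }
    }
}
```

Each find() call resumes from where the previous match ended, which is why a single Matcher can walk through the whole text.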

Note: start(), end() and group() can only be used after a successful match, that is, after one of matches(), lookingAt() or find() has returned true. Otherwise java.lang.IllegalStateException is thrown.
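A simple way to stay safe is to call group() only inside a branch guarded by the result of the match. The sketch below illustrates both the guarded call and the exception; the class name SafeGroupDemo and both method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafeGroupDemo {

    // Returns the first run of digits, or null when there is none.
    // group() is only called after find() has returned true.
    static String firstNumberOrNull(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns true if calling group() after a failed find()
    // throws IllegalStateException.
    static boolean groupThrowsAfterFailedMatch(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        if (m.find()) {
            return false; // the match succeeded, so nothing to demonstrate
        }
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNumberOrNull("aa2223bb"));
        System.out.println(firstNumberOrNull("aabb"));
        System.out.println(groupThrowsAfterFailedMatch("aabb"));
    }
}
```

Returning null is only one possible convention here; returning an Optional or an empty string would work just as well.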

Summary

That is all for this article. I hope it is of some help in your study or work. If you have any questions, feel free to leave a message.
