Java regular expressions: Matcher.group(int group) and related methods explained

Source: Internet
Author: User

The Matcher class used with Java regular expressions has several group-related methods:
- int groupCount()
- String group(int group)
- int start(int group)
- int end(int group)
- String group(String name)
- int start(String name)
- int end(String name)

The concept of grouping (group)

First, let's look at a piece of code to understand the concept of grouping in regular expressions.

Demo1

// Assumes: import java.util.regex.Matcher; import java.util.regex.Pattern;
String text = "John writes about this, and John writes about that,"
        + " and John writes about everything.";
String patternString1 = "(John)";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
System.out.println("GroupCount is-->" + matcher.groupCount());
while (matcher.find()) {
    // group(1) is the text captured by the first pair of parentheses
    System.out.println("Found:" + matcher.group(1));
}

The output is:

GroupCount is-->1
Found:John
Found:John
Found:John

Demo2

String text = "John writes about this, and John writes about that,"
        + " and John writes about everything.";
String patternString1 = "John";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
System.out.println("GroupCount is-->" + matcher.groupCount());
while (matcher.find()) {
    // The pattern has no parentheses, so there is no group 1 and this call throws
    System.out.println("Found:" + matcher.group(1));
}

The output is:

GroupCount is-->0
Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 1

The only difference between the two examples above is the value of patternString1: the regular expression is written with parentheses in one and without them in the other. We can therefore put it simply:

The content matched by a subexpression enclosed in '()' in a regular expression is a group.

Now let's look at another example.

Demo3

String text = "John writes about this, and John writes about that,"
        + " and John writes about everything.";
String patternString1 = "(?:John)";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
System.out.println("GroupCount is-->" + matcher.groupCount());
while (matcher.find()) {
    // (?:John) does not create a group, so group(1) throws here as well
    System.out.println("Found:" + matcher.group(1));
}

The output is:

GroupCount is-->0
Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 1

As Demo3 shows, a subexpression of the form (?:pattern) is a non-capturing group and is not counted as a group.
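
The three demos can be condensed into one small check. The following is a minimal sketch, not part of the original demos; the class name GroupCountSketch and the helper countGroups are hypothetical names chosen for illustration. It relies on the fact that groupCount() depends only on the pattern, so an empty input string is enough.

```java
import java.util.regex.Pattern;

class GroupCountSketch {
    // Hypothetical helper: returns the number of capturing groups declared in the regex.
    static int countGroups(String regex) {
        // groupCount() is a property of the pattern, so no real input text is needed.
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(countGroups("(John)"));   // parentheses: 1 group
        System.out.println(countGroups("John"));     // no parentheses: 0 groups
        System.out.println(countGroups("(?:John)")); // non-capturing: 0 groups
    }
}
```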

So the concept of grouping can be summarized as follows:
1. The content matched by a subexpression enclosed in '()' in a regular expression is a group.
2. A subexpression of the form (?:pattern) is not counted as a group.

Group index (group number)

As before, let's start with a demo.

Demo4

String text = "John writes about this, and John Doe writes about that,"
        + " and John Wayne writes about everything.";
// Note the trailing space in the pattern: it makes the lazy (.+?) stop at the end of the next word
String patternString1 = "(John) (.+?) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
matcher.find(); // find the next match; it may start anywhere in the text
int start = matcher.start(); // start index of the current match in the original target string
int end = matcher.end(); // offset just past the last character of the current match
System.out.println("Found group: group(0) is '" + matcher.group(0) + "'");
System.out.println("Found group: group(1) is '" + matcher.group(1)
        + "', group(2) is '" + matcher.group(2) + "'");

The output is:

Found group: group(0) is 'John writes '
Found group: group(1) is 'John', group(2) is 'writes'

As the output shows, when a regular expression contains multiple groups, that is, more than one subexpression of the form (pattern), the group index (group number) starts at 1, and group(0) represents the entire matched string.
To make specific groups and group numbering easier to understand, refer to the following figure.

With the above, we can fully understand how to use the group(int group) method. To summarize:
1. Each subexpression of the form (pattern), but not (?:pattern), is a group, and group indexes start at 1.
2. Index 0 represents the entire string matched by the regular expression, and group(i) returns the content matched by group i.
3. groupCount() returns the number of groups in the current regular expression.
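
To see the numbering in one place, the sketch below collects group(0) through group(groupCount()) for the first match. It is an illustration only: the class GroupListSketch and the helper groupsOfFirstMatch are hypothetical names, and the pattern "(John) (.+?) " is assumed to end with a space so that the lazy second group stops at the end of the word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupListSketch {
    // Hypothetical helper: returns group(0)..group(groupCount()) of the first match,
    // or an empty list when the regex does not match the text at all.
    static List<String> groupsOfFirstMatch(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> groups = new ArrayList<>();
        if (matcher.find()) {
            // Index 0 is the whole match; indexes 1..groupCount() are the groups.
            for (int i = 0; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String text = "John writes about this, and John Doe writes about that";
        // prints [John writes , John, writes]
        System.out.println(groupsOfFirstMatch("(John) (.+?) ", text));
    }
}
```

The loop condition uses <= because groupCount() does not include group 0.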

Now let's look at the two methods int start(int group) and int end(int group).
First, let's review the following two methods:
1. int start() returns the start index of the current match in the original target string.
2. int end() returns the offset just past the last character of the current match in the original target string.

With an int group argument, they return the start index and end offset of the specified group instead, as follows:

Demo5

String text = "John writes about this, and John Doe writes about that,"
        + " and John Wayne writes about everything.";
String patternString1 = "(John) (.+?) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
matcher.find(); // find the next match; it may start anywhere in the text
int start = matcher.start(); // start index of the current match in the original target string
System.out.println(start); // 0
int end = matcher.end(); // offset just past the last character of the current match
System.out.println(end); // 12
start = matcher.start(1); // start index of group 1, i.e. "John"
System.out.println(start); // 0
start = matcher.start(2); // start index of group 2, i.e. "writes"
System.out.println(start); // 5
end = matcher.end(1); // offset just past the end of group 1 ("John")
System.out.println(end); // 4
end = matcher.end(2); // offset just past the end of group 2 ("writes")
System.out.println(end); // 11
start = matcher.start(3); // Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 3

Note the last statement: when the index you pass is greater than the actual number of groups in the regular expression, that is, the return value of groupCount(), an IndexOutOfBoundsException is thrown, so remember to handle this case when you use these methods.
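
One possible way to handle this is to compare the index with groupCount() before asking for the group. The following is only a sketch of that idea; the class SafeGroupSketch and the helper firstMatchGroup are hypothetical names, and returning null for "not available" is an assumption made for brevity.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafeGroupSketch {
    // Hypothetical helper: returns the given group of the first match, or null when
    // the index is out of range or the regex does not match the text.
    static String firstMatchGroup(String regex, String text, int index) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        // Valid indexes are 0..groupCount(); anything else would throw in group(int).
        if (index < 0 || index > matcher.groupCount()) {
            return null;
        }
        // group(int) may only be called after a successful find().
        if (!matcher.find()) {
            return null;
        }
        return matcher.group(index);
    }

    public static void main(String[] args) {
        String text = "John writes about this";
        System.out.println(firstMatchGroup("(John) (.+?) ", text, 2)); // writes
        System.out.println(firstMatchGroup("(John) (.+?) ", text, 3)); // null
    }
}
```
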
To sum up: int start(int group) returns the start index, in the original target string, of the substring matched by the given group, and int end(int group) returns the offset just past the last character of that substring.
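
Because the end offset is exclusive, the two values can be passed directly to String.substring to recover the same text that group(int) returns. The sketch below illustrates this relationship; the class GroupOffsetsSketch and the helper sliceGroup are hypothetical names, and the helper assumes the index is within 0..groupCount().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupOffsetsSketch {
    // Hypothetical helper: cuts group `index` of the first match out of `text`
    // using start/end offsets. Assumes 0 <= index <= groupCount().
    // Returns null when there is no match or the group did not take part in it.
    static String sliceGroup(String regex, String text, int index) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.find() || matcher.start(index) == -1) {
            return null;
        }
        // end(index) is exclusive, so it fits substring(begin, end) as-is.
        return text.substring(matcher.start(index), matcher.end(index));
    }

    public static void main(String[] args) {
        String text = "John writes about this";
        String regex = "(John) (.+?) ";
        Matcher matcher = Pattern.compile(regex).matcher(text);
        matcher.find();
        for (int i = 0; i <= matcher.groupCount(); i++) {
            // Both quoted values on each line should be identical.
            System.out.println(i + ": '" + matcher.group(i) + "' / '" + sliceGroup(regex, text, i) + "'");
        }
    }
}
```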

Finally, the methods String group(String name), int start(String name) and int end(String name) are not covered here yet and will be added later.
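
Until then, here is a small hedged sketch of how these name-based methods can be used. It relies on Java's named-group syntax (?<name>pattern); group(String) is available from Java 7, while start(String) and end(String) require Java 8 or later. The group names "first" and "verb", the class NamedGroupSketch and the helper describeVerb are hypothetical and chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupSketch {
    // Hypothetical helper: reports the text and offsets of the group named "verb".
    static String describeVerb(String text) {
        // Same pattern as Demo4, but each group is given a name.
        Matcher matcher = Pattern.compile("(?<first>John) (?<verb>.+?) ").matcher(text);
        if (!matcher.find()) {
            return "no match";
        }
        // Named groups are still numbered, so group("verb") and group(2) agree here.
        return matcher.group("verb") + "@" + matcher.start("verb") + "-" + matcher.end("verb");
    }

    public static void main(String[] args) {
        System.out.println(describeVerb("John writes about this")); // writes@5-11
    }
}
```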

References:
http://tutorials.jenkov.com/java-regex/matcher.html#group-method
http://stackoverflow.com/questions/16517689/confused-about-matcher-group-in-java-regex
