Java Regular Expressions

Source: Internet
Author: User

When I was writing a crawler, I originally planned to use regular expressions, but jsoup turned out to work so well that I never learned them. Now that I need regular expressions after all to extract classical poetry, I am studying them here before I forget them again.

Description

This article is based mainly on http://www.cnblogs.com/ggjucheng/p/3423731.html, with some of my own understanding added.

The syntax for regular expressions can be consulted:

http://www.runoob.com/regexp/regexp-syntax.html

Java regular expressions rely mainly on two classes in java.util.regex:

1. Pattern: the compiled representation of a regular expression.

2. Matcher: a state machine that matches an input string against the pattern held by a Pattern object.

The relationship between the two is that a Matcher performs the string matching under the control of the pattern supplied by a Pattern.

Before studying regular expressions in detail, you should first understand the concept of capturing groups.

I. The concept of capturing groups

A capturing group saves the text matched by a subexpression of the regular expression in memory, so that it can later be referenced by its number or by an explicitly assigned name. There are two main types, with the following syntax:

(1) Ordinary capturing group: (expression)

Supported by most languages; this is the focus of this article.

(2) Named capturing group: (?<name>expression)

Not analyzed here for now.

Note that only these two forms create capturing groups. Other parenthesized syntax, such as the non-capturing group (?:expression), does not.
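Since named groups are not analyzed in the text, here is a minimal sketch of how the two kinds of group and a non-capturing group behave. The class name NamedGroupDemo and its helper methods are hypothetical, chosen only for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // One named capturing group, one non-capturing group, one ordinary capturing group.
    static final String DATE_REGEX = "(?<year>\\d{4})-(?:\\d{2})-(\\d{2})";

    // Returns the text captured by the named group "year", or null if the input does not match.
    static String extractYear(String date) {
        Matcher m = Pattern.compile(DATE_REGEX).matcher(date);
        return m.matches() ? m.group("year") : null;
    }

    // Returns the number of capturing groups; the (?:...) group is not counted.
    static int countGroups() {
        return Pattern.compile(DATE_REGEX).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(extractYear("2016-02-15")); // 2016
        System.out.println(countGroups());             // 2
    }
}
```

A named group is still numbered like an ordinary one, so in this sketch m.group(1) and m.group("year") refer to the same text.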

Numbering of capturing groups: groups are numbered by counting their opening parentheses "(" from left to right, starting at 1. Number 0 always refers to the entire regular expression.

Example:

Matches the date formatted as YYYY-MM-DD.

    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    private static void study1() {
        String line = "2016-02-15";
        // [0-9] is equivalent to \d
        String pattern = "(\\d{4})-(\\d{2}-(\\d\\d))";
        // A Pattern object is the compiled representation of a regular expression.
        Pattern r = Pattern.compile(pattern);
        // A Matcher object is the engine that interprets the pattern and matches the input string.
        Matcher m = r.matcher(line);
        System.out.println("group:" + m.groupCount());
        if (m.find()) {
            System.out.println("Found value:" + m.group(0)); // entire match
            System.out.println("Found value:" + m.group(1)); // (\d{4})
            System.out.println("Found value:" + m.group(2)); // (\d{2}-(\d\d))
            System.out.println("Found value:" + m.group(3)); // (\d\d)
        } else {
            System.out.println("No match");
        }
    }

The output is:

group:3
Found value:2016-02-15
Found value:2016
Found value:02-15
Found value:15

II. Use of the Pattern class

The Pattern class represents a compiled regular expression, that is, a matching pattern. Its constructor is private, but you can create a Pattern through the simple factory method Pattern.compile(String regex).

Example: viewing the regex parameter

    Pattern pattern = Pattern.compile("\\w+");
    System.out.println(pattern.toString());
    System.out.println(pattern.pattern());

Results

\w+
\w+

Both methods return the regular expression in string form.

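(1) Pattern.split(CharSequence input)

Splits the input around matches of the pattern and returns a String[]. Is String.split(String regex) implemented through this? According to its Javadoc, str.split(regex) yields the same result as Pattern.compile(regex).split(str).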

    // "\\d+" treats each run of digits as one delimiter; a plain "\\d" would
    // split on every single digit and produce empty strings in between.
    Pattern pattern = Pattern.compile("\\d+");
    String[] strArray = pattern.split("My QQ is: 456456 My phone is: 0532214 my mailbox is: [email protected]");
    for (String afterSplit : strArray) {
        System.out.println(afterSplit);
    }

The result is:

    str[0] = "My QQ is: "
    str[1] = " My phone is: "
    str[2] = " my mailbox is: [email protected]"
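The question of whether String.split and Pattern.split agree can be checked directly. The following is a minimal sketch with an illustrative input that is not taken from the original article; the class name SplitDemo and its methods are hypothetical:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Splits with an explicitly compiled Pattern.
    static String[] viaPattern(String input) {
        return Pattern.compile("\\d+").split(input);
    }

    // Splits with the convenience method on String.
    static String[] viaString(String input) {
        return input.split("\\d+");
    }

    public static void main(String[] args) {
        String input = "a1b22c333d"; // illustrative input
        System.out.println(Arrays.toString(viaPattern(input)));                 // [a, b, c, d]
        System.out.println(Arrays.equals(viaPattern(input), viaString(input))); // true
    }
}
```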

(2) Pattern.matches(String regex, CharSequence input)

A static method for a quick one-off match. It compiles the pattern, uses it once, and succeeds only if the pattern matches the entire input string.

Example:
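A minimal sketch of such a one-off match, using illustrative inputs; the class name PatternMatchesDemo and the helper isAllDigits are hypothetical:

```java
import java.util.regex.Pattern;

class PatternMatchesDemo {
    // Returns true only if the whole input consists of one or more digits.
    static boolean isAllDigits(String input) {
        return Pattern.matches("\\d+", input);
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2233"));   // true
        System.out.println(isAllDigits("22bb33")); // false: "bb" is not matched by \d+
    }
}
```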

(3) Pattern.matcher(CharSequence input)

Here the Matcher class makes its appearance. This method returns a Matcher. The Matcher class has no public constructor, so an instance can only be obtained through Pattern.matcher(CharSequence input).

As seen above, Pattern by itself can only perform simple one-off matching. For more powerful matching, use Pattern together with Matcher: the Matcher class supports capturing groups and repeated matching.

Example:

    Pattern p = Pattern.compile("\\d+");
    Matcher m = p.matcher("22bb33");
    System.out.println(m.pattern());
    System.out.println(m.toString());

The result is:

\d+
java.util.regex.Matcher[pattern=\d+ region=0,6 lastmatch=]

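As you can see, Matcher.toString() reports the current matching state: the pattern, the region being searched, and the last match (empty here, because no match has been attempted yet).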
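To see the state change, the same matcher can be printed again after a successful find(). This is a minimal sketch; the class name MatcherStateDemo and its helper are hypothetical, and the Javadoc leaves the exact format of toString() unspecified, so the string should be used for debugging only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherStateDemo {
    // Runs find() once and returns the matcher's string representation afterwards.
    static String stateAfterFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.find();
        return m.toString();
    }

    public static void main(String[] args) {
        // On current JDKs the lastmatch field now shows the first match, "22".
        System.out.println(stateAfterFind("\\d+", "22bb33"));
    }
}
```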

(4) Matcher.matches() / Matcher.lookingAt() / Matcher.find()

The Matcher class provides three matching methods. All three return a boolean: true if the match succeeds, false otherwise.

matches()

Matches the entire string and returns true only if the entire string matches.

    Pattern p = Pattern.compile("\\d+");
    Matcher m = p.matcher("22bb33");
    System.out.println(m.matches());
    Matcher m2 = p.matcher("2233");
    System.out.println(m2.matches());

The output is:

false
true

It follows that these two forms are equivalent:

Pattern.matches(String regex, CharSequence input)

Pattern.compile(regex).matcher(input).matches()
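Because Pattern.matches compiles the regex on every call, the Javadoc recommends compiling once and reusing the Pattern when it is applied many times. A minimal sketch of that reuse, with illustrative inputs; the class name ReusePatternDemo and the method onlyNumeric are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ReusePatternDemo {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Keeps only the inputs that consist entirely of digits.
    static List<String> onlyNumeric(List<String> inputs) {
        List<String> result = new ArrayList<>();
        for (String s : inputs) {
            // A new Matcher is created per input; the Pattern itself is reused.
            if (DIGITS.matcher(s).matches()) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(onlyNumeric(Arrays.asList("2233", "22bb33", "7"))); // [2233, 7]
    }
}
```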

lookingAt()

Matches from the beginning of the string: it returns true only if a prefix of the input matches the pattern. Unlike matches(), the entire input does not have to match.

    Pattern p = Pattern.compile("\\d+");
    Matcher m = p.matcher("22bb33");
    System.out.println(m.lookingAt());  // true
    Matcher m2 = p.matcher("aa2233");
    System.out.println(m2.lookingAt()); // false

find()

Scans the input for the next subsequence that matches the pattern; the match may be located anywhere in the string.
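A minimal sketch of find(), using illustrative inputs in the style of the examples above; the class name FindDemo and its helpers are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns true if a run of digits occurs anywhere in the input.
    static boolean containsDigits(String input) {
        return DIGITS.matcher(input).find();
    }

    // Each call to find() continues after the previous match, so a loop visits every match.
    static int countMatches(String input) {
        Matcher m = DIGITS.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(containsDigits("aa2233")); // true (lookingAt() would be false here)
        System.out.println(containsDigits("aabb"));   // false
        System.out.println(countMatches("22bb33"));   // 2
    }
}
```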

(5) Matcher.start() / Matcher.end() / Matcher.group()

After a successful match with matches(), lookingAt(), or find(), you can use these three methods to obtain more detailed information.

start(): returns the index in the input string of the first character of the matched substring.

end(): returns the index immediately after the last character of the matched substring.

group(): returns the matched substring.

    Pattern p = Pattern.compile("\\d+");

    Matcher m = p.matcher("AAAA2222BB");
    m.find();                         // matches "2222"
    System.out.println(m.start());    // 4
    System.out.println(m.end());      // 8
    System.out.println(m.group());    // 2222

    Matcher m2 = p.matcher("2222BB");
    m2.lookingAt();                   // matches the prefix "2222"
    System.out.println(m2.start());   // 0
    System.out.println(m2.end());     // 4
    System.out.println(m2.group());   // 2222

    Matcher m3 = p.matcher("222");
    m3.matches();                     // matches the whole string
    System.out.println(m3.start());   // 0
    System.out.println(m3.end());     // 3
    System.out.println(m3.group());   // 222

Next comes the use of capturing groups. start(), end(), and group() each have an overload that takes an int group index for working with an individual group.

Example:

    Pattern p = Pattern.compile("([a-z]+)(\\d+)"); // 2 groups
    Matcher m = p.matcher("aaa2223bb444");
    System.out.println(m.groupCount());         // 2
    while (m.find()) {
        System.out.println("====one match");
        // Entire match (group 0); the comments show the values for the first match.
        System.out.print(m.start(0) + ",");     // 0
        System.out.print(m.end(0) + ",");       // 7
        System.out.println(m.group(0));         // aaa2223
        // Group 1
        System.out.print(m.start(1) + ",");     // 0
        System.out.print(m.end(1) + ",");       // 3
        System.out.println(m.group(1));         // aaa
        // Group 2
        System.out.print(m.start(2) + ",");     // 3
        System.out.print(m.end(2) + ",");       // 7
        System.out.println(m.group(2));         // 2223
    }

Output

2
====one match
0,7,aaa2223
0,3,aaa
3,7,2223
====one match
7,12,bb444
7,9,bb
9,12,444

The source code shows that the no-argument versions simply call the overloads with an argument of 0. Note that start(), end(), and group() can be used only after a successful match; otherwise they throw an IllegalStateException.
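A minimal sketch of this rule, showing both the failure and the usual guard. The class name MatchGuardDemo and its helpers are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchGuardDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Calls group() without attempting a match first and reports whether it threw.
    static boolean groupThrowsBeforeMatch(String input) {
        Matcher m = DIGITS.matcher(input);
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // The usual guard: read group() only when find() has returned true.
    static String firstNumber(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(groupThrowsBeforeMatch("22bb33")); // true
        System.out.println(firstNumber("aa2233"));            // 2233
        System.out.println(firstNumber("aabb"));              // null
    }
}
```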

As a final example, find all the numbers in a string:

    Pattern p = Pattern.compile("\\d+");
    Matcher m = p.matcher("My QQ is: 456456 My phone is: 0532214 my mailbox is: [email protected]");
    while (m.find()) {
        System.out.println(m.group());
    }
