Lookaround and matching modes in regular expressions

Source: Internet
Author: User


Lookaround

A lookaround matches a position, not text. There are four types of lookaround:

(?=expression) Positive lookahead: the text to the right of the position matches expression

(?!expression) Negative lookahead: the text to the right of the position does not match expression

(?<=expression) Positive lookbehind: the text to the left of the position matches expression

(?<!expression) Negative lookbehind: the text to the left of the position does not match expression
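As a minimal sketch of the four types, the following Java example runs each one against the same input. The class name, the helper findAll, and the sample string are made up for illustration. Note that the text inside the lookaround never becomes part of the match, because only a position is tested.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundKinds {
    // Hypothetical helper: returns every match of regex in input, joined by commas.
    static String findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "a1 b2 c3";
        // Positive lookahead: a letter whose right side is "1"
        System.out.println(findAll("[a-z](?=1)", input));  // a
        // Negative lookahead: a letter whose right side is not "1"
        System.out.println(findAll("[a-z](?!1)", input));  // b,c
        // Positive lookbehind: a digit whose left side is "b"
        System.out.println(findAll("(?<=b)\\d", input));   // 2
        // Negative lookbehind: a digit whose left side is not "b"
        System.out.println(findAll("(?<!b)\\d", input));   // 1,3
    }
}
```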


The following two regular expressions help in understanding lookaround:

(1) Letters, digits, and special symbols must all appear, with a length of at least 8 characters

    Pattern pattern = Pattern.compile("^(?=.*[\\d]+)(?=.*[a-zA-Z]+)(?=.*[#$_!]+)[\\w#$!]{8,}$");
    Matcher matcher = pattern.matcher("123456a890#");
    while (matcher.find())
    {
        System.out.println(matcher.group());
    }

The expression means: at the start position, three lookaheads require that a digit, a letter, and a special symbol each appear somewhere to the right. Because lookaheads consume nothing, the match then continues from the same start position and requires at least 8 of the permitted characters up to the end of the string.
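To see how each lookahead contributes, the rule can be tried against a few inputs. This is only a sketch: the class name, the method isValid, and every sample input except "123456a890#" are assumptions for illustration.

```java
import java.util.regex.Pattern;

class AllClassesRule {
    // Same expression as in rule (1): three lookaheads, then the real match.
    static final Pattern RULE = Pattern.compile(
            "^(?=.*[\\d]+)(?=.*[a-zA-Z]+)(?=.*[#$_!]+)[\\w#$!]{8,}$");

    // Hypothetical helper: true when the whole input satisfies the rule.
    static boolean isValid(String input) {
        return RULE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("123456a890#")); // true: digit, letter, symbol, 11 characters
        System.out.println(isValid("abc_1234"));    // true: "_" counts as a special symbol here
        System.out.println(isValid("1234567890#")); // false: the letter lookahead fails
        System.out.println(isValid("abcdefgh1"));   // false: the symbol lookahead fails
        System.out.println(isValid("a1#"));         // false: fewer than 8 characters
    }
}
```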

(2) At least two of the three kinds (letters, digits, special symbols) must appear, with a length of 6 to 20 characters

    Pattern pattern = Pattern.compile("^(?![\\d]+$)(?![a-zA-Z]+$)(?![!#$%^&*]+$)[\\da-zA-Z!#$%^&*]{6,20}$");
    Matcher matcher = pattern.matcher("aaaaaaaaaaa!");
    while (matcher.find())
    {
        System.out.println(matcher.group());
    }

The expression means: the text between the start and the end must not consist only of digits, only of letters, or only of special symbols, and the permitted characters must then appear 6 to 20 times.
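One way to convince yourself that the three negative lookaheads really mean "at least two kinds of character" is to compare the expression with a plain loop that counts the kinds. The sketch below does that; the class name, both method names, and all sample inputs other than "aaaaaaaaaaa!" are assumptions for illustration.

```java
import java.util.regex.Pattern;

class TwoClassesRule {
    static final String SPECIALS = "!#$%^&*";
    static final Pattern RULE = Pattern.compile(
            "^(?![\\d]+$)(?![a-zA-Z]+$)(?![!#$%^&*]+$)[\\da-zA-Z!#$%^&*]{6,20}$");

    // The rule as written in the article.
    static boolean byRegex(String input) {
        return RULE.matcher(input).matches();
    }

    // Hand-written equivalent: count how many kinds of character occur.
    static boolean byCounting(String input) {
        if (input.length() < 6 || input.length() > 20) {
            return false;
        }
        boolean digit = false, letter = false, special = false;
        for (char c : input.toCharArray()) {
            if (c >= '0' && c <= '9') {
                digit = true;
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
                letter = true;
            } else if (SPECIALS.indexOf(c) >= 0) {
                special = true;
            } else {
                return false; // character outside the permitted set
            }
        }
        int kinds = (digit ? 1 : 0) + (letter ? 1 : 0) + (special ? 1 : 0);
        return kinds >= 2;
    }

    public static void main(String[] args) {
        String[] samples = { "aaaaaaaaaaa!", "abc123", "123456", "abcdef", "!!!!!!", "ab1" };
        for (String s : samples) {
            System.out.println(s + " regex=" + byRegex(s) + " counting=" + byCounting(s));
        }
    }
}
```

For these samples the two methods agree: the first two inputs are accepted and the other four are rejected.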


Three matching modes

Greedy (match as much as possible first)

Lazy (match as little as possible first)

Possessive (greedy, but never gives back what it has matched)


The key to understanding these three modes is to understand backtracking. The following code helps illustrate them:

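    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class MatchModes
    {
        public static void main(String[] args)
        {
            // Greedy: .* takes everything, then backtracks until the final "a" can match.
            Pattern pattern = Pattern.compile("a.*a");
            Matcher matcher = pattern.matcher("ababa");
            if (matcher.find())
            {
                System.out.println(matcher.group());
            }
            else
            {
                System.out.println("Mismatch");
            }

            // Lazy: .*? takes as little as possible and extends only when forced to.
            pattern = Pattern.compile("a.*?a");
            matcher = pattern.matcher("ababa");
            if (matcher.find())
            {
                System.out.println(matcher.group());
            }
            else
            {
                System.out.println("Mismatch");
            }

            // Possessive: .*+ takes everything and never backtracks, so the final "a" cannot match.
            pattern = Pattern.compile("a.*+a");
            matcher = pattern.matcher("ababa");
            if (matcher.find())
            {
                System.out.println(matcher.group());
            }
            else
            {
                System.out.println("Mismatch");
            }
        }
    }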

Output:

    ababa
    aba
    Mismatch
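The same three behaviours show up on other inputs. As a further sketch, the example below applies the three modes to a small tag-like string; the class name, the helper firstMatch, and the sample text are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierModes {
    // Hypothetical helper: returns the first match, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>one</b> and <i>two</i>";
        // Greedy: runs to the end, then backtracks to the last ">"
        System.out.println(firstMatch("<.*>", input));   // <b>one</b> and <i>two</i>
        // Lazy: stops at the first ">"
        System.out.println(firstMatch("<.*?>", input));  // <b>
        // Possessive: consumes the last ">" and refuses to give it back
        System.out.println(firstMatch("<.*+>", input));  // null
    }
}
```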


Atomic grouping and possessive quantifiers

The following text and examples are taken from "Mastering Regular Expressions".

Suppose we want to trim decimals so that a number like 3.690000023 keeps two decimal places, while a number like 2.3563895 keeps three. That is, if the third decimal digit is 0, keep two decimal places; if it is non-zero, keep three.

=~ s/(\.\d\d[1-9]?)\d*/$1/;

This expression works, but with one flaw: for a value like 3.695, it replaces .695 with .695, which is wasted effort. To fix this, we modify the expression slightly.

=~ s/(\.\d\d[1-9]?)\d+/$1/;

We only replaced the asterisk with a plus sign, so at least one digit must now follow the parenthesized part. It looks perfect, but it contains a fatal mistake: .695 is now replaced by .69. Why? After (\.\d\d[1-9]?) matches .695, the following \d+ has nothing left to match. For the whole expression to succeed, the engine must backtrack: [1-9]? gives up the 5 it matched, and \d+ then matches that 5. This is the problem with backtracking. In this case we do not want the engine to backtrack, and there are two ways to force it not to: atomic grouping and possessive quantifiers.

=~ s/(\.\d\d(?>[1-9]?))\d+/$1/;   # atomic group
=~ s/(\.\d\d[1-9]?+)\d+/$1/;      # possessive quantifier

Once the engine can no longer backtrack, the expressions above do not match .695 at all, which is exactly what we want.
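The four substitutions can be sketched in Java with String.replaceFirst, where "$1" in the replacement refers to group 1 just as in the Perl versions. The class and method names are made up for illustration.

```java
class TrimDecimals {
    // Greedy \d*: always matches, even when there is nothing to remove.
    static String trimStar(String s) {
        return s.replaceFirst("(\\.\\d\\d[1-9]?)\\d*", "$1");
    }

    // Greedy \d+: backtracking lets \d+ take the digit matched by [1-9]?.
    static String trimPlus(String s) {
        return s.replaceFirst("(\\.\\d\\d[1-9]?)\\d+", "$1");
    }

    // Atomic group: [1-9]? cannot give its digit back.
    static String trimAtomic(String s) {
        return s.replaceFirst("(\\.\\d\\d(?>[1-9]?))\\d+", "$1");
    }

    // Possessive quantifier: same effect as the atomic group.
    static String trimPossessive(String s) {
        return s.replaceFirst("(\\.\\d\\d[1-9]?+)\\d+", "$1");
    }

    public static void main(String[] args) {
        String[] inputs = { "3.690000023", "2.3563895", "3.695" };
        for (String s : inputs) {
            System.out.println(s + " -> " + trimStar(s) + " " + trimPlus(s)
                    + " " + trimAtomic(s) + " " + trimPossessive(s));
        }
    }
}
```

All four versions turn 3.690000023 into 3.69 and 2.3563895 into 2.356. They differ only on 3.695: the \d+ version shortens it to 3.69, while the atomic and possessive versions leave it unchanged because they do not match.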

Possessive quantifiers are like greedy quantifiers, except that they never give back the characters they have matched.

You might think that possessive quantifiers and atomic groups are closely related, and they are. "\w++" matches exactly the same text as "(?>\w+)"; it is simply more convenient to write. Using a possessive quantifier, "(\.\d\d(?>[1-9]?))\d+" is written "(\.\d\d[1-9]?+)\d+".

Be sure to distinguish "(?>m)+" from "(?>m+)". The former discards the saved states created by "m", but since "m" creates no states, this is of little value. The latter discards the unused states created by "m+", which is clearly meaningful.
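The difference can be observed by putting something after the group that only succeeds if the quantifier gives a character back. In this sketch the letter "a" stands in for "m", and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class AtomicPlacement {
    // Hypothetical helper: true when regex matches the whole input.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String input = "aaa";
        // Plain greedy: a+ gives back one "a" so the trailing "a" can match.
        System.out.println(matches("a+a", input));      // true
        // Atomic group around a single "a": the + outside can still give back an iteration.
        System.out.println(matches("(?>a)+a", input));  // true
        // Atomic group around "a+": the states of the + are discarded.
        System.out.println(matches("(?>a+)a", input));  // false
        // Possessive quantifier: behaves like (?>a+).
        System.out.println(matches("a++a", input));     // false
    }
}
```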

Comparing "(?>m)+" and "(?>m+)", it is clearly the latter that corresponds to "m++". But suppose the expression is more complex, for example

(\\"|[^"])*+

When converting a possessive quantifier to an atomic group, you might be tempted to just add "?>" inside the parentheses, giving (?>\\"|[^"])*. This expression might happen to do what you want, but it is clearly not equivalent to the version with the possessive quantifier; it is like writing "m++" as "(?>m)+". The correct approach is to remove the plus sign that marks the quantifier as possessive and wrap what remains in an atomic group:

(?>(\\"|[^"])*)
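To check that the two conversions really differ, each form can be followed by a literal that the loop itself is able to consume. The sketch below appends a trailing "!" to each expression; that suffix, the class and constant names, and the sample text are assumptions added only to make the difference visible.

```java
import java.util.regex.Pattern;

class AtomicConversion {
    // Original possessive form, plus the hypothetical trailing "!".
    static final String POSSESSIVE = "(\\\\\"|[^\"])*+!";
    // Tempting but wrong conversion: ?> added inside the existing parentheses.
    static final String WRONG_ATOMIC = "(?>\\\\\"|[^\"])*!";
    // Correct conversion: the whole quantified expression wrapped in an atomic group.
    static final String RIGHT_ATOMIC = "(?>(\\\\\"|[^\"])*)!";

    // Hypothetical helper: true when regex matches the whole input.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String input = "say \\\"hi\\\"!";  // the text: say \"hi\"!
        // The loop also consumes the final "!" and never gives it back.
        System.out.println(matches(POSSESSIVE, input));    // false
        // The * outside the atomic group can still give up its last iteration.
        System.out.println(matches(WRONG_ATOMIC, input));  // true
        // Same behaviour as the possessive form.
        System.out.println(matches(RIGHT_ATOMIC, input));  // false
    }
}
```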

The decimal-trimming expressions above can be verified with the following code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class DecimalTrimTest
    {
        public static void main(String[] args)
        {
            String[] datas = { "1.234001", "1.234", "1.230", "1.23" };

            // Greedy \d*: always matches, even when nothing needs trimming.
            printMatches("(\\.\\d\\d[1-9]?)\\d*", datas);
            System.out.println("========");

            // Greedy \d+: backtracking lets \d+ take the digit matched by [1-9]?.
            printMatches("(\\.\\d\\d[1-9]?)\\d+", datas);
            System.out.println("========");

            // Atomic group: [1-9]? cannot give its digit back.
            printMatches("(\\.\\d\\d(?>[1-9]?))\\d+", datas);
            System.out.println("========");

            // Possessive quantifier: same effect as the atomic group.
            printMatches("(\\.\\d\\d[1-9]?+)\\d+", datas);
        }

        // Prints group 1 of the first match for each input, or "Mismatch".
        private static void printMatches(String regex, String[] datas)
        {
            Pattern pattern = Pattern.compile(regex);
            for (String data : datas)
            {
                Matcher matcher = pattern.matcher(data);
                if (matcher.find())
                {
                    System.out.println(matcher.group(1));
                }
                else
                {
                    System.out.println("Mismatch");
                }
            }
        }
    }

Output:

    .234
    .234
    .23
    .23
    ========
    .234
    .23
    .23
    Mismatch
    ========
    .234
    Mismatch
    .23
    Mismatch
    ========
    .234
    Mismatch
    .23
    Mismatch



This article is from the "self-reliance, tenet" blog, please be sure to keep this source http://wangzhichao.blog.51cto.com/2643325/1747621
