Http://www.cnblogs.com/mu-mu/archive/2013/02/06/2893581.html
Recently I needed to process HTML source files, which meant doing regex search and replace. I took the opportunity to study regular expressions systematically. I had used them before, but each time I only learned enough to get by. I still ran into plenty of problems along the way, especially with zero-width assertions. (A small gripe: the web is full of copy-and-paste content, so when you hit a problem you end up reading the same thing over and over.) So I am writing down my own understanding here for later review.
What is a zero-width positive lookahead assertion? Here is the official definition from MSDN:
(?=subexpression)
(Zero-width positive lookahead assertion.) Continues the match only if the subexpression matches at this position on the right. For example, \w+(?=\d) matches a word followed by a digit, without matching the digit.
Classic example: for a word that ends in ing, get the part in front of ing.
var reg = new Regex(@"\w+(?=ing)");
var str = "muing";
Console.WriteLine(reg.Match(str).Value); // returns mu
This is the example you see everywhere on the Internet. From it you might conclude that a lookahead simply returns whatever comes before the exp subexpression.
Now look at the code below.
var reg = new Regex(@"a(?=b)c");
var str = "abc";
Console.WriteLine(reg.IsMatch(str)); // returns False
Why does it return False?
In fact the official MSDN definition already says why, just in rather formal language. The key phrase to notice is: this position. Yes, it is a position, not a character. With that in mind, the second example can be understood from the official definition together with the first example:
Since a is followed by b, the lookahead succeeds, and the match so far is just a (as the first example showed, the text matched by the lookahead's subexpression is not part of the result). At this point the a(?=b) part of a(?=b)c is done, and the next thing to match is c. Where in the string abc does the engine try to match c? According to the official definition, it continues from the position where the subexpression was tested, which is the position of b. But b does not match the remaining c of a(?=b)c, so abc does not match a(?=b)c.
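To make the "position, not character" point concrete, here is a minimal sketch that reports where the engine stands after a match. The class name LookaheadPosition and the helper PositionAfterMatch are made up for this illustration; only Regex.Match, Regex.IsMatch, Match.Index and Match.Length come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadPosition
{
    // Hypothetical helper: returns the index just past the match,
    // i.e. where matching would continue, or -1 if there is no match.
    public static int PositionAfterMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Index + m.Length : -1;
    }

    public static void Main()
    {
        // "a" consumes one character and the lookahead consumes none,
        // so matching continues at index 1, which is where 'b' sits.
        Console.WriteLine(PositionAfterMatch(@"a(?=b)", "abc")); // 1

        // Without the lookahead, "ab" consumes two characters.
        Console.WriteLine(PositionAfterMatch(@"ab", "abc"));     // 2

        // 'c' is tried at index 1 against 'b', so the whole pattern fails.
        Console.WriteLine(Regex.IsMatch("abc", @"a(?=b)c"));     // False
    }
}
```

The first two calls differ only in whether b is consumed, which is exactly the difference between checking a position and matching a character.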
So if you do want to match the string above, how should the regex be written?
The answer is: a(?=b)bc
Of course, some people will say that plain abc matches directly, so why go to all this trouble? Indeed you would not write it this way in practice; it is only meant to illustrate the zero-width positive lookahead assertion. The same principle applies to the other zero-width assertions.
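As a rough check of the answer above, the sketch below compares a(?=b)bc with plain abc, and then shows a case where the lookahead does earn its place: replacing an a only when a b follows, without touching the b. The class name LookaheadFix and the helpers MatchedText and ReplaceAOnlyBeforeB are assumptions made for this example, not anything from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadFix
{
    // Hypothetical helper: the matched text, or "(no match)".
    public static string MatchedText(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "(no match)";
    }

    // Hypothetical helper: replaces 'a' with 'X' only where 'b' follows.
    // The 'b' is checked by the lookahead but is not part of the match,
    // so it is left untouched by the replacement.
    public static string ReplaceAOnlyBeforeB(string input)
    {
        return Regex.Replace(input, @"a(?=b)", "X");
    }

    public static void Main()
    {
        Console.WriteLine(MatchedText(@"a(?=b)bc", "abc")); // abc
        Console.WriteLine(MatchedText(@"abc", "abc"));      // abc
        Console.WriteLine(MatchedText(@"a(?=b)c", "abc"));  // (no match)
        Console.WriteLine(ReplaceAOnlyBeforeB("ab ac ab")); // Xb ac Xb
    }
}
```

For a pure yes/no match the two patterns behave the same on abc; the lookahead becomes useful once the matched text itself matters, as in search and replace.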
--------------------------------------------------------------------------------------------------------------- ------------------------------------------------------
(?=subexpression) | (Zero-width positive lookahead assertion.) Continues the match only if the subexpression matches at this position on the right. For example, \w+(?=\d) matches a word followed by a digit, without matching the digit. This construct does not backtrack.
(?<=subexpression) | (Zero-width positive lookbehind assertion.) Continues the match only if the subexpression matches at this position on the left. For example, (?<=19)99 matches instances of 99 that follow 19. This construct does not backtrack.
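The (?<=19)99 example from the definition can be tried directly. This is only a small sketch; the class name LookbehindDemo, the helper IndexesOf99After19 and the sample string are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: start indexes of every "99" that is
    // immediately preceded by "19". The "19" is only checked,
    // it is not included in the match.
    public static List<int> IndexesOf99After19(string input)
    {
        var indexes = new List<int>();
        foreach (Match m in Regex.Matches(input, @"(?<=19)99"))
        {
            indexes.Add(m.Index);
        }
        return indexes;
    }

    public static void Main()
    {
        // Only the 99 in 1999 qualifies; 2099 and 1899 are skipped.
        foreach (int i in IndexesOf99After19("1999 2099 1899"))
        {
            Console.WriteLine(i); // 2
        }
    }
}
```

The reported index is 2, the start of 99 itself, which shows that the 19 on the left was tested without becoming part of the match.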
Here's what I understand:
(1). Zero width
This means that what gets matched is a position (location), not the text of the subexpression.
(2). Lookahead and lookbehind
(?=subexpression), lookahead: matches the position immediately before text that matches the subexpression; the subexpression is tested from left to right.
(?<=subexpression), lookbehind: matches the position immediately after text that matches the subexpression; the subexpression is tested from right to left.
We can imagine a pointer to the current position during pattern matching. When the subexpression matches, a lookahead leaves the pointer in front of the text the subexpression matched, while a lookbehind leaves it behind that text. You can refer to the code above.
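The pointer picture can be checked by using an assertion on its own, so that the match is nothing but a position. In this sketch the class name ZeroWidthPointer and the helper ZeroWidthIndex are hypothetical; the string muing is reused from the earlier example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ZeroWidthPointer
{
    // Hypothetical helper: index of the first zero-length match,
    // or -1 if there is no match or the match consumed characters.
    public static int ZeroWidthIndex(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return (m.Success && m.Length == 0) ? m.Index : -1;
    }

    public static void Main()
    {
        // Lookahead: the position in front of "ing".
        Console.WriteLine(ZeroWidthIndex(@"(?=ing)", "muing"));  // 2

        // Lookbehind: the position behind "mu". Same spot, seen from the left.
        Console.WriteLine(ZeroWidthIndex(@"(?<=mu)", "muing"));  // 2

        // Replacing a zero-width match inserts text at that position
        // without removing any characters.
        Console.WriteLine(Regex.Replace("muing", @"(?=ing)", "-")); // mu-ing
    }
}
```

Both assertions land on index 2 with length 0: one finds it by looking right at ing, the other by looking left at mu.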
Regular expressions: zero-width assertions (zero-width positive lookahead assertion)