This section covers position matching in regular expressions. It has two parts: the relatively simple anchors and word boundaries, and the more complex zero-width assertions. The extra complexity pays off: the expressions are harder to write, but the positions they can describe are also more intricate. Zero-width assertions are also called lookaround. This part is important.
Anchors
The standard anchors are ^ and $. In general, when no mode is set, the caret (^) matches the start position of the text, and $ matches the end of the text or the position just before a line break \n at the end of the text. For example:
Regular expression: s$
This regular expression matches a line that ends with the character "s", that is, a string such as "...s\n".
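The anchor behavior described above can be checked with a short sketch. It assumes Python's re module, which the article itself does not name; other tools may treat the trailing line break differently.

```python
import re

pattern = re.compile(r"s$")

# "$" matches at the very end of the text...
ends_with_s = pattern.search("regular expressions") is not None
# ...and also just before a single "\n" at the end of the text.
before_newline = pattern.search("regular expressions\n") is not None
# There is no match when the character before the end is not "s".
no_match = pattern.search("regular expression") is None
# With no mode set, "^" matches only at the start of the text.
starts_with_r = re.search(r"^r", "regular") is not None

print(ends_with_s, before_newline, no_match, starts_with_r)
```

All four values are expected to be True under these assumptions.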
Word boundaries
Support for word boundaries varies between tools. \b is the most widely supported, and some tools also support \< and \>. \b matches a word boundary, that is, either the start or the end of a word, while \< and \> match the start and the end of a word respectively. This metacharacter is simple, but each tool has its own definition of a "word character", so the positions that a word boundary matches can differ from tool to tool. In general, ordinary words such as [happy] and [new] always match. Word boundaries do not analyze the meaning of a word; the rule is usually just a run of consecutive word characters. Text such as [a1d2c3d4] therefore counts as a single word, while text such as [I .t] does not.
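As a minimal illustration of the word boundary, the sketch below uses Python's re module (an assumption; the definition of a word character differs between tools, and Python does not support \< and \>).

```python
import re

# \b matches at the start or the end of a run of word characters.
words = re.findall(r"\b\w+\b", "happy new a1d2c3d4")
# -> ['happy', 'new', 'a1d2c3d4']

# A space and a period are not word characters, so "I .t" is two words.
pieces = re.findall(r"\b\w+\b", "I .t")
# -> ['I', 't']

# \bnew\b matches "new" only as a whole word, not inside "news" or "renew".
whole_word = re.findall(r"\bnew\b", "new news renew")
# -> ['new']

print(words, pieces, whole_word)
```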
Lookaround (zero-width assertions)
Lookaround is also called a zero-width assertion. The name can be read literally. Zero width means that the width of the match is 0: matching one character has width 1, while matching a position consumes no characters and so has width 0. Assertion means that a certain condition must hold. A zero-width assertion therefore matches a position at which a certain condition holds.
There are two kinds of lookaround, lookahead and lookbehind, each in a positive and a negative form:
[Positive lookahead (?=Expression)]
[Negative lookahead (?!Expression)]
[Positive lookbehind (?<=Expression)]
[Negative lookbehind (?<!Expression)]
Lookahead tests the text to the right of the current position, reading from left to right; lookbehind tests the text to the left of the current position. Lookbehind is restricted in current regular expression tools; that restriction is covered at the end of this section. First, the matching rules for each form.
For positive lookahead (?=Expression): when the subexpression Expression (the part after ?=) matches the text to the right of the current position, (?=Expression) succeeds and reports to the engine that the match succeeds at the current position. When Expression fails to match the text on the right, (?=Expression) fails.
For negative lookahead (?!Expression): when the subexpression Expression (the part after ?!) fails to match the text to the right of the current position, (?!Expression) succeeds and reports to the engine that the match succeeds at the current position. When Expression matches the text on the right, (?!Expression) fails.
For positive lookbehind (?<=Expression): when the subexpression Expression (the part after ?<=) matches the text to the left of the current position, (?<=Expression) succeeds and reports to the engine that the match succeeds at the current position. When Expression fails to match the text on the left, (?<=Expression) fails.
For negative lookbehind (?<!Expression): when the subexpression Expression (the part after ?<!) fails to match the text to the left of the current position, (?<!Expression) succeeds and reports to the engine that the match succeeds at the current position. When Expression matches the text on the left, (?<!Expression) fails.
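The four rules above can be compared side by side in a small sketch. It assumes Python's re module; the sample text and the helper starts() are made up for illustration and do not come from the article.

```python
import re

text = "foobar foobaz bazbar"  # hypothetical sample text


def starts(pattern: str) -> list[int]:
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]


positive_lookahead = starts(r"foo(?=bar)")    # "foo" followed by "bar"
negative_lookahead = starts(r"foo(?!bar)")    # "foo" not followed by "bar"
positive_lookbehind = starts(r"(?<=foo)bar")  # "bar" preceded by "foo"
negative_lookbehind = starts(r"(?<!foo)bar")  # "bar" not preceded by "foo"

# A lookaround used on its own matches a position, so the match is empty.
position_only = re.search(r"(?=baz)", text)

print(positive_lookahead, negative_lookahead)
print(positive_lookbehind, negative_lookbehind)
print(position_only.span(), repr(position_only.group()))
```

Each pattern reports a different index because the lookaround only tests the neighboring text; it never becomes part of the match itself.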
The following example, taken from the web, uses negative lookahead.
Source string: aa<p>one</p>bb<div>two</div>cc
Regular expression: <(?!/?p\b)[^>]+>
This regular expression matches all tags except <p...> and </p>.
In this regular expression, < matches itself. (?!/?p\b) is a negative lookahead whose subexpression is /?p\b, meaning that the text to the right of < cannot be [/p] or [p]. The question mark, as you may recall, makes the preceding / optional: it matches zero or one time. The \b requires a word boundary after p, so tags whose names merely begin with p are not excluded. Next, [^>]+ uses the negated character class [^>], which matches any character except [>]; the + quantifier requires at least one such character, with no upper limit. The final > matches itself.
The expression as a whole means: match text of the form <(not followed by p or /p)(one or more characters other than >)>. This example also shows that an expression can usually be split into subexpressions, which are then joined to form the complete expression.
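To try the expression on the source string, here is a minimal sketch that assumes Python's re module. The second input, with a pre tag, is an added illustration and is not part of the original example.

```python
import re

tag_pattern = re.compile(r"<(?!/?p\b)[^>]+>")

source = "aa<p>one</p>bb<div>two</div>cc"
tags = tag_pattern.findall(source)
# -> ['<div>', '</div>']

# Because of \b, only the tag named exactly "p" is excluded:
# in "<pre>" there is no word boundary between "p" and "r".
pre_tags = tag_pattern.findall("<pre>x</pre>")
# -> ['<pre>', '</pre>']

print(tags, pre_tags)
```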
Next, an example that uses positive lookbehind.
Source string: <div>a test</div>
Regular expression: (?<=<div>)[^<]+(?=</div>)
This regular expression matches the content between the <div> and </div> tags, excluding the tags themselves.
(?<=<div>) requires that the text to the left is <div>, and (?=</div>) requires that the text to the right is </div>.
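The same example can be run as a short sketch, assuming Python's re module. It shows that the tags are checked by the lookbehind and the lookahead but are not included in the match.

```python
import re

source = "<div>a test</div>"
match = re.search(r"(?<=<div>)[^<]+(?=</div>)", source)

content = match.group()  # only the text between the tags
span = match.span()      # the match starts right after "<div>"

print(content, span)  # a test (5, 11)
```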
For more information, see the second reference. Now back to lookbehind and its restriction: in many tools, lookbehind can only be used with subexpressions that match text of limited length, and some tools require that length to be fixed. Such tools reject subexpressions like (?<!books?) or (?<!\w+) because the length of the text they match is not fixed. For the same reason, some tools do not support lookbehind at all.
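The restriction can be observed directly in a tool that requires fixed-length lookbehind. The sketch below assumes Python's re module, which is one such tool; the helper compiles() is hypothetical and exists only for this illustration. Other engines may accept some of these patterns.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if the re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


fixed_ok = compiles(r"(?<!book)s")        # fixed length: accepted
optional_rejected = compiles(r"(?<!books?)")  # 4 or 5 characters: rejected
unbounded_rejected = compiles(r"(?<!\w+)")    # unbounded length: rejected

print(fixed_ok, optional_rejected, unbounded_rejected)  # True False False
```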
References: Mastering Regular Expressions
Regular expression basics: lookaround