We often need to find text that does not contain a given string. The first thing that comes to mind is usually a regular expression such as ^(hede) to filter out the "hede" string, but this is wrong. We could also write [^hede], but that regular expression means something else entirely: it is a character class that matches any single character other than 'h', 'e', or 'd'. So what kind of regular expression can pick out text that does not contain the complete "hede" string?
In fact, it is not entirely correct to say that regular expressions do not support inverse matching. For this problem, we can use a negative lookahead to simulate inverse matching:
^((?!hede).)*$
The expression above matches text that does not contain the "hede" string. This is not something regular expressions are "good" at, but they can be used this way.
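As a minimal sketch of how this pattern might be used, the following Python snippet keeps only the lines that do not contain "hede". The helper name lines_without and the sample lines are made up for illustration, and the character class [^hede] appears only for contrast.

```python
import re

# Negative-lookahead pattern from the text: no position may be followed by "hede".
NOT_HEDE = re.compile(r'^((?!hede).)*$')

def lines_without(lines):
    """Return the lines that do not contain 'hede' (hypothetical helper)."""
    return [line for line in lines if NOT_HEDE.match(line)]

sample = ["hoho", "hihi", "haha", "hede", "ABhedeCD"]
kept = lines_without(sample)

# For contrast: [^hede] is a character class, so it matches any single
# character other than 'h', 'e' or 'd', such as the 'A' in "ABhedeCD".
class_hit = re.search(r'[^hede]', "ABhedeCD")
```

In ordinary Python code a plain substring test ('hede' not in line) would be simpler; the regex form is mainly useful where only a pattern can be supplied.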
Explanation
A string is made up of n characters, and there is an empty string before and after each character. A string of n characters therefore has n+1 empty strings. Take the string "ABhedeCD":
e1 A e2 B e3 h e4 e e5 d e6 e e7 C e8 D e9
All the e-numbered positions are empty strings. The expression (?!hede). looks ahead to check that the text in front of it is not "hede"; if it is not (it is some other characters), the . (dot) matches the next character. This kind of "look" in a regular expression is also called a zero-width assertion, because it does not consume any characters; it only checks a condition.
In the example above, each empty string first checks that the text in front of it is not "hede"; if it is not, the . (dot) matches the next character. The expression (?!hede). does this only once, so we wrap it in parentheses to form a group and then apply * (an asterisk), which matches 0 or more times: ((?!hede).)*
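The three steps described above (the lookahead alone, the lookahead plus a dot, and the repeated group) can be checked with a short Python sketch. The variable names are illustrative only.

```python
import re

text = "ABhedeCD"

# The lookahead alone succeeds at position 0 but consumes nothing (zero width).
only_lookahead = re.match(r'(?!hede)', text)

# Lookahead plus dot: one check, then exactly one character is consumed.
one_step = re.match(r'(?!hede).', text)

# Repeating the group keeps consuming characters until the lookahead fails.
repeated = re.match(r'((?!hede).)*', text)
```

Without anchors, the repeated group simply stops in front of "hede", so the match covers only the characters before it.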
Once the ^ and $ anchors are added so that the whole string must be consumed, the regular expression ^((?!hede).)*$ fails to match "ABhedeCD": at position e3 the lookahead (?!hede) fails, because "hede" lies directly in front of it, which means the string contains the specified string.
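To see where the failure happens, a small Python sketch can test the lookahead at every position of "ABhedeCD". It assumes the positions are numbered e1 to e9 from left to right, with e1 in front of the first character; the variable names are illustrative.

```python
import re

text = "ABhedeCD"
lookahead = re.compile(r'(?!hede)')

# e1 sits before index 0, e2 before index 1, and so on up to e9 at the end.
failed = [
    f"e{i + 1}"
    for i in range(len(text) + 1)
    if lookahead.match(text, i) is None
]

# The anchored pattern has to get past every position, so a single
# failing position is enough to reject the whole string.
whole = re.match(r'^((?!hede).)*$', text)
```

Here failed contains only "e3", and whole is None because the anchored pattern cannot get past that position.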
In regular expressions, ?! is the negative lookahead syntax, and it helps us solve the "does not contain" matching problem.
[English Original: Regular expression to match string not containing a word?]
Article from: Foreign periodicals IT review
String "does not contain" matching techniques in regular expressions