String "does not contain" matching techniques in regular expressions

Source: Internet
Author: User

We often need to find text that does not contain a given string. The first regular expression that comes to mind for filtering out the string "hede" is usually ^(hede), but this is wrong: it matches text that starts with "hede". We could also write [^hede], but that expression means something else entirely. It is a character class that matches a single character that is not 'h', 'e', or 'd'. So what kind of regular expression matches only text that does not contain the complete string "hede"?
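To make the two failed attempts concrete, here is a minimal Python sketch. The sample strings are made up for illustration and do not come from the original article.

```python
import re

# Hypothetical sample inputs, chosen only for illustration.
samples = ["hede", "ABhedeCD", "head", "xyz"]

# ^(hede) matches text that STARTS with "hede"; it negates nothing.
starts_with = [s for s in samples if re.search(r"^(hede)", s)]

# [^hede] is a character class: one character that is not h, e, or d.
# Any text with at least one such character matches, even if it contains "hede".
has_other_char = [s for s in samples if re.search(r"[^hede]", s)]

print(starts_with)     # ['hede']
print(has_other_char)  # ['ABhedeCD', 'head', 'xyz']
```

Neither result is the set of strings without "hede": the first keeps only the string we wanted to exclude, and the second keeps "ABhedeCD" even though it contains "hede".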

In fact, it is not entirely correct to say that regular expressions do not support inverse matching. For this problem we can use a negative lookahead to simulate inverse matching:

^((?!hede).)*$

The expression above matches only text that does not contain the string "hede". Inverse matching is not what regular expressions are best suited for, but it can be done this way.
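As a quick check, the pattern can be applied to a handful of lines. This is a minimal Python sketch; the input lines are hypothetical sample data.

```python
import re

# Pattern from the article: no position in the line may be followed by "hede".
NOT_HEDE = re.compile(r"^((?!hede).)*$")

# Hypothetical sample lines, not taken from the original article.
lines = ["hoho", "hihi", "haha", "hede", "ABhedeCD", ""]

kept = [line for line in lines if NOT_HEDE.match(line)]
print(kept)  # ['hoho', 'hihi', 'haha', '']
```

Note that the empty line is kept as well, because the group may repeat zero times. The match is also case-sensitive unless a flag such as re.IGNORECASE is passed.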

Explanation

A string is a sequence of n characters. Before and after each character there is an empty string, so a string of n characters has n+1 empty strings. Take the string "ABhedeCD":

    e1 A e2 B e3 h e4 e e5 d e6 e e7 C e8 D e9

All the numbered e positions are empty strings. The expression (?!hede). looks ahead to check that the text "hede" does not start at the current position; if it does not (something else follows), the . (dot) matches the next character. This kind of lookaround is also called a zero-width assertion, because it does not consume any characters; it only tests a condition.

In the example above, each empty string is first checked to confirm that "hede" does not lie directly ahead, and only then does the . (dot) consume one character. The expression (?!hede). performs this check only once, so we wrap it in a group and apply * (an asterisk) to repeat it zero or more times: ((?!hede).)*. Finally, the anchors ^ and $ ensure that the whole input is consumed: ^((?!hede).)*$.

You can now see why the regular expression ^((?!hede).)*$ fails to match the string "ABhedeCD": at position e3 the lookahead (?!hede) fails because "hede" lies directly ahead, which means the string contains the forbidden text.
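The walk through the empty-string positions can be reproduced with a short Python sketch. It tests the bare lookahead at each offset using the pos argument of Pattern.match; the numbering e1 to e9 follows the explanation above, and the variable names are only illustrative.

```python
import re

text = "ABhedeCD"
lookahead = re.compile(r"(?!hede)")

# Test the negative lookahead at every gap e1..e9 (offsets 0..len(text)).
results = []
for offset in range(len(text) + 1):
    ok = lookahead.match(text, offset) is not None
    results.append(ok)
    print(f"e{offset + 1}: {'passes' if ok else 'fails'}")

# The only failing gap is e3, right before "hede".
first_failure = results.index(False) + 1

# Without anchors the group simply stops before "hede" and matches "AB";
# with ^ and $ the whole string must be consumed, so the match fails.
unanchored = re.match(r"((?!hede).)*", text).group(0)
anchored = re.match(r"^((?!hede).)*$", text)
```

The unanchored result shows why the ^ and $ anchors matter: without them the pattern still finds a partial match in a string that contains "hede".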

In regular expressions, the (?!...) construct is a negative lookahead, and it lets us solve the problem of matching a string that "does not contain" a given text.

[English Original: Regular expression to match string not containing a word?]

Article from: Foreign periodicals It review
