Regular expression tricks: matching strings that do not contain a given string

Source: Internet
Author: User

We often need to find text that does not contain a particular string. Matching lines that contain "hede" is easy with a regular expression, but here we want the lines without "hede". The first thing many programmers try is [^hede], but that expression means something else entirely: it matches a single character that is not 'h', 'e', or 'd'. What kind of regular expression selects only the text that does not contain the complete string "hede"?
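To make the pitfall concrete, here is a minimal sketch using Python's re module. The helper name first_match is made up for this illustration; it only shows that [^hede] is a negated character class that consumes one character at a time and says nothing about the word "hede" as a whole.

```python
import re

# [^hede] is a negated character class: exactly one character
# that is not 'h', 'e', or 'd'. It does not mean "not the word hede".
char_class = re.compile(r"[^hede]")

def first_match(text):
    """Hypothetical helper: return the first substring matched by [^hede], or None."""
    m = char_class.search(text)
    return m.group(0) if m else None

# The line contains "hede", yet the class still matches (the space character).
print(repr(first_match("hede says hi")))

# The line does not contain "hede", yet nothing matches,
# because every character is an h, e, or d.
print(repr(first_match("heed")))
```

In both cases the answer is the opposite of what a "does not contain hede" filter should return.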

In fact, it is not entirely correct to say that regular expressions do not support inverse matching. For this problem, we can use a negative lookahead to simulate it:

^((?!hede).)*$

The expression above matches any line that does not contain the string "hede". This is not what regular expressions are best at, but it does work.
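As a rough illustration, the expression can be used to filter a list of lines in Python. The function name lines_without_hede and the sample data are assumptions made for this sketch, and it assumes each line is a single-line string with no trailing newline.

```python
import re

# Matches a whole line only if "hede" starts at no position in it.
NO_HEDE = re.compile(r"^((?!hede).)*$")

def lines_without_hede(lines):
    """Hypothetical helper: keep the lines that do not contain 'hede'."""
    return [line for line in lines if NO_HEDE.match(line)]

sample = ["hoho", "hihi", "haha", "hede", "ABhedeCD", ""]
print(lines_without_hede(sample))
```

Note that the empty line is kept as well, since zero repetitions of the group satisfy the pattern. For a fixed literal such as "hede", a plain substring test ("hede" not in line) is simpler; the regex form is mainly useful where only a pattern can be supplied.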

Explanation

A string is made up of n characters. Before and after each character there is an empty string, so a string of n characters has n+1 empty strings. Take the string "ABhedeCD" as an example:

e1 A e2 B e3 h e4 e e5 d e6 e e7 C e8 D e9

Each numbered e marks an empty string. The expression (?!hede). looks ahead to check that the string "hede" does not start at the current position; if it does not (something else follows), the . (dot) matches the next character. This kind of lookaround is also called a zero-width assertion, because it does not consume any characters: it only checks a condition.

In the example above, each empty string is first checked to confirm that "hede" does not follow it; only then does the . (dot) consume one character. The expression (?!hede). does this just once, so we wrap it in a group and repeat it zero or more times with the * (asterisk) quantifier:

((?!hede).)*

Once the ^ and $ anchors are added so that the entire input must be consumed, the regular expression ^((?!hede).)*$ does not match the string "ABhedeCD": at position e3 the lookahead (?!hede) fails, because "hede" begins right there, which means the input contains the forbidden string.
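The walk through the positions can be reproduced with a short Python sketch. The name lookahead_trace is invented for this illustration; it tries the bare lookahead (?!hede) at every empty-string position e1 to e9 of "ABhedeCD" and records where it succeeds.

```python
import re

# The bare zero-width assertion: succeeds wherever "hede" does NOT start.
LOOKAHEAD = re.compile(r"(?!hede)")

def lookahead_trace(text):
    """Hypothetical helper: for each position, report whether (?!hede) holds there."""
    return [LOOKAHEAD.match(text, pos) is not None
            for pos in range(len(text) + 1)]

trace = lookahead_trace("ABhedeCD")
for i, ok in enumerate(trace, start=1):
    print(f"e{i}: {'ok' if ok else 'fails'}")

# Because one position fails, the fully anchored expression cannot
# step past it, so the whole match fails.
print(re.match(r"^((?!hede).)*$", "ABhedeCD"))
```

Only the third position (e3, index 2 when counting from zero) fails, which is exactly where "hede" begins.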

In regular expressions, (?!...) is a negative lookahead. It helps us solve the problem of matching strings that "do not contain" something.

Here are some additions:

After seeing Regex Golf on Hacker News, I found some interesting regular expression problems that require negative matching, such as matching strings that do not contain a certain word.

Before we start, let's take a look at the syntax of regular expressions:

[abc]     a, b, or c                     .   any single character             a?        zero or one a
[^abc]    any character except a, b, c   \s  whitespace                       a*        zero or more a
[a-z]     any character from a to z      \S  non-whitespace                   a+        one or more a
[a-zA-Z]  any character a-z or A-Z       \d  any digit                        a{n}      exactly n of a
^         start of line                  \D  any non-digit                    a{n,}     n or more of a
$         end of line                    \w  alphanumeric or underscore       a{n,m}    between n and m of a
(...)     parentheses for grouping       \W  not alphanumeric or underscore   a*?       zero or more a (non-greedy)
(a|b)     a or b                         \b  word boundary                    (a)...\1  backreference to the group
(?=a)     followed by a                  \B  not a word boundary              (?!a)     not followed by a

Regular expressions provide (?=a) and (?!a) to assert that something does or does not follow the current position.

So when certain content must not be matched, you can use (?!a). For example, to match a string that does not contain hello, you can write:

^(?!.*hello)

Here .* allows for other characters before hello. Why add ^? Because without it, the expression could still match at a position after the h.
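A small Python sketch shows the effect of the anchor. The helper accepts and the sample strings are assumptions for this illustration; the comparison is between the anchored expression and the same lookahead without ^.

```python
import re

anchored = re.compile(r"^(?!.*hello)")
unanchored = re.compile(r"(?!.*hello)")

def accepts(pattern, text):
    """Hypothetical helper: True if the pattern matches somewhere in text."""
    return pattern.search(text) is not None

# With ^, the lookahead is only tried at the start of the string.
print(accepts(anchored, "say hello"))    # rejected: hello is ahead
print(accepts(anchored, "say goodbye"))  # accepted

# Without ^, the search slides forward until the lookahead succeeds,
# which happens just past the 'h' of "hello".
m = unanchored.search("say hello")
print(m.start())
```

The unanchored version therefore accepts every string, since some position near the end always has no "hello" ahead of it.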

Now we can solve the Abba problem on Regex Golf.
The problem is to match words that do not contain an "abba" pattern (two characters followed by the same two in reverse order); for example, abba and anallagmatic should not match.

Regular expression code:

^(?!.*(.)(.)\2\1)
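The expression can be tried out in Python as a quick check. The function name has_no_abba and the word list are made up for this sketch and are not taken from the Regex Golf test set, apart from the two words mentioned above.

```python
import re

# (.)(.)\2\1 finds two characters followed by the same two reversed;
# the negative lookahead rejects any word where that occurs.
NO_ABBA = re.compile(r"^(?!.*(.)(.)\2\1)")

def has_no_abba(word):
    """Hypothetical helper: True if the word contains no xyyx pattern."""
    return NO_ABBA.match(word) is not None

for word in ["abba", "anallagmatic", "noon", "banana", "hello"]:
    print(word, has_no_abba(word))
```

"anallagmatic" is rejected because of the substring "alla". Since both groups may capture the same character, a run such as "aaaa" is rejected too.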

Negative lookahead also solves the Prime problem, which asks us to match strings made of a prime number of x characters. First look at the regular expression:

^(?!(xx+)\1+$)

(xx+) matches two or more x characters, and (xx+)\1+ matches that same block repeated two or more times in total. So (xx+)\1+$ matches exactly the strings whose length is a composite number (a product of two factors that are both at least 2). The prime-length strings are what remain once these composite-length strings are excluded, which is what the regular expression above does.
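The reasoning above can be checked with a short Python sketch. The helper name passes is hypothetical; it builds a string of n x characters and tests it against the expression.

```python
import re

# The lookahead body (xx+)\1+$ succeeds only for composite lengths,
# so the negative lookahead lets every non-composite length through.
NOT_COMPOSITE = re.compile(r"^(?!(xx+)\1+$)")

def passes(n):
    """Hypothetical helper: does a string of n 'x' characters pass the regex?"""
    return NOT_COMPOSITE.match("x" * n) is not None

print([n for n in range(2, 20) if passes(n)])
```

One caveat: lengths 0 and 1 also pass, because they are not composite either, even though they are not prime. A stricter check would need to require at least two x characters in addition.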
