Excluding specific strings with regular expressions

Last Update: 2015-12-11 Source: Internet

Author: User

Additional knowledge point one: extracting links with a regular expression in ASP.NET

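using System.Text.RegularExpressions;

// Capture the href value (group 1) and the link text (group 2) of each anchor.
Regex reg = new Regex("href=\"([^\"]+)\"[^>]*>([^<]+)</a>", RegexOptions.IgnoreCase);
MatchCollection matches = reg.Matches(html); // html: the page source to scan
foreach (Match match in matches)
{
    if (match.Success)
    {
        string urls = match.Groups[1].ToString(); // link target
        string name = match.Groups[2].ToString(); // link text
    }
}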

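Additional knowledge point two: filtering text with regular expressions in ASP.NET

// The start of each call is missing from the source; "text" is a placeholder for the string being filtered.
text = Regex.Replace(text, "^\\s*", string.Empty, RegexOptions.Compiled | RegexOptions.Multiline); // strip leading whitespace
text = Regex.Replace(text, "\\r\\n", string.Empty, RegexOptions.Compiled | RegexOptions.Multiline); // filter line breaks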

1. Find strings that do not begin with baidu.
baidu.com
sina.com.cn

Regex: ^(?!baidu).*$ matches line 2 only; line 1 is excluded.
This uses the zero-width assertion (?!exp), the negated form of the positive lookahead (?=exp).
(?=exp) matches the position immediately before exp. Replacing the equals sign with an exclamation mark negates it: (?!exp) matches a position that is not immediately followed by exp.
A negative lookahead is normally combined with an anchor such as the start or end of a line. The example above works as follows:
^(?!baidu).*$ matches the position at the start of the line and requires that this position is not followed by the string baidu. On the first line the text right after ^ is baidu, so the match fails and the line is excluded.
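
As a minimal sketch, the first example can be checked with Python's re module. Python is an assumption made here for illustration only; the pattern is the one discussed above, written in lowercase.

```python
import re

# Sample lines from example 1.
lines = ["baidu.com", "sina.com.cn"]

# ^ pins the lookahead to the start of the line;
# (?!baidu) fails if the text right after that position is "baidu".
not_baidu = re.compile(r"^(?!baidu).*$")

kept = [line for line in lines if not_baidu.match(line)]
print(kept)  # ['sina.com.cn']
```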

2. Find strings that do not end with com.
www.sina.com.cn
www.114369.cn
www.114345.cn
www.114380.com
www.i029.com

Regex: ^.*?(?<!com)$ matches the first three lines.
To find strings that do end with com, use ^.*?(?<=com)$ or simply ^.*?com$
Explanation of the regular expression ^.*?(?<!com)$
It first matches the start of the line. Next comes .*?, which is lazy (non-greedy): it matches as few characters as possible, starting with none. (?<!com) is a negative lookbehind: the text immediately before the current position must not be com. Finally $ matches the end of the line.
Take www.123.com. After the start of the line, .*? matches nothing; the lookbehind succeeds there because nothing precedes that position, but $ fails because this is not the end of the line. The engine backtracks and lets .*? match one w; the lookbehind succeeds again after the first w, and $ fails again. This repeats until .*? has consumed all of www.123.com. The position is now after the final m and the text before it is com, so the lookbehind fails. Nothing is left to try, so the whole match fails and www.123.com is excluded. The result is the same whether or not .* is followed by the question mark.
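
The lookbehind patterns can be tried on the five sample lines. This is a small sketch assuming Python's re module and the lowercase form of the sample lines; the variable names are illustrative.

```python
import re

# Sample lines from example 2.
lines = [
    "www.sina.com.cn",
    "www.114369.cn",
    "www.114345.cn",
    "www.114380.com",
    "www.i029.com",
]

# (?<!com)$ : the end of the line must NOT be preceded by "com".
not_com = re.compile(r"^.*?(?<!com)$")
# (?<=com)$ : the end of the line MUST be preceded by "com".
ends_com = re.compile(r"^.*?(?<=com)$")

without_com = [line for line in lines if not_com.match(line)]
with_com = [line for line in lines if ends_com.match(line)]

print(without_com)  # ['www.sina.com.cn', 'www.114369.cn', 'www.114345.cn']
print(with_com)     # ['www.114380.com', 'www.i029.com']
```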

3. Find lines that do not contain if.
if (a>b)
printf ("Hello");
else if (a<b)
printf ("Hello2");
else
printf ("Hello3");

Regex: ^([^f]|[^i]f)+$
This is also a string-exclusion match, but it differs from the previous two: the if may appear anywhere in the line, not only at the start or the end. That makes the match harder, because regular expressions have no built-in "does not contain this string" operator. The first idea that comes to mind is the following:
^[^if]+$ looks right, but the negated character class excludes the two characters i and f, not the string if, so this expression matches only lines that contain neither i nor f. Lines that contain one or more i, one or more f, or both as long as they are not adjacent, are lines we need to match. We want to reject only lines that contain the string if, not lines that contain the character i or f, so this expression has a serious flaw.
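
To see how much the character class over-rejects, here is a minimal sketch that runs ^[^if]+$ over the sample lines of example 3. It assumes Python's re module and the lowercase form of the sample.

```python
import re

# Sample lines from example 3.
lines = [
    'if (a>b)',
    'printf ("Hello");',
    'else if (a<b)',
    'printf ("Hello2");',
    'else',
    'printf ("Hello3");',
]

# [^if] rejects any single i or f, not the two-character string "if".
char_class = re.compile(r"^[^if]+$")

matched = [line for line in lines if char_class.match(line)]
print(matched)  # ['else']
```

The printf lines contain no if, yet they are rejected because printf contains the letters i and f.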

^.*(?!if).*$ uses a zero-width assertion. On the surface it seems to say "any characters, then not if, then any characters", but tracing the match shows that this is wrong. (?!if) matches a position, so the expression still matches the string aifb, which is exactly what we do not want. For aifb, the engine first matches the start of the line. The first .* is greedy, so it consumes everything up to the end of the string and leaves the engine positioned after the b. (?!if) must now hold at that position, meaning the position must not be followed by if. Nothing follows the b, so the condition is satisfied. The second .* matches nothing, $ matches the end of the line, and the whole match succeeds.
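
The trace above can be reproduced with a short sketch. It assumes Python's re module; the two capture groups are added here only to show what each .* consumed and are not part of the original pattern.

```python
import re

# Same pattern as above, with each .* wrapped in a group for inspection.
loose = re.compile(r"^(.*)(?!if)(.*)$")

m = loose.match("aifb")
print(m is not None)  # True: the line is NOT excluded
print(m.groups())     # ('aifb', ''): the greedy first .* swallowed everything
```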

Put generally: suppose I want to exclude the string abc. For a string such as helloworldabchelloworld, (?!abc) succeeds at the positions before h, e, l, l, o, w, o, r, l, d, and so on. The engine never has to test the lookahead at the position where abc starts, so (?!abc) always finds a place to succeed and excludes nothing. Examples 1 and 2 work because the assertion is pinned in place by the start or end of the line. For example, to match lines that do not begin with abc, in ^(?!abc) the lookahead can only be tested at the start of the line, so the match fails for strings that begin with abc.
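
A small sketch makes the point about positions concrete. It assumes Python's re module and tests the lookahead at every offset of the sample string, then shows the anchored form.

```python
import re

text = "helloworldabchelloworld"
lookahead = re.compile(r"(?!abc)")

# Offsets at which the lookahead fails when tested one position at a time.
blocked = [i for i in range(len(text) + 1) if lookahead.match(text, i) is None]
print(blocked)  # [10]: only the offset where "abc" starts

# Anchored to the start of the line, the lookahead has exactly one position to test.
anchored = re.compile(r"^(?!abc)")
print(anchored.match("abcdef") is None)  # True: starts with abc, rejected
print(anchored.match(text) is not None)  # True: starts with hello, accepted
```

Unanchored, the engine is free to choose any of the other offsets, which is why the lookahead excludes nothing.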

So for the regular expression ^.*(?!abc).*$ the problem is this: how can we make the first .* match exactly the helloworld in helloworldabcxxx, so that the lookahead is tested right in front of abc?

For the problem above, our answer is ^([^f]|[^i]f)+$. It splits the line into two kinds of pieces. The first is a character that is not f, which can never complete an if. The second is an f, which we allow only when the character before it is not i. Covering both the f case and the non-f case is intended to handle every situation.
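
The two-case pattern can be tried against the six sample lines of example 3. A minimal sketch, assuming Python's re module and the lowercase form of the pattern:

```python
import re

# Sample lines from example 3.
lines = [
    'if (a>b)',
    'printf ("Hello");',
    'else if (a<b)',
    'printf ("Hello2");',
    'else',
    'printf ("Hello3");',
]

# Each step consumes either one non-f character, or an f preceded by a non-i character.
no_if = re.compile(r"^([^f]|[^i]f)+$")

kept = [line for line in lines if no_if.match(line)]
print(kept)
# ['printf ("Hello");', 'printf ("Hello2");', 'else', 'printf ("Hello3");']
```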

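In fact, this answer is not perfect. It is workable only because the excluded string if has just two characters, i and f. If we wanted to exclude the string helloworld, this method is clearly impractical: how many cases would we have to enumerate? It also has edge cases of its own: a line that begins with f, such as for, is rejected even though it contains no if, and iff is accepted even though it does.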

In this case we use ^(?!.*helloworld).*$. Here the first .* has been moved inside the zero-width assertion. The engine matches the start of the line and then requires that this position is not followed by anything of the form .*helloworld, in plain words any run of characters that ends in helloworld, such as xxxxxxxxxxxxxxxxxxhelloworld. Because .* can skip any number of characters, the lookahead finds helloworld wherever it occurs, so every line that contains helloworld is excluded right at the start of the line.
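
The same construction works for any literal string. The sketch below assumes Python's re module; excluding is a hypothetical helper introduced here for illustration, not something from the original article.

```python
import re

def excluding(word):
    """Build a pattern that matches only lines NOT containing `word` (hypothetical helper)."""
    # re.escape keeps characters such as "." in `word` literal.
    return re.compile(r"^(?!.*" + re.escape(word) + r").*$")

no_helloworld = excluding("helloworld")
samples = ["helloworldabcxxx", "xxx helloworld", "hello world", "abc"]
kept = [s for s in samples if no_helloworld.match(s)]
print(kept)  # ['hello world', 'abc']

# The same helper handles the "if" case from example 3, including lines that start with f.
no_if = excluding("if")
print(no_if.match("for (;;)") is not None)  # True: no "if" in the line
print(no_if.match("else if (a<b)") is None)  # True: contains "if", rejected
```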

An expression I wrote earlier:

a((?<!virtual).)+b

It is meant to match text that starts with a and ends with b, with no virtual string in between.
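
A rough sketch of how this expression behaves, assuming it is meant to read a((?<!virtual).)+b (the opening parenthesis is a reconstruction) and assuming Python's re module. fullmatch is used so that the whole sample string has to fit the pattern.

```python
import re

# Assumed reading of the expression above: every character consumed between
# a and b must not be immediately preceded by "virtual".
no_virtual = re.compile(r"a((?<!virtual).)+b")

plain = no_virtual.fullmatch("a_public_b")
blocked = no_virtual.fullmatch("a_virtual_method_b")
edge = no_virtual.fullmatch("avirtualb")

print(plain is not None)  # True
print(blocked is None)    # True: the "_" after "virtual" fails the lookbehind
print(edge is not None)   # True: nothing is consumed after "virtual", so it slips through
```

The lookbehind is checked before each consumed character, not before the final b, so a virtual that sits directly in front of the closing b is not caught. Moving the check into a lookahead, in the style of ^(?!.*virtual), avoids that gap.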

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
