I think both the article and the replies are good, so I am reprinting them here: Using a regular expression to find words that do not contain the consecutive string ABC

Source: Internet
Author: User
After I wrote the article "Getting Started with Regular Expressions in 30 Minutes", some readers asked:

[^ABC] means "does not contain any of the characters A, B, or C". How do I write an expression that means "does not contain the string ABC"?
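To make the distinction in the question concrete, here is a minimal JavaScript sketch. The test strings are made up for illustration. It shows that a negated character class rejects any string containing even one of the listed characters, which is stricter than rejecting the substring ABC.

```javascript
// [^ABC] is a character class: it matches ONE character that is not A, B, or C.
// Anchored and repeated, it only accepts strings with none of those characters.
const noneOfABC = /^[^ABC]+$/;

console.log(noneOfABC.test("XYZ")); // true: no A, B, or C anywhere
console.log(noneOfABC.test("CAB")); // false: contains A, B, and C
console.log(noneOfABC.test("XAY")); // false: rejected although the string ABC never appears
```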

For me, the simplest (lazy) solution would be to use a programming language: find everything that contains ABC, and whatever is left does not contain it. However, I was writing a tutorial, and not every reader has a programming background. Some of them simply use a tool to extract information from TXT documents, so the question has to be answered with a regular expression alone.

So I opened RegexTester and started experimenting. First I tried ((?'test'ABC)|.)*(?(test)(?!)) (meaning: match either ABC or any character; if ABC is found, store it in the group named test; at the end, check whether the group test has any content and, if it does, fail the match; see the tutorial for the syntax). The result: "ABC", "AABC", "ABCD", and "AA" all passed, so checking for the test group at the end does not seem to be a workable approach.

Then I tried (.(?!ABC))* (find all characters that are not followed by ABC). The result: "ABC" and "ABCD" passed the test, and for "AABC" only the trailing "ABC" was matched. Obviously not good enough.

Then I strengthened the condition: ((?<!ABC).(?!ABC))* (find all characters that are neither preceded nor followed by ABC). The result: for every string containing ABC, only the "ABC" itself was matched, while strings that do not contain ABC were matched in full.
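The two lookaround attempts described above can be reproduced with a small JavaScript sketch. The article's tests were run in a .NET tool, so running them in JavaScript is an assumption of this example, and the helper name nonEmptyMatches is hypothetical. It lists the non-empty matches each attempt finds in a few sample strings.

```javascript
// Collect all non-empty matches of a global regex in a string.
function nonEmptyMatches(re, text) {
  return [...text.matchAll(re)].map((m) => m[0]).filter((s) => s.length > 0);
}

// Attempt 1: characters that are not followed by ABC.
const notFollowed = /(.(?!ABC))*/g;
// Attempt 2: characters that are neither preceded nor followed by ABC.
const notSurrounded = /((?<!ABC).(?!ABC))*/g;

for (const word of ["ABC", "ABCD", "AABC", "XYZ"]) {
  console.log(word, nonEmptyMatches(notFollowed, word), nonEmptyMatches(notSurrounded, word));
}
```

In this sketch both attempts return only "ABC" for the input "AABC", and the second attempt also cuts "ABCD" down to "ABC". In other words, they match a part of the string rather than rejecting the whole.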

This was getting a bit confusing: how do we filter out strings that contain ABC somewhere inside? In other words, how do we match the whole rather than a part? At this point the requirement needs clarifying: if the user wants to find words, add \b to both ends of the expression; to find lines, add ^ and $. Since the question did not say which, I will assume words.

So the expression becomes \b((?<!ABC).(?!ABC))*\b. In testing, it matches all words that do not contain ABC, and also the word ABC itself.

How do we exclude the word ABC itself? After some thought, I decided the most convenient way is to check whether the word starts with A: \b(A(?!BC)|[^A](?!ABC))((?<!ABC).(?!ABC))*\b (the word either starts with an A that is not followed by BC, or does not start with A; every character after the first must be neither preceded nor followed by ABC). In my tests it fully met the requirements. Bingo!

To search for words that do not contain the consecutive string ABC with a regular expression, the final result is: \b(A(?!BC)|[^A](?!ABC))((?<!ABC).(?!ABC))*\b
----------------
Update: following maple's comments, a more concise expression is: \b((?!ABC)\w)+\b
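As a quick check of the concise expression, here is a minimal JavaScript sketch. The sample sentence is made up, and the article's own tests were done in .NET. Each word character is consumed only if the string ABC does not start at that position, so any word containing ABC can never reach its closing \b.

```javascript
// Every word character is guarded by (?!ABC): it may not be the start of "ABC".
const wordWithoutABC = /\b((?!ABC)\w)+\b/g;

const text = "XYZ ABC AABC ABCD hello";
const words = text.match(wordWithoutABC);

console.log(words); // [ 'XYZ', 'hello' ]
```

Note that the comparison is case-sensitive; a word such as "abc" would still be returned unless the i flag is added.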

Posted by deerchao. Read (2759), Comments (18)

#1 floor 221.221.165.* maple [unregistered user]

Could this be used?

((?!ABC).)*?

#2 floor [original poster] deerchao

@maple
I tried it; ((?!ABC).)*? does not seem to match anything, does it?
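The behaviour reported here follows from the lazy quantifier: with nothing after it in the pattern, *? is satisfied by zero repetitions. A small JavaScript sketch (sample strings are made up) compares the lazy form with the greedy one:

```javascript
const lazy = /((?!ABC).)*?/;  // stops as early as possible
const greedy = /((?!ABC).)*/; // consumes until ABC would start

console.log(JSON.stringify("XYZ".match(lazy)[0]));      // ""
console.log(JSON.stringify("XYZ".match(greedy)[0]));    // "XYZ"
console.log(JSON.stringify("XYABCZ".match(greedy)[0])); // "XY"
```

The lazy version only starts consuming characters once something after it, such as a closing \b or $, forces it to.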

#3 floor 221.221.165.* maple [unregistered user]

Add \b to both ends.
The result is then similar to that of your expression.
However, to match a single word it still needs to be modified.

#4 floor 221.221.165.* maple [unregistered user]

Sorry, it should be
\b((?!ABC).)*\b

Heh, I typed one question mark too many.

#5 floor 221.221.165.* maple [unregistered user]

Try this expression instead.

\b((?!ABC)\s)+?\b

I am not sure whether I understood the question correctly. This should match words that do not contain ABC. Of course, the word ABC itself is excluded as well.

#6 floor [original poster] deerchao

Yes, your expression achieves the same effect, and it is more concise.

#7 floor 222.75.41.* Apple [unregistered user]

Hello, I have learned a lot here. I want to match strings that start with 3-5 letters or digits and do not contain http, but written this way nothing matches at all!!
/^[a-zA-Z0-9]{3,5}b((?!http)\s)+?$/

#8 floor [original poster] deerchao

Isn't there an extra b in the middle of your expression?
If the first 3-5 letters or digits must not contain http either, you can use:
^((?!http)[a-zA-Z0-9]){3,5}((?!http)\w)*$
If they may contain http, you can use:
^([a-zA-Z0-9]){3,5}((?!http)\w)*$

Note: I tested with the .NET engine. The results may differ under the JavaScript engine.
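The difference between the two expressions in this reply can be seen in a short JavaScript sketch. The names strict and relaxed, the sample strings, and the lowercase spelling of http are assumptions made for illustration. The first guards the leading 3-5 characters with the lookahead; the second guards only the characters after them.

```javascript
// First variant: http may not start anywhere, including the leading 3-5 characters.
const strict = /^((?!http)[a-zA-Z0-9]){3,5}((?!http)\w)*$/;
// Second variant: only the characters after the leading 3-5 are guarded.
const relaxed = /^([a-zA-Z0-9]){3,5}((?!http)\w)*$/;

for (const s of ["abcdefgh", "httpabc", "acdsihttp"]) {
  console.log(s, strict.test(s), relaxed.test(s));
}
```

In this sketch "httpabc" is rejected by the first variant but accepted by the second, because the leading group is allowed to consume "httpa" unguarded.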

#9 floor 222.75.41.* Apple [unregistered user]

Thank you very much for the quick reply. It behaves a bit differently from the .NET engine, heh, and does not work for me. But the friend upstairs gave me some inspiration: \b((?!http)\s)+?\b. But how do I limit the length??

#10 floor [original poster] deerchao

I tried it. It should work in JavaScript too:

test.htm
<script>
var valid = "abcdefgh";
var invalid = "acdsihttp";

// 3-5 leading letters or digits, and "http" must not start anywhere in the string
var reg = /^((?!http)[a-zA-Z0-9]){3,5}((?!http)\w)*$/;
alert(reg.test(valid));   // true
alert(reg.test(invalid)); // false
</script>

#11 floor Felix

I have a question that has bothered me for a long time. Although it is similar to this one, I have not been able to solve it.

Known: <[^>]+> matches all HTML tags, so replacing the matches with nothing strips the tags and leaves plain text.

However, I want to exclude the <img> and <a>xxx</a> tags from the match so that they are kept in the HTML document, but I do not know how to write this. I hope you can look into it for me. Thank you very much!!

#12 floor [original poster] deerchao

@Felix
<(?!(/?(img|a)\b))[^>]*>

In
<html><a href="#">xxx</a></img><div id="aaa">xyz</div></html>
it matches:
<html>
<div id="aaa">
</div>
</html>
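Here is a minimal JavaScript sketch of this idea: strip every tag except img and a. The sample HTML and the variable names are made up, and the expression is written with lowercase tag names and an escaped slash, as a JavaScript regex literal requires.

```javascript
// Match any tag whose name is NOT img or a (opening or closing form).
const otherTags = /<(?!\/?(img|a)\b)[^>]*>/g;

const html = '<html><a href="#">xxx</a><img src="x.png"><div id="aaa">xyz</div></html>';

console.log(html.match(otherTags));       // the tags that would be removed
console.log(html.replace(otherTags, "")); // <a href="#">xxx</a><img src="x.png">xyz
```

The \b after the tag name keeps tags such as abbr from being mistaken for a. Like any regex-based HTML processing, this is a sketch for simple input and can be confused by, for example, a > inside an attribute value.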

#13 floor Felix

The original poster is really impressive. It passed my tests. Thank you!!

#14 floor Felix

Could I ask you questions over MSN or Gtalk?
lf1981#msn.com
Gtalk is the same.

#15 floor 207.46.55.* Peach [unregistered user]

Useful! But I cannot understand it.
Welcome to Peach
http://www.meetao.cn/

#16 floor 118.144.40.* 2008-04-19 sonic_andy [unregistered user]

([^A]|A[^B]|AB[^C])

#17 floor 124.227.192.* 2008-05-29 zxylxw for beginners [unregistered user]

What if I want to find strings that contain ABC, but exclude @ABC?
For example, in an Access "pseudo stored procedure" in .NET,
I want to replace the first ABC in "ABC" = "@ABC" with "Chinese field name",
so that it becomes "Chinese field name" = "@ABC". Please advise!
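One possible approach to this question, not confirmed anywhere in the thread, is a negative lookbehind: match ABC only when it is not directly preceded by @. The JavaScript sketch below uses a hypothetical replacement text, ChineseFieldName. The same (?<!...) syntax is the one the article uses with the .NET engine.

```javascript
// Match ABC only when the character before it is not "@".
const bareABC = /(?<!@)ABC/g;

const input = '"ABC" = "@ABC"';
const output = input.replace(bareABC, "ChineseFieldName"); // hypothetical field name

console.log(output); // "ChineseFieldName" = "@ABC"
```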

#18 floor 123.120.174.* 2008-08-18 Booker [unregistered user]

Hello, blogger. I use Vim to process logs. I want to keep only the log records whose username is temp, but Vim does not seem to support ?!. For this I used the regular expression:
/^.*\(temp\)\{0}.*$

It does not give the matches I want. Does the blogger have a good solution? Thank you.
