Cfc4n asked about the usage of (?!). After answering, I summarized the ways to express "not" when matching in Perl and am posting them here.
Problem
Problem description
Given the following text, how do you use a regular expression to match the items that have no color option?
<item>
color: red;
</item>
<item>
size: 12;
number: 45;
type: good;
</item>
Typical incorrect answer
Newcomers easily come up with an incorrect answer like this one: <item>.*?(?!color).*?</item>. The starting idea is right: a match should succeed only when color does not appear in the target string. In fact, this regular expression still matches every <item>...</item>, including the ones that contain color. Why?
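A minimal sketch of the failure, assuming the <item> layout from the problem description (the sample text and variable names are made up for illustration). The lookahead (?!color) is tested at one position only, and the lazy .*? on either side is free to place that position wherever the test passes:

```perl
use strict;
use warnings;

# Hypothetical sample: two items, only the first one has a color entry.
my $text = "<item>\ncolor: red;\n</item>\n<item>\nsize: 12;\n</item>\n";

# The naive pattern: (?!color) is checked at a single position only.
my @naive = $text =~ m!<item>.*?(?!color).*?</item>!sg;

# Both items come back, including the one that contains color.
my $naive_count = scalar @naive;
print "matched $naive_count items\n";
```

Here the check happens right after <item>, where the next characters are a newline followed by color, so the assertion passes and the rest of the item is swallowed by the second .*?.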
Perl exclusion matching
Simplest excluded matching
Matching is =~ and not matching is !~. Writing this, it occurs to me that in regular expression syntax = stands for affirmation and ! stands for the opposite. For example: (?=) and (?!), (?<=) and (?<!), =~ and !~.
Back to the topic, with an example. To check whether a string contains good, use if ($string =~ /good/). If $string contains good, the condition is true; otherwise it is false.
To check whether a string does not contain good, use if ($string !~ /good/). If $string does not contain good, the condition is true; otherwise it is false.
This kind of test suits searching a large string for a simple pattern and then branching on the result. It is quick and does the job, but it becomes cumbersome in more complex situations.
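The two operators can be tried side by side. This is only a sketch; contains_good and lacks_good are hypothetical helper names introduced for the illustration:

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub contains_good { my ($string) = @_; return $string =~ /good/ ? 1 : 0; }
sub lacks_good    { my ($string) = @_; return $string !~ /good/ ? 1 : 0; }

for my $string ("type: good;", "type: bad;") {
    print "$string contains good\n"         if contains_good($string);
    print "$string does not contain good\n" if lacks_good($string);
}
```

For any given string exactly one of the two tests is true, since !~ is simply the negation of =~.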
For the question raised at the beginning of the article, you can certainly solve it this way: find every <item>...</item>, and then check whether each one contains a color entry:
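#!/usr/bin/perl -w

# Sample input: two <item> blocks, only the first has a color entry.
my $text = <<END;
<item>
color: red;
</item>
<item>
size: 12;
number: 45;
type: good;
</item>
END

# Collect every <item>...</item> block (the s modifier lets the dot match newlines).
my @result = $text =~ m!<item>.*?</item>!sg;

foreach my $item (@result) {
    # Keep only the blocks that do not mention color.
    if ($item !~ /color/) {
        print "$item\n";
    }
}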
The output result is:
<item>
size: 12;
number: 45;
type: good;
</item>
This works, but it takes a "better to catch too many than to let one slip" approach: it first grabs every possible item and then weeds them out one by one. Can we say up front that what we are looking for is an item without color? Exclusion matching was born for exactly this purpose.
Excluded matching
Forgive me for coining the term "exclusion matching". Elsewhere you may see it called "negative assertion" or "negative lookaround". Those two names describe the matching process; the name used here describes the result. Specifically, it means using (?!...) and (?<!...) as auxiliary conditions to simplify a regular expression and quickly find the matches that meet the requirement.
The two are used in much the same way: both assert that a certain pattern does not occur at the current position. The difference is that (?!...) looks to the right of the current position, while (?<!...) naturally looks to the left.
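To see the two directions next to each other, here is a small sketch. The word list is made up for the illustration:

```perl
use strict;
use warnings;

# Hypothetical sample words for this illustration.
my @words = qw(colorful uncolored redcolor color);

# (?!...) looks right: "color" not immediately followed by "ful".
my @no_ful_after = grep { /color(?!ful)/ } @words;

# (?<!...) looks left: "color" not immediately preceded by "un".
my @no_un_before = grep { /(?<!un)color/ } @words;

print "right-hand check: @no_ful_after\n";
print "left-hand check:  @no_un_before\n";
```

The right-hand check drops only colorful, and the left-hand check drops only uncolored.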
Here I strongly recommend the tutorials translated by Anrs: Lookaround 1 and Lookaround 2. Read these two articles carefully to thoroughly understand the lookaround concepts; doing so will raise your regular expression skills a level. The rest of this article assumes you understand lookaround.
A small digression. Since "left" and "right" are so easy to understand, why do we never see terms like "look left" and "look right", and instead get the harder "lookahead" and "lookbehind"? "Tearing" asked this question as well. My guess is that it accommodates users of right-to-left scripts such as Arabic. Either way, the direction from ^ to $ is called forward.
Lookaround describes the pattern to the left or right of the current position to help decide whether the regular expression matches. It only describes and does not consume characters; it only assists the decision and never contributes text to the match by itself. In this respect it behaves exactly like ^ and $.
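The "does not consume characters" point can be checked directly. A minimal sketch, with a made-up sample string:

```perl
use strict;
use warnings;

my $string = "size: 12; number: 45;";

# The lookahead requires "; number" to follow, but leaves it out of the match.
my ($matched) = $string =~ /(\d+(?=; number))/;
print "matched text: $matched\n";

# ^ works the same way: it asserts a position and matches no characters.
my $anchored = $string =~ /^size/ ? 1 : 0;
print "starts with size: $anchored\n";
```

Only the digits end up in the match; the text inspected by the lookahead stays available for whatever comes next in the pattern.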
Example 1
Example: there are many copycat sites similar to fanfou.com. How do you write a regular expression that matches a domain name containing fanfou whose TLD is not .com?
Answer: /\bfanfou\.(?!com)[a-z]{2,4}\b/i. Let's analyze this regular expression:
- It starts with \b, which defines a word boundary;
- fanfou, the main domain name, is of course indispensable;
- \. matches a literal dot. Do not use the dot metacharacter here;
- (?!com) states that here (to the right of fanfou.) the three consecutive characters com must not appear;
- [a-z]{2,4} stands for two to four Latin letters, because the TLDs considered here have at least two letters (for example .au, .us) and at most four (for example .info, .asia);
- The boundary \b on the right is just as important; otherwise the {2,4} above would be in vain;
- i makes the match case-insensitive, which is one of the characteristics of domain names.
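The pattern can be tried against a few names. This is a sketch only; the domain names below are invented for the test and are not taken from the original question:

```perl
use strict;
use warnings;

# The pattern from the answer above.
my $re = qr/\bfanfou\.(?!com)[a-z]{2,4}\b/i;

# Hypothetical domain names, made up for this illustration.
my @domains = qw(fanfou.com fanfou.net FanFou.cn fanfou.company www.fanfou.info);

my @hits = grep { /$re/ } @domains;
print "$_\n" for @hits;
```

Note one side effect: (?!com) rejects anything whose suffix merely begins with com, so fanfou.company is excluded along with fanfou.com.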
Back to this question
Let's build the regular expression step by step, following this line of thought.
- The regular expression must match the <item>...</item> structure. Therefore it starts with <item> and ends with </item>.
- What goes between <item> and </item> is the hard part, because color may appear at any point inside the structure. We must therefore require that the word color does not begin at any point in the structure. One such point is written (?!color). and repeating it one or more times gives ((?!color).)+. Note a small trap: do not write (?!color).+, which only states that color does not appear at the leftmost position. Writing ((?!color).)+ states it for every position.
- The regular expression so far is <item>((?!color).)+?</item>. To save resources, the parentheses are usually written in non-capturing form, (?:...). To make the dot match line breaks, you can specify the s modifier or use [\s\S] in place of the dot metacharacter; the dot is kept here. The revised regular expression is <item>(?:(?!color).)+?</item>.
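Putting the steps together, here is a sketch that runs the final pattern and the trap pattern against text shaped like the problem statement (the variable names are my own):

```perl
use strict;
use warnings;

# Sample text in the shape of the problem statement.
my $text = <<'END';
<item>
color: red;
</item>
<item>
size: 12;
number: 45;
type: good;
</item>
END

# No character between the tags may be the start of the word "color".
my @items = $text =~ m!<item>(?:(?!color).)+?</item>!sg;

# Contrast: the trap version checks only the first position after <item>.
my @trap = $text =~ m!<item>(?!color).+?</item>!sg;

print scalar(@items), " item(s) without color\n";
print scalar(@trap), " item(s) matched by the trap pattern\n";
```

The s modifier is used so that the dot also matches the line breaks inside each item. The final pattern returns only the second item, while the trap pattern still returns both.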
In general, lookaround is more abstract than the basic metacharacters. Once you understand and master it, though, you will find it very useful for precise matching and replacement. I hope the analysis above helps. If you have a similar question, you are welcome to raise it.
Exclude, lookaround, negate, perl
From: http://iregex.org/blog/negate-match.html