Introduction: Regular expressions are one of the hard parts of learning JS. Their complex matching patterns are enough to make your head spin, but that same complexity is what makes them so powerful. This article uses JS lookahead assertions to implement negative matching. It is aimed at readers who already know the basics of JS regular expressions; if regular expressions are still unfamiliar, learn the basics first and then come back to this art of negation.
First, the tag filtering requirement
Have you ever run into this while writing JS: you are processing a string and need a regular expression that matches text that is not xxx? It sounds a little odd. Match what is not xxx? If I don't want xxx, why not simply match what I do want? As it turns out, this really is useful; whether or not you have met it, I certainly have. A concrete requirement: you receive a string of HTML and need to filter it so that every tag that is not <p> is changed to <p>. Plenty of readers will scoff here: "Changing every non-<p> tag to <p> is the same as changing every tag to <p>", and dash off a single line of code:
    var str = '<div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>';
Note the reference "$1" in this method. It refers to the first capture group of the regular expression; in general, "$n" refers to the nth captured group. In the example above, "(\/?)" means that the "/" character appears 0 or 1 times. Think of this group as a girl devoted to "/" who has sworn to marry no one else. When a "/" is present, the group captures it, and from then on what is yours is mine, so "$1" equals the character matched by "(\/?)". When there is no "/", the group keeps an empty chamber through the long night, so "$1" equals "" (the empty string).
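The one-liner can be sketched as follows. The pattern "/<(\/?)[^>]*>/g" and the function name "replaceEveryTagWithP" are assumptions made for illustration; the text above only establishes the "(\/?)" group and the "$1" reference.

```javascript
// Hypothetical helper: the pattern is an assumed reconstruction,
// not necessarily the author's exact code.
function replaceEveryTagWithP(html) {
  // (\/?) captures the optional "/" of a closing tag; [^>]* eats the tag name.
  // "$1" in the replacement re-inserts the captured "/" ("" for opening tags).
  return html.replace(/<(\/?)[^>]*>/g, "<$1p>");
}

var allTags = '<div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>';
console.log(replaceEveryTagWithP(allTags));
// <p>,<p>,<p>,<p>,</p>,</p>,</p>,</p>
```

Opening tags come out as "<p>" because the group matched nothing, and closing tags come out as "</p>" because the group captured the "/".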
We cover references and captures here because they will be used later. Then again, doesn't that regular expression already meet the requirement perfectly? Why study negative matching at all? Patience, dear reader, and let me explain. We all know that requirements are bound to change. Now the requirement becomes: you receive a string of HTML and need to filter it so that all tags other than <p> or <div> are changed to <p>. WTF? What kind of requirement is that? That was my reaction too. Let's analyze it: keep the <p> and <div> tags of the original HTML, and change every other tag to <p>. Hmm... this is trouble, because the code from a moment ago no longer works. This time we can only work by exclusion: exclude <p> and <div>, and replace the other tags. So here is the problem: how do we exclude?
Second, lookahead in regular expressions
Regular expressions have a feature called lookahead, which some call a zero-width assertion:
| Expression | Name | Description |
| --- | --- | --- |
| (?=exp) | Positive lookahead | Matches a position that is followed by text satisfying exp |
| (?!exp) | Negative lookahead | Matches a position that is not followed by text satisfying exp |
| (?<=exp) | Positive lookbehind | Matches a position that is preceded by text satisfying exp (in JS, ES2018 and later only) |
| (?<!exp) | Negative lookbehind | Matches a position that is not preceded by text satisfying exp (in JS, ES2018 and later only) |
Lookbehind only arrived in JavaScript with ES2018 and is missing from older engines, so it is not covered here. Let's look at what lookahead does:
    var str = 'Hello, Hi, I am Hilary.';
    var reg = /H(?=i)/g;
    var newStr = str.replace(reg, "T");
    console.log(newStr); // Hello, Ti, I am Tilary.
This demo shows what lookahead does: the pattern is still the character "H", but it only matches an "H" that is followed by "i". Picture a company called reg with several "H" candidates applying. The company's hard requirement is mastery of the skill "i", so the "H" in "Hello" is naturally eliminated.
What about negative lookahead? The idea is the same:
    var str = 'Hello, Hi, I am Hilary.';
    var reg = /H(?!i)/g;
    var newStr = str.replace(reg, "T");
    console.log(newStr); // Tello, Hi, I am Hilary.
In this demo we replaced the positive lookahead with a negative lookahead. The regular expression now means: match an "H" that is not followed by "i". This time the "H" in "Hello" gets the job, because company reg changed its hiring conditions: it now says the skill "i" would damage the corporate culture, so candidates who have it are turned away.
Third, lookahead is non-capturing
With that, let's go back to the initial requirement and use negative lookahead to implement the first one: replace all non-<p> tags with <p>. Readers who have just learned negative lookahead, awed by the depth of JS and secretly pleased with themselves, dash off:
    var str = '<div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>';
    var reg = /<(\/?)(?!p)>/g;
    var newStr = str.replace(reg, "<$1p>");
    console.log(newStr); // <div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>
What? Why doesn't it work? What happened to the great art of negation? One feature of lookahead is that it is a non-capturing group. What does non-capturing mean? Remember "$1", the girl who would marry no one but "/"? She is so devoted because she has already captured the heart of "/". A lookahead, by contrast, captures nothing, which means it cannot be referenced with "\n" or "$n":
    var str = 'Hello, Hi, I am Hilary.';
    var reg = /H(?!i)/g;
    var newStr = str.replace(reg, "T$1");
    console.log(newStr); // T$1ello, Hi, I am Hilary.
Note the output. As we saw earlier, when a capture group matches nothing, its reference yields the empty string "". Here, however, the literal text "$1" appears in the result. A lookahead captures nothing at all, so group 1 does not exist, and without a capture there is nothing to reference.
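The difference between "a group that matched nothing" and "no group at all" can be seen side by side in a minimal sketch (the sample patterns and variable names here are illustrative, not from the original demo):

```javascript
// A capturing group that matched nothing: group 1 exists,
// so "$1" expands to the empty string.
var withGroup = 'Hello'.replace(/H(x?)/, '[$1]');
console.log(withGroup); // []ello

// A lookahead only: the pattern has no group 1,
// so "$1" is left in the result as literal text.
var withLookahead = 'Hello'.replace(/H(?!x)/, '[$1]');
console.log(withLookahead); // [$1]ello
```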
Being non-capturing is one basic feature of lookahead. The other is that it does not eat characters: a lookahead only checks whether the text that follows satisfies its expression, and that text does not become part of the match. In other words, the lookahead does not move the match position. That sounds obscure even to me, so let's look at code ︽⊙_⊙︽:
    var str = 'Hello, Hi, I am Handsome Hilary.';
    var reg = /H(?!i)e/g;
    var newStr = str.replace(reg, "T");
    console.log(newStr); // Tllo, Hi, I am Handsome Hilary.
Look at the output. The lookahead only selects the "H" characters that satisfy its condition, namely the "H" in "Hello" and the "H" in "Handsome". Because the lookahead eats no characters, the position does not move, and matching continues immediately after the "H". The next item in the pattern is "e", so "He" in "Hello" matches, while "Ha" in "Handsome" fails.
The "H" positions that pass the lookahead, shown in brackets: [H]ello, Hi, I am [H]andsome Hilary.
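The "does not eat characters" behaviour can also be checked directly with exec and test. This is a small sketch with illustrative inputs, not part of the original demo:

```javascript
// The lookahead checks the "e" but leaves it out of the match.
var peek = /H(?=e)/.exec('Hello');
console.log(peek[0]);        // H
console.log(peek[0].length); // 1

// Matching resumes right after "H", so the next item in the
// pattern has to match that same "e".
var resumed = /H(?=e)el/.exec('Hello')[0];
console.log(resumed); // Hel
var skipped = /H(?=e)l/.test('Hello');
console.log(skipped); // false

// The same property explains why /<(\/?)(?!p)>/g never touches a
// real tag: ">" would have to come directly after "<" or "</".
var emptyTags = '<>,</>,<div>'.replace(/<(\/?)(?!p)>/g, '<$1p>');
console.log(emptyTags); // <p>,</p>,<div>
```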
Fourth, using lookahead to implement tag filtering
Lookahead is non-capturing and eats no characters. Knowing this, can we finally meet our requirement? Because lookahead eats no characters, we have to eat the rest of the tag ourselves:
    var str = '<div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>';
    var reg = /<(\/?)(?!p|\/p).*?>/g;
    var newStr = str.replace(reg, "<$1p>");
    console.log(newStr); // <p>,<p>,<p>,<p>,</p>,</p>,</p>,</p>
After all that talk, the first requirement is finally solved. Note the ".*?": it matches any characters, but don't forget that, thanks to the negative lookahead in front of it, we only match a "<" that is not followed by "p" or "/p".
The "<" characters that pass the lookahead, shown in brackets: [<]div>,<p>,[<]h1>,[<]span>,[<]/span>,[<]/h1>,</p>,[<]/div>
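The "?" after ".*" matters as well. A short sketch on a shortened sample string (chosen here for illustration) shows what happens when the quantifier is greedy instead of lazy:

```javascript
var twoTags = '<div>,<h1>';

// Lazy ".*?" stops at the first ">", so each tag is matched on its own.
var lazyResult = twoTags.replace(/<(\/?)(?!p|\/p).*?>/g, '<$1p>');
console.log(lazyResult); // <p>,<p>

// Greedy ".*" runs on to the last ">", so both tags and the comma
// between them are swallowed by a single match.
var greedyResult = twoTags.replace(/<(\/?)(?!p|\/p).*>/g, '<$1p>');
console.log(greedyResult); // <p>
```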
Note the pipe symbol "|" used here to add the alternative "\/p". Although the preceding "(\/?)" already matches the closing slash, this alternative cannot be omitted, because the "?" quantifier allows the slash to appear 0 times. Imagine using "/<(\/?)(?!p).*?>/g" to match the tag "</p>". The group first matches the "/" and records it, and the lookahead is then evaluated after the "/". The next character is "p", so the lookahead fails and this attempt is thrown away. The engine then backtracks: the group matches 0 characters, and the lookahead is evaluated right after the "<". What follows is "/p" rather than "p", so the match succeeds and the tag is replaced. Because the group matched 0 characters, the "$1" reference is an empty string, and "</p>" becomes "<p>".
    var str = '<div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>';
    var reg = /<(\/?)(?!p).*?>/g;
    var newStr = str.replace(reg, "<$1p>");
    console.log(newStr); // <p>,<p>,<p>,<p>,</p>,</p>,<p>,</p>
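To see the backtracking described above, it can help to list what group 1 captured for each match. The sketch below uses String.prototype.matchAll (ES2020) on a shortened sample string; the variable names are illustrative:

```javascript
var sampleTags = '<span>,</span>,</p>';
var flawed = /<(\/?)(?!p).*?>/g;

// Collect [whole match, group 1] for every match of the flawed pattern.
var trace = Array.from(sampleTags.matchAll(flawed), function (m) {
  return [m[0], m[1]];
});
console.log(JSON.stringify(trace));
// [["<span>",""],["</span>","/"],["</p>",""]]
```

For "</span>" the group keeps its "/", but for "</p>" the group has given the "/" back, which is why the replacement produces "<p>" instead of leaving "</p>" alone.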
With the first requirement done, the second follows naturally. Even if there are five or six tags to keep, there is nothing to fear:
    var str = '<div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>';
    var reg = /<(\/?)(?!p|\/p|div|\/div).*?>/g;
    var newStr = str.replace(reg, "<$1p>");
    console.log(newStr); // <div>,<p>,<p>,<p>,</p>,</p>,</p>,</div>
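When the list of tags to keep grows, the alternatives can be generated instead of typed by hand. The helper below is a hypothetical sketch (the name "buildTagFilter" is not from the article), and it assumes the tag names are plain words with no regex metacharacters:

```javascript
// Hypothetical helper, not part of the original article.
// Assumes tag names contain no regex metacharacters.
function buildTagFilter(keepTags) {
  // Each kept tag needs two alternatives: "tag" and "\/tag".
  var alternatives = [];
  keepTags.forEach(function (tag) {
    alternatives.push(tag, '\\/' + tag);
  });
  return new RegExp('<(\\/?)(?!' + alternatives.join('|') + ').*?>', 'g');
}

var keepPAndDiv = buildTagFilter(['p', 'div']);
var html = '<div>,<p>,<h1>,<span>,</span>,</h1>,</p>,</div>';
var filtered = html.replace(keepPAndDiv, '<$1p>');
console.log(filtered);
// <div>,<p>,<p>,<p>,</p>,</p>,</p>,</div>
```

One caveat: the lookahead only tests a prefix, so keeping "p" also spares tags such as <pre>. Adding a word boundary "\b" after each alternative is one way to tighten this.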
Fifth, summary
Lookahead is only one part of JS regular expressions, and there is not that much mystery to it.
When using lookahead, we need to be aware that:
- Lookahead is non-capturing: what it tests cannot be referenced.
- Lookahead does not consume characters: it only checks whether the following text satisfies the lookahead expression, and that text does not become part of the match.
So, is our requirement finished? Is it really over? Have you had enough? Some readers may feel this is plenty and need time to digest it, but others are surely not satisfied yet. No matter: here is a final question for everyone to study together. As of writing this post, I have not come up with a solution to it.
The requirement is as follows: you receive a string of HTML and need to filter it so that all tags other than <p> or <div> are changed to <p>, while keeping the attributes (class, id and so on) of every tag, using only one regular expression. For example:
    // input
    var input = '<div class="beautiful">,<p class="provocative">,…';
    // output
    var output = '<div class="beautiful">,<p class="provocative">,<p class="attractive" id="header">,<p class="sexy">,</p>,</p>,</p>,</div>';
If you have a good solution, you are welcome to share it in the comments section so we can learn together.
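One possible approach, offered as a sketch rather than a definitive answer, is to eat the tag name explicitly with "\w+" and capture whatever follows it as a second group. The input string below is an assumption inferred from the expected output, and the sketch also assumes that attribute values never contain ">":

```javascript
// One possible pattern (an assumption, not the author's answer):
//   (\/?)                    optional "/" of a closing tag, captured as $1
//   (?!p|\/p|div|\/div)      skip <p>, </p>, <div>, </div>
//   \w+                      eat the tag name
//   ([^>]*)                  capture the attributes as $2
var keepAttributes = /<(\/?)(?!p|\/p|div|\/div)\w+([^>]*)>/g;

// Assumed input, inferred from the expected output in the challenge.
var challengeInput = '<div class="beautiful">,<p class="provocative">,' +
  '<h1 class="attractive" id="header">,<span class="sexy">,' +
  '</span>,</h1>,</p>,</div>';

var challengeOutput = challengeInput.replace(keepAttributes, '<$1p$2>');
console.log(challengeOutput);
// <div class="beautiful">,<p class="provocative">,<p class="attractive" id="header">,<p class="sexy">,</p>,</p>,</p>,</div>
```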