Setting a few open questions aside, getting to the advanced level is not easy. I spent a long time stuck on zero-width assertions alone,
and then more time writing examples of my own. Fortunately, I learned a lot about regular expressions along the way,
and my understanding of them has improved.
If you have any suggestions, please leave a comment.
Grouping
Grouping is simple, so let me go straight to an example: (\d{1,3}\.){3}\d{1,3} matches an IP address. Note first that this pattern
is not strict (it also accepts out-of-range values such as 999.999.999.999), but it is good enough for explanation. Reading it in order:
\d{1,3} matches one to three digits, and \. matches a literal dot. The parentheses turn those two parts into a group, and {3} repeats
the group three times. Finally, \d{1,3} matches one to three more digits. That is all there is to grouping.
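To see the grouping example run, here is a minimal Python sketch. The helper name find_ip_like and the sample text are made up for illustration. It also shows the caveat that the pattern is not strict about the range of each number.

```python
import re

# The pattern from the text: the group "1-3 digits plus a dot" repeated
# three times, followed by 1-3 more digits.
IP_PATTERN = re.compile(r"(\d{1,3}\.){3}\d{1,3}")

def find_ip_like(text):
    """Return every substring that looks like a dotted quad."""
    # group(0) is the whole match; group(1) would only hold the last
    # repetition of the parenthesized group.
    return [m.group(0) for m in IP_PATTERN.finditer(text)]

print(find_ip_like("gateway 192.168.0.1, bogus 999.999.999.999"))
# ['192.168.0.1', '999.999.999.999']
```

The second hit is not a valid address, which is why the pattern should be treated as a teaching example rather than a validator.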
Backreferences
As I understand it, when you wrap a subexpression in parentheses, such as (\d{1,3}), it automatically gets a group number.
Groups are numbered 1, 2, 3, ... from left to right, in the order of their opening parentheses. A backreference
repeats the text matched by an earlier group (the expression in parentheses). For example, \1 stands for the text matched by group 1.
Hard to follow? It was for me at the beginning too. Don't worry, let's look at an example:
\b(\w+)\b\s+\1\b matches repeated words, such as kitty kitty. First comes a word, then
one or more whitespace characters, then the word that was just matched (\1).
You can also name the group yourself. The syntax is (?<Word>\w+) (the <> can be replaced with single quotes: (?'Word'\w+)),
which gives the \w+ group the name Word. To refer back to this group, use \k<Word>. The previous example
can therefore also be written as \b(?<Word>\w+)\b\s+\k<Word>\b.
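The repeated-word example can be tried in Python as sketched below. One assumption to flag: Python's re module spells named groups (?P<name>...) and named backreferences (?P=name), not the (?<name>...) and \k<name> forms described above, so the named variant is adapted accordingly. The function names are hypothetical.

```python
import re

# Numbered backreference: \1 repeats whatever group 1 matched.
REPEATED = re.compile(r"\b(\w+)\b\s+\1\b")

# Same idea with a named group, using Python's own syntax.
REPEATED_NAMED = re.compile(r"\b(?P<word>\w+)\b\s+(?P=word)\b")

def repeated_words(text):
    """Words that appear twice in a row, found via \\1."""
    return [m.group(1) for m in REPEATED.finditer(text)]

def repeated_words_named(text):
    """Same result, but the group is looked up by name."""
    return [m.group("word") for m in REPEATED_NAMED.finditer(text)]

print(repeated_words("hello kitty kitty, go go home"))
# ['kitty', 'go']
print(repeated_words_named("hello kitty kitty, go go home"))
# ['kitty', 'go']
```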
(exp)          matches exp and captures the text into an automatically numbered group
(?<name>exp)   matches exp and captures the text into a group named name; can also be written (?'name'exp)
(?:exp)        matches exp without capturing the matched text or assigning a group number
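The three group forms can be compared side by side with a small Python sketch. The patterns, the sample string, and the helper captured are invented for illustration, and the named form again uses Python's (?P<name>...) spelling.

```python
import re

NUMBERED = re.compile(r"(\d+)\.(\d+)")                  # two numbered groups
NAMED = re.compile(r"(?P<major>\d+)\.(?P<minor>\d+)")   # two named groups
NON_CAPTURING = re.compile(r"(?:\d+)\.(\d+)")           # first group captures nothing

def captured(pattern, text):
    """Return the tuple of captured groups, or None if there is no match."""
    m = pattern.search(text)
    return m.groups() if m else None

print(captured(NUMBERED, "v12.7"))         # ('12', '7')
print(NAMED.search("v12.7").groupdict())   # {'major': '12', 'minor': '7'}
print(captured(NON_CAPTURING, "v12.7"))    # ('7',)
```

With (?:...), the remaining capturing group becomes group 1, because the non-capturing group does not take a number.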
Zero-width assertions
(?=exp)    matches a position that is followed by exp
(?<=exp)   matches a position that is preceded by exp
(?!exp)    matches a position that is not followed by exp
(?<!exp)   matches a position that is not preceded by exp
I wanted to give a simple explanation here, but I am not sure I could make it clear, so the examples below will do most of the work.
The main obstacle is the names, which are stiff and intimidating. In short, like \b, ^, and $, these constructs specify a position.
The difference is that the position must also satisfy the condition exp (that is the assertion). For example:
\b\w+(?=ing\b) matches the front part of a word ending in ing (everything except the ing). For the string I'm singing
while you're dancing. it matches sing and danc.
(?<=\bre)\w+\b matches the rest of a word starting with re (everything except the re). For the string reading a book
it matches ading. ((?<=\d)\d{3})+\b matches runs of three digits that are preceded by a digit; for 1234567890 the result is 234567890.
(Written with * instead of +, the pattern can also match the empty string, so + is the safer choice.)
(?<=\s)\d+(?=\s) matches a number that has whitespace on both sides.
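The positive lookahead and lookbehind examples can be checked with the Python sketch below. The helper find_all and the last sample string are assumptions made for illustration. The thousands pattern is written with + so that it cannot match an empty string.

```python
import re

def find_all(pattern, text):
    """Return the full text of every match (group 0), not the sub-groups."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# Lookahead: the "ing" must follow but is not part of the match.
print(find_all(r"\b\w+(?=ing\b)", "I'm singing while you're dancing."))
# ['sing', 'danc']

# Lookbehind: the "re" must come before but is not part of the match.
print(find_all(r"(?<=\bre)\w+\b", "reading a book"))
# ['ading']

# Groups of three digits, each preceded by a digit, up to a word boundary.
print(find_all(r"((?<=\d)\d{3})+\b", "1234567890"))
# ['234567890']

# Digits with whitespace on both sides; the final 7 has nothing after it.
print(find_all(r"(?<=\s)\d+(?=\s)", "a 12 345 b6 7"))
# ['12', '345']
```

Because the assertions consume no characters, the single space between 12 and 345 can satisfy the lookahead of one match and the lookbehind of the next.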
\d{3}(?!\d) matches three digits that are not followed by another digit. \b((?!abc)\w)+\b matches
a word that does not contain the consecutive string abc.
(?<![a-z])\d{7} matches seven digits that are not preceded by a lowercase letter.
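The negative assertions work the same way in Python. In this sketch the helper find_all and the sample strings are made up for illustration.

```python
import re

def find_all(pattern, text):
    """Return the full text of every match (group 0)."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# Three digits NOT followed by another digit: "123" is rejected
# because a 4 follows it, so the first match starts one character later.
print(find_all(r"\d{3}(?!\d)", "1234 567"))
# ['234', '567']

# Words in which no position starts the string "abc".
print(find_all(r"\b((?!abc)\w)+\b", "xyz xabcx hello"))
# ['xyz', 'hello']

# Seven digits NOT preceded by a lowercase letter.
print(find_all(r"(?<![a-z])\d{7}", "x1234567 #7654321"))
# ['7654321']
```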
(?<=<(\w+)>).*(?=<\/\1>) matches the content of a simple HTML tag that has no attributes. (?<=<(\w+)>)
specifies the prefix: a word enclosed in <> (such as <b>). Then comes .* (any string), and finally the suffix (?=<\/\1>).
The \/ is just an escaped slash, and \1 is a backreference to the first captured group, that is, whatever (\w+) matched. So if the prefix is <b>,
the suffix is </b>, and the whole expression matches the content between <b> and </b>.
Assertions should be easy to understand by now. But once I thought I understood them, I tried to write an expression that matches the content of an HTML tag.
I modeled it on the expression above, but it would not run, and I assumed I had written it wrong.
After a long search turned up nothing, I asked some experts and learned the cause: in PHP regular expressions, an assertion cannot
contain open-ended constructs such as ., *, and +, or the pattern fails to compile. (As far as I can tell, the restriction concerns the lookbehind
forms, (?<=exp) and (?<!exp), whose exp has to match a fixed length.) If you want to use an assertion, you have to spell out exactly which tag
to match. After testing, though, my recommendation is this: if you just want the content inside a tag, skip assertions and
write a plain capturing pattern, such as: <font[^>]*>(.*?)<\/font>.
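Python's re module has a similar fixed-width restriction on lookbehind, so it can illustrate both the failure and the plain capturing alternative. This is a sketch of Python's behavior, not of PHP's. The helper names and the sample HTML are assumptions.

```python
import re

def compiles(pattern):
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# The lookbehind contains \w+, whose length is not fixed, so Python rejects it.
print(compiles(r"(?<=<(\w+)>).*(?=</\1>)"))   # False

# Plain capturing version: group 1 is the tag name, group 2 the content.
TAG_CONTENT = re.compile(r"<(\w+)>(.*?)</\1>")

# The pattern recommended in the text, which also allows attributes.
FONT_CONTENT = re.compile(r"<font[^>]*>(.*?)</font>", re.IGNORECASE)

def tag_contents(html):
    return [m.group(2) for m in TAG_CONTENT.finditer(html)]

def font_contents(html):
    # With exactly one capturing group, findall returns that group's text.
    return FONT_CONTENT.findall(html)

print(tag_contents("<b>bold</b> and <i>it</i>"))
# ['bold', 'it']
print(font_contents('<font color="red">hi</font><FONT>yo</FONT>'))
# ['hi', 'yo']
```

Capturing the content in a group avoids the lookbehind altogether, at the cost of having to read group 2 (or group 1) instead of the whole match.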
Greedy and lazy matching
This part is relatively simple. Just remember one rule: the match that starts earliest has the highest priority,
higher than the greedy/lazy rules.
*?       repeat any number of times, but as few as possible
+?       repeat one or more times, but as few as possible
??       repeat zero or one time, but as few as possible
{n,m}?   repeat n to m times, but as few as possible
{n,}?    repeat n or more times, but as few as possible
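A short Python sketch of the difference, including the rule that the earliest starting match wins. The helper first_match and the sample strings are invented for illustration.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Greedy: .* takes as much as it can.
print(first_match(r"a.*b", "aabab"))       # aabab

# Lazy: .*? takes as little as it can, but the match still starts at the
# first "a", so the result is "aab" and not the shorter "ab" later on.
print(first_match(r"a.*?b", "aabab"))      # aab

# Bounded repetition: greedy goes to the upper bound, lazy stops at the lower.
print(first_match(r"\d{2,4}", "12345"))    # 1234
print(first_match(r"\d{2,4}?", "12345"))   # 12
print(first_match(r"\d{2,}?", "12345"))    # 12
```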
I have written a lot today and I am tired. I hope it is helpful to everyone. The rest will have to wait for next time.