Objectives
Within 30 minutes, you should understand what a regular expression is and know enough of the basics to use one in your own program or web page.
How to use this tutorial
The most important thing: please give me 30 minutes. If you have no experience with regular expressions, do not try to get started in a matter of seconds, unless you are Superman.
Don't be intimidated by the complex expressions below. If you follow me step by step, you will find that regular expressions are not as difficult as you think. Of course, if you finish this tutorial feeling that you understood a lot but can remember almost nothing, that is normal too. I doubt that anyone who has never touched regular expressions could remember more than 80% of the syntax mentioned here after one reading. The aim here is only to help you understand the basic principles; to really master regular expressions you will need plenty of practice and use.
Besides serving as an introduction, this article also tries to be a reference manual for regular expression syntax that you can use in daily work. In the author's own experience, it does this job quite well: you see, I have not managed to memorize everything myself either.
Text format conventions: terminology; metacharacters/syntax; regular expressions; parts of a regular expression (used for analysis); source strings to be matched; descriptions of a regular expression or a part of it.
Side notes: this article includes some side notes, mainly to provide related information or to explain basic concepts to readers without a programming background. They can usually be skipped.
What is a regular expression?
A character is the most basic unit in which computer software processes text. It may be a letter, a digit, a punctuation mark, a space, a line break, a Chinese character, and so on. A string is a sequence of 0 or more characters. Text is simply written words, that is, a string. Saying that a string matches a regular expression usually means that a part of the string (or several parts) satisfies the conditions given by the expression.
When writing a program or webpage that processes strings, it is often necessary to find strings that meet certain complex rules. Regular Expressions are tools used to describe these rules. In other words, a regular expression is the code that records text rules.
You may have used wildcards for file search in Windows/DOS, that is, * and ?. If you want to find all the Word documents in a directory, you search for *.doc. Here, * is interpreted as any string. Like wildcards, regular expressions are a tool for text matching, but they describe what you need far more precisely than wildcards can, at the cost of being more complex. For example, you can write a regular expression that finds every string that starts with 0, followed by 2 or 3 digits, then a hyphen "-", and finally 7 or 8 digits (such as 010-12345678 or 0376-7654321).
Getting started
The best way to learn regular expressions is to start from examples: understand an example, then modify it and experiment with it. Below are some simple examples with detailed explanations.
If you search for hi in an English novel, you can use the regular expression hi.
This is almost the simplest regular expression. It precisely matches a string of two characters: the first is h and the second is i. Regular expression tools usually provide a case-insensitive option; if it is selected, the expression matches any of the four variants hi, HI, Hi, and hI.
Unfortunately, many words contain the two consecutive characters hi, such as him, history, and high. If you search with hi, the hi inside these words is found as well. To find exactly the word hi, we should use \bhi\b.
\b is a special code defined by regular expressions (well, some people call it a metacharacter). It represents the start or end of a word, that is, a word boundary. Although English words are usually separated by spaces, punctuation marks, or line breaks, \b does not match any of these separator characters. It matches only a position.
To be more precise, \b matches a position where the character before it and the character after it are not both \w (one is, and the other is not or does not exist).
If you are looking for hi followed not too far behind by Lucy, you should use \bhi\b.*\bLucy\b.
Here, . is another metacharacter; it matches any character except a line break. * is also a metacharacter, but it represents neither a character nor a position. It represents a quantity: it specifies that the content before * may be repeated any number of times in a row for the whole expression to match. Together, .* therefore means any number of characters that are not line breaks. Now the meaning of \bhi\b.*\bLucy\b is obvious: first the word hi, then any number of arbitrary characters (but no line breaks), and finally the word Lucy.
The line break is the character '\n', whose ASCII code is 10 (hexadecimal 0x0A).
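As a quick check of the expressions above, here is a minimal sketch. It assumes Python's re module purely for illustration (the tutorial itself targets the .NET engine), and the sample sentence is invented. It compares plain hi with \bhi\b and then tries \bhi\b.*\bLucy\b.

```python
import re

text = "hi, this is his history; hi again. Then Lucy said hi."

# Plain "hi" also hits the "hi" inside "this", "his" and "history".
plain_hits = re.findall(r"hi", text)

# \bhi\b only matches "hi" when it stands alone as a word.
word_hits = re.findall(r"\bhi\b", text)

# The word hi, then any characters except line breaks, then the word Lucy.
match = re.search(r"\bhi\b.*\bLucy\b", text)
span_text = match.group(0) if match else None

print(len(plain_hits), len(word_hits))
print(span_text)
```

The raw-string prefix r"..." keeps Python from interpreting \b as a backspace before the regular expression engine sees it.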
Combined with other metacharacters, we can construct more powerful regular expressions. For example:
0\d\d-\d\d\d\d\d\d\d\d matches a string that starts with 0, followed by two digits, then a hyphen "-", and finally eight digits (that is, a Chinese phone number; of course, this example only matches numbers with a three-digit area code).
Here, \d is a new metacharacter that matches one digit (0, or 1, or 2, or ...). - is not a metacharacter; it matches only itself, a hyphen (or minus sign).
To avoid so many annoying repetitions, we can also write this expression as 0\d{2}-\d{8}. Here, the {2} ({8}) after \d means that the preceding \d must be repeated exactly 2 times (8 times).
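The two spellings can be compared directly. This is a small sketch that again assumes Python's re module and uses made-up sample strings; fullmatch is used so that the whole string has to fit the pattern.

```python
import re

long_form = re.compile(r"0\d\d-\d\d\d\d\d\d\d\d")
short_form = re.compile(r"0\d{2}-\d{8}")

samples = ["010-12345678", "0376-7654321", "110-12345678", "010-1234567"]

# For each sample, record whether the long and the short spelling accept it.
results = {
    s: (long_form.fullmatch(s) is not None, short_form.fullmatch(s) is not None)
    for s in samples
}

for sample, (by_long, by_short) in results.items():
    print(sample, by_long, by_short)
```

Both spellings accept only the first sample; the other three have a four-digit area code, a leading 1, or only seven local digits.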
Test Regular Expression
Other available test tools:
- Regexbuddy
- Javascript Regular Expression Online Testing Tool
If you do not find regular expressions hard to read and write, you are either a genius or not from Earth. Regular expression syntax is a headache even for people who use it often. Because expressions are hard to read and write and easy to get wrong, it is worth finding a tool to test them.
Because regular expressions differ in their details from one environment to another, and this tutorial describes their behavior under Microsoft .NET Framework 2.0, I recommend a .NET tool, Regex Tester. First make sure that .NET Framework 2.0 is installed, then download Regex Tester. It is portable ("green") software: after downloading, open the archive and run regextester.exe directly.
[Figure: Regex Tester at run time]
Metacharacters
Now you know several useful metacharacters, such as \b, ., *, and \d. Regular expressions have more metacharacters; for example, \s matches any whitespace character, including spaces, tabs, line breaks, and Chinese full-width spaces. \w matches letters, digits, underscores, or Chinese characters.
The special handling of Chinese characters is supported by the regular expression engine provided by .NET. For other environments, consult the relevant documentation.
Here are more examples:
\ba\w*\b matches a word that starts with the letter a: first the start of a word (\b), then the letter a, then any number of letters or digits (\w*), and finally the end of the word (\b).
Well, now let's talk about what "word" means in regular expressions: one or more consecutive \w characters. Yes, it has little to do with the thousands of things of the same name that you have to memorize when learning English.
\d+ matches one or more consecutive digits. Here, + is a metacharacter similar to *. The difference is that * matches any number of repetitions (possibly 0), while + matches one or more repetitions.
\b\w{6}\b matches a word of exactly 6 characters (letters or digits).
Table 1. Common metacharacters
Code | Description
. | Matches any character except a line break
\w | Matches a letter, digit, underscore, or Chinese character
\s | Matches any whitespace character
\d | Matches a digit
\b | Matches the start or end of a word
^ | Matches the start of the string
$ | Matches the end of the string
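The metacharacters in Table 1 can be tried out in a few lines. This is only a sketch: it assumes Python's re module rather than the .NET engine, and the sample sentence is invented.

```python
import re

text = "an apple and 42 bananas cost 1234 yuan, almost"

# Words that start with the letter a.
a_words = re.findall(r"\ba\w*\b", text)

# Runs of one or more digits.
numbers = re.findall(r"\d+", text)

# Words that are exactly six characters long.
six_letter_words = re.findall(r"\b\w{6}\b", text)

# \s+ used as a separator: split on any run of whitespace.
fields = re.split(r"\s+", "a b\tc\nd")

print(a_words, numbers, six_letter_words, fields)
```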
The metacharacters ^ (on the same key as the digit 6) and $ both match a position, somewhat like \b. ^ matches the start of the string being searched, and $ matches its end. These two codes are very useful when validating input. For example, if a website requires that the QQ number you enter be 5 to 12 digits, you can use ^\d{5,12}$.
The {5,12} here is similar to the {2} introduced earlier, except that {2} means exactly 2 repetitions, while {5,12} means at least 5 and at most 12 repetitions; otherwise there is no match.
Because ^ and $ are used, the entire input string must match \d{5,12}. In other words, the whole input must consist of 5 to 12 digits, so if the entered QQ number matches this regular expression, it meets the requirement.
Like the case-insensitive option, some regular expression tools also have a multiline option. If it is selected, ^ and $ mean the start and end of a line instead.
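A minimal sketch of this kind of input validation, assuming Python's re module; the helper name is_valid_qq is hypothetical and chosen only for this example. The second half shows the multiline option changing what ^ and $ mean.

```python
import re

QQ_PATTERN = re.compile(r"^\d{5,12}$")

def is_valid_qq(value: str) -> bool:
    """Return True if the whole input consists of 5 to 12 digits."""
    return QQ_PATTERN.search(value) is not None

checks = {s: is_valid_qq(s) for s in ["12345", "1234", "1234567890123", "12345a", "abc12345"]}

lines = "123\nabc\n456"
# Without the multiline option, ^ and $ refer to the whole string.
whole_string_hits = re.findall(r"^\d+$", lines)
# With it, they refer to each line.
per_line_hits = re.findall(r"^\d+$", lines, re.MULTILINE)

print(checks)
print(whole_string_hits, per_line_hits)
```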
Character escape
If you want to find the metacharacters themselves, for example . or *, you run into a problem: you cannot specify them directly, because they would be interpreted with their special meaning. In this case, you must use \ to cancel the special meaning of these characters. So you should write \. and \*. Of course, to search for \ itself, you write \\.
For example, unibetter\.com matches unibetter.com, and C:\\Windows matches C:\Windows.
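The effect of escaping is easy to see in a short experiment. The sketch below assumes Python's re module; re.escape is that module's helper for escaping a literal string and is not part of the tutorial's .NET toolset.

```python
import re

# Unescaped, the dot matches any character, so "unibetterXcom" slips through.
loose = re.fullmatch(r"unibetter.com", "unibetterXcom") is not None
strict = re.fullmatch(r"unibetter\.com", "unibetterXcom") is not None
strict_ok = re.fullmatch(r"unibetter\.com", "unibetter.com") is not None

# A literal backslash is written as \\ inside the pattern.
path_match = re.search(r"C:\\Windows", r"C:\Windows\System32").group(0)

# Escaping an arbitrary literal automatically.
escaped = re.escape("1.5*2")

print(loose, strict, strict_ok, path_match, escaped)
```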
Repetition
You have already seen these ways of matching repetition: *, +, {2}, and {5,12}. The following are all the qualifiers (also called quantifiers) in regular expressions, that is, codes that specify a count, such as * or {5,12}:
Table 2. Common qualifiers
Code/syntax | Description
* | Repeat zero or more times
+ | Repeat one or more times
? | Repeat zero or one time
{n} | Repeat n times
{n,} | Repeat n or more times
{n,m} | Repeat n to m times
The following are some examples that use repetition:
Windows\d+ matches Windows followed by one or more digits
^\w+ matches the first word of a line (or the first word of the whole string; which of the two it means depends on the option settings)
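A short sketch of these qualifiers in use, assuming Python's re module and invented sample text. The colou?r pattern is an extra illustration of ? that does not appear in the tutorial.

```python
import re

# Windows followed by one or more digits.
versions = re.findall(r"Windows\d+", "Windows98, Windows2000, Windows and WindowsXP")

text = "hello world\nsecond line"
# First word of the whole string (default meaning of ^).
first_word_of_string = re.findall(r"^\w+", text)
# First word of every line (multiline option).
first_word_of_lines = re.findall(r"^\w+", text, re.MULTILINE)

# ? makes the preceding character optional: zero or one repetition.
spellings = re.findall(r"colou?r", "color colour colouur")

print(versions, first_word_of_string, first_word_of_lines, spellings)
```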
Character class
Finding digits, letters or digits, and whitespace is simple, because there are already metacharacters for those character sets. But what if you want to match a character set that has no predefined metacharacter (such as the vowels a, e, i, o, u)?
It is easy: just list the characters in square brackets. For example, [aeiou] matches any English vowel, and [.?!] matches a punctuation mark (. or ? or !).
We can also easily specify a character range. For example, [0-9] means exactly the same as \d: one digit. Similarly, [a-z0-9A-Z_] is fully equivalent to \w (if only English is considered).
The following is a more complex expression: \(?0\d{2}[) -]?\d{8}.
( and ) are also metacharacters, which will be covered later in the grouping section, so they need to be escaped here.
This expression matches phone numbers in several formats, such as (010)88886666, 022-22334455, or 02912345678. Let's analyze it: first comes an escaped \(, which may appear 0 or 1 times (?); then a 0 followed by two digits (\d{2}); then one of ), -, or a space, which appears once or not at all (?); and finally eight digits (\d{8}).
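The analysis can be checked with a small sketch, assuming Python's re module. The sample numbers are the ones quoted in the text plus two that should be rejected.

```python
import re

phone = re.compile(r"\(?0\d{2}[) -]?\d{8}")

samples = ["(010)88886666", "022-22334455", "02912345678", "12345678", "010-123"]
accepted = [s for s in samples if phone.fullmatch(s)]

# Character classes on their own: count vowels and find sentence-ending marks.
vowels = re.findall(r"[aeiou]", "regular expression")
enders = re.findall(r"[.?!]", "Really? Yes. Great!")

print(accepted, len(vowels), enders)
```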
Branch conditions
Unfortunately, the expression above also matches "incorrect" formats such as 010)12345678 or (022-87654321. To solve this problem, we need branch conditions. A branch condition in a regular expression means that there are several rules, and a match occurs if any one of them is satisfied. The way to write this is to separate the different rules with |. Not clear? Never mind, look at the examples:
0\d{2}-\d{8}|0\d{3}-\d{7} matches two kinds of hyphen-separated phone numbers: one with a three-digit area code and an eight-digit local number (such as 010-12345678), and one with a four-digit area code and a seven-digit local number (such as 0376-2233445).
\(0\d{2}\)[- ]?\d{8}|0\d{2}[- ]?\d{8} matches phone numbers with a three-digit area code. The area code may or may not be enclosed in parentheses, and the area code and the local number may be separated by a hyphen or a space, or not separated at all. You can try to use branch conditions to extend this expression so that it also supports four-digit area codes.
\d{5}-\d{4}|\d{5} matches US ZIP codes. The rule for US ZIP codes is five digits, or nine digits separated by a hyphen. This example is given because it illustrates a problem: when using branch conditions, pay attention to the order of the conditions. If you change it to \d{5}|\d{5}-\d{4}, it will only match 5-digit ZIP codes (and the first 5 digits of 9-digit ZIP codes). The reason is that the conditions are tested from left to right, and once one branch is satisfied, the other conditions are no longer considered.
Grouping
We have already covered how to repeat a single character (just put a qualifier after the character), but what if you want to repeat several characters? You can use parentheses to specify a subexpression (also called a group). You can then specify how many times the subexpression repeats, and you can also perform other operations on it (introduced later).
(\d{1,3}\.){3}\d{1,3} is a simple expression for matching IP addresses. To understand it, analyze it in this order: \d{1,3} matches a number of 1 to 3 digits; (\d{1,3}\.){3} matches such a number followed by an English period (the group as a whole), repeated 3 times; and finally comes one more number of 1 to 3 digits (\d{1,3}).
No number in an IP address can be greater than 255. Do not be fooled by the writers of season three of "24"...
Unfortunately, it also matches impossible IP addresses such as 256.300.888.999. If arithmetic comparison were available, this problem could be solved easily, but regular expressions do not provide any mathematical functions. So we can only use lengthy grouping, alternation, and character classes to describe a correct IP address: ((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?).
The key to understanding this expression is to understand 2[0-4]\d|25[0-5]|[01]?\d\d?. I will not go into detail here; you should be able to work out its meaning yourself.
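To see the difference between the simple and the strict IP expression, here is a small sketch. It assumes Python's re module, and the test addresses are chosen for illustration.

```python
import re

simple_ip = re.compile(r"(\d{1,3}\.){3}\d{1,3}")
strict_ip = re.compile(
    r"((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)"
)

addresses = ["192.168.0.1", "255.255.255.255", "0.0.0.0", "256.300.888.999"]

# The simple pattern only counts digits, so it accepts every sample.
simple_ok = [a for a in addresses if simple_ip.fullmatch(a)]
# The strict pattern also limits each number to the range 0-255.
strict_ok = [a for a in addresses if strict_ip.fullmatch(a)]

print(simple_ok)
print(strict_ok)
```

Reading the strict pattern branch by branch: 2[0-4]\d covers 200-249, 25[0-5] covers 250-255, and [01]?\d\d? covers 0-199.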
Negation
Sometimes you need to find characters that do not belong to some easily defined character class. For example, if you want to find any character except a digit, you need negation:
Table 3. Common negation codes
Code/syntax | Description
\W | Matches any character that is not a letter, digit, underscore, or Chinese character
\S | Matches any character that is not whitespace
\D | Matches any character that is not a digit
\B | Matches a position that is not the start or end of a word
[^x] | Matches any character except x
[^aeiou] | Matches any character except the letters aeiou
Example: \S+ matches a string that contains no whitespace characters.
<a[^>]+> matches a string that starts with a and is enclosed in angle brackets.
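The negated codes can be explored with a few one-liners. This sketch assumes Python's re module and uses invented sample strings.

```python
import re

# \S+ : runs of non-whitespace characters.
tokens = re.findall(r"\S+", "a  bc\tdef\n")

# <a[^>]+> : an opening a tag with at least one character after the a.
html = '<a href="x.html">link</a> <b>bold</b>'
anchors = re.findall(r"<a[^>]+>", html)

# \D : delete everything that is not a digit.
digits_only = re.sub(r"\D", "", "010-1234 5678")

# [^aeiou] : delete everything that is not a lowercase vowel.
vowels_only = re.sub(r"[^aeiou]", "", "regular expression")

print(tokens, anchors, digits_only, vowels_only)
```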
Backreferences
After a subexpression is specified with parentheses, the text matched by that subexpression (that is, the content captured by the group) can be processed further in the expression or in other programs. By default, each group automatically gets a group number. The rule is: going from left to right and using each group's left parenthesis as the marker, the first group that appears is number 1, the second is number 2, and so on.
A backreference is used to search again for text matched by an earlier group. For example, \1 stands for the text matched by group 1. Hard to understand? See the example:
\b(\w+)\b\s+\1\b can be used to match repeated words, such as go go or kitty kitty. This expression first matches a word, that is, one or more letters or digits between the start and end of a word (\b(\w+)\b); this word is captured in the group numbered 1; then come one or more whitespace characters (\s+); and finally the content captured in group 1, that is, the word matched earlier (\1).
You can also specify the group name of a subexpression yourself. To do so, use the following syntax: (?<Word>\w+) (or replace the angle brackets with single quotes: (?'Word'\w+)). This names the group for \w+ as Word. To backreference the content captured by this group, use \k<Word>, so the previous example can also be written as \b(?<Word>\w+)\b\s+\k<Word>\b.
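Here is a minimal sketch of numbered and named backreferences, assuming Python's re module. Note that Python spells named groups differently from the .NET syntax described above: (?P<Word>...) defines the group and (?P=Word) refers back to it. The sample sentence is invented.

```python
import re

text = "go go to the the kitty kitty shop"

# Numbered backreference: \1 repeats whatever group 1 captured.
numbered = re.compile(r"\b(\w+)\b\s+\1\b")
repeats = [m.group(0) for m in numbered.finditer(text)]

# Named backreference, written in Python's own syntax.
named = re.compile(r"\b(?P<Word>\w+)\b\s+(?P=Word)\b")
named_repeats = [m.group("Word") for m in named.finditer(text)]

# Captured text can also be reused in a replacement string.
deduplicated = numbered.sub(r"\1", text)

print(repeats, named_repeats, deduplicated)
```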
Parentheses are used with many special-purpose syntaxes. The most common ones are listed below:
Table 4. Common grouping syntax
Category | Code/syntax | Description
Capture | (exp) | Matches exp and captures the text into an automatically numbered group
Capture | (?<name>exp) | Matches exp and captures the text into a group named name; can also be written (?'name'exp)
Capture | (?:exp) | Matches exp, does not capture the matched text, and does not assign a group number to this group
Zero-width assertion | (?=exp) | Matches the position before exp
Zero-width assertion | (?<=exp) | Matches the position after exp
Zero-width assertion | (?!exp) | Matches a position that is not followed by exp
Zero-width assertion | (?<!exp) | Matches a position that is not preceded by exp
Comment | (?#comment) | Has no effect on how the regular expression is processed; it provides a comment for human readers
We have already discussed the first two syntaxes. The third, (?:exp), does not change how the regular expression is processed, but the content matched by such a group is not captured into a group as with the first two, and the group has no group number.
Zero-width assertions
Earthlings, do you find these terms too complicated and too hard to remember? So do I. It is enough to know that such a thing exists; as for what it is called, let it go! "The nameless is the beginning of all things..."
The next four syntaxes are used to find things before or after certain content (but not including that content). In other words, like \b, ^, and $, they specify a position, and that position must satisfy a certain condition (an assertion), so they are also called zero-width assertions. Examples explain this best:
An assertion declares a fact that should be true. In a regular expression, matching continues only when the assertion is true.
(?=exp), also called a zero-width positive lookahead assertion, asserts that what follows the position where it appears matches the expression exp. For example, \b\w+(?=ing\b) matches the front part of words ending in ing (everything except the ing). When searching I'm singing while you're dancing., it matches sing and danc.
(?<=exp), also called a zero-width positive lookbehind assertion, asserts that what precedes its position matches the expression exp. For example, (?<=\bre)\w+\b matches the second half of words starting with re (everything except the re). When searching reading a book, it matches ading.
If you want to add a comma after every three digits of a long number (counting from the right, of course), you can search for the part that needs commas like this: ((?<=\d)\d{3})+\b. When it is used to search 1234567890, the result is 234567890.
The following example uses both kinds of assertion: (?<=\s)\d+(?=\s) matches digits delimited by whitespace (again, not including the whitespace).
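The lookahead and lookbehind examples can be reproduced in a few lines. This sketch assumes Python's re module, which only accepts lookbehind expressions of fixed width; the ones used here qualify. The last sample string is invented.

```python
import re

# Lookahead: the part of each word that comes before a final "ing".
stems = re.findall(r"\b\w+(?=ing\b)", "I'm singing while you're dancing.")

# Lookbehind: the part of each word that comes after a leading "re".
tails = re.findall(r"(?<=\bre)\w+\b", "reading a book")

# Groups of three digits, counted from the right, that have a digit before them.
comma_part = re.search(r"((?<=\d)\d{3})+\b", "1234567890").group(0)

# Both assertions together: digits with whitespace on each side.
spaced_numbers = re.findall(r"(?<=\s)\d+(?=\s)", "a 123 b 45c 678 ")

print(stems, tails, comma_part, spaced_numbers)
```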
Negative zero-width assertions
Earlier we mentioned how to find a character that is not a certain character or not in a certain character class (negation). But what if we only want to make sure that a certain character does not appear, without matching it? For example, if we want to find words that contain the letter q where the q is not followed by the letter u, we might try this:
\b\w*q[^u]\w*\b matches words containing a q that is not followed by the letter u. But if you run more tests (or if you are sharp enough to see it directly), you will find that the expression goes wrong when q appears at the end of a word, as in Iraq or Benq. This is because [^u] always matches one character, so if q is the last character of a word, the following [^u] matches the word separator after q (which may be a space, a period, or something else), and the following \w*\b then matches the next word. As a result, \b\w*q[^u]\w*\b can match the whole of Iraq fighting. A negative zero-width assertion solves this problem, because it only matches a position and does not consume any characters. Now we can solve the problem like this: \b\w*q(?!u)\w*\b.
The zero-width negative lookahead assertion (?!exp) asserts that what follows this position does not match the expression exp. For example, \d{3}(?!\d) matches three digits that are not followed by another digit, and \b((?!abc)\w)+\b matches words that do not contain the consecutive string abc.
Similarly, we can use (?<!exp), the zero-width negative lookbehind assertion, to assert that what precedes this position does not match the expression exp: (?<![a-z])\d{7} matches seven digits that are not preceded by a lowercase letter.
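The difference between [^u] and (?!u) shows up clearly in a short experiment. The sketch assumes Python's re module, and the sample strings are chosen for illustration.

```python
import re

naive = re.compile(r"\b\w*q[^u]\w*\b")
fixed = re.compile(r"\b\w*q(?!u)\w*\b")

# [^u] consumes the space after "Iraq", so the match runs into the next word.
naive_match = naive.search("Iraq fighting").group(0)
# (?!u) only inspects the next position and consumes nothing.
fixed_match = fixed.search("Iraq fighting").group(0)
q_words = fixed.findall("Iraq, Benq, quit, queen")

# Three digits that are not followed by another digit.
three_digits = re.findall(r"\d{3}(?!\d)", "1234 567")

# Words that do not contain the consecutive string abc.
no_abc = [m.group(0) for m in re.finditer(r"\b((?!abc)\w)+\b", "xabcx hello abc world")]

# Seven digits that are not preceded by a lowercase letter.
seven_digits = re.findall(r"(?<![a-z])\d{7}", "a1234567 B7654321")

print(naive_match, "|", fixed_match, q_words, three_digits, no_abc, seven_digits)
```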
Please analyze the expression (?<=<(\w+)>).*(?=<\/\1>) in detail; it shows the real purpose of zero-width assertions better than any other.
A more complex example: (?<=<(\w+)>).*(?=<\/\1>) matches the content inside simple HTML tags that have no attributes. (?<=<(\w+)>) specifies the prefix: a word enclosed in angle brackets (for example, <b>). Then comes .* (any string), and finally the suffix (?=<\/\1>). Note the \/ in the suffix, which uses the character escaping mentioned earlier; \1 is a backreference to the first captured group, that is, the content matched by the preceding (\w+). So if the prefix is actually <b>, the suffix is </b>. The whole expression matches the content between <b> and </b> (again, not including the prefix and the suffix themselves).
Comments
Another use of parentheses is to include comments with the syntax (?#comment). For example: 2[0-4]\d(?#200-249)|25[0-5](?#250-255)|[01]?\d\d?(?#0-199).
If you want to include comments, it is best to enable the "ignore whitespace in pattern" option. You can then freely add spaces, tabs, and line breaks when writing an expression, and they are ignored when the expression is used. With this option enabled, everything from # to the end of the line is also ignored as a comment. For example, an earlier expression can be written like this:
(?<=      # Prefix of the text to be matched
<(\w+)>   # Find letters or digits enclosed in angle brackets (that is, an HTML/XML tag)
)         # End of prefix
.*        # Match any text
(?=       # Suffix of the text to be matched
<\/\1>    # Find content enclosed in angle brackets: a "/" followed by the previously captured tag
)         # End of suffix
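Both comment styles can be tried in Python, whose re.VERBOSE flag plays the role of the ignore-whitespace option. This is an assumption made for illustration; the tutorial targets .NET. The sketch uses the 0-255 expression from this section rather than the tag expression, because Python's re module rejects the variable-width lookbehind that the tag expression needs.

```python
import re

# Inline comments with (?#...), written on a single line.
inline_octet = re.compile(r"2[0-4]\d(?#200-249)|25[0-5](?#250-255)|[01]?\d\d?(?#0-199)")

# The same expression in verbose mode: whitespace is ignored, # starts a comment.
verbose_octet = re.compile(
    r"""
      2[0-4]\d      # 200-249
    | 25[0-5]       # 250-255
    | [01]?\d\d?    # 0-199
    """,
    re.VERBOSE,
)

# Both spellings should accept exactly the numbers 0 to 255.
inline_accepts = [n for n in range(300) if inline_octet.fullmatch(str(n))]
verbose_accepts = [n for n in range(300) if verbose_octet.fullmatch(str(n))]

print(inline_accepts == verbose_accepts, verbose_accepts[-1])
```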
Greed and laziness
When a regular expression contains a qualifier that accepts repetition, the usual behavior is to match as many characters as possible (provided that the whole expression can still match). Consider the expression a.*b: it matches the longest string that starts with a and ends with b. If you use it to search aabab, it matches the whole string aabab. This is called greedy matching.
Sometimes we need lazy matching instead, that is, matching as few characters as possible. Every qualifier given above can be turned into its lazy form by adding a question mark ? after it. So .*? means: match any number of repetitions, but use as few repetitions as possible while still letting the whole match succeed. Now look at the lazy version of the example:
a.*?b matches the shortest string that starts with a and ends with b. Applied to aabab, it matches aab (characters 1 to 3) and ab (characters 4 to 5).
Why is the first match aab (characters 1 to 3) rather than ab (characters 2 to 3)? Simply put, regular expressions have another rule that takes priority over the lazy/greedy rule: the match that begins earliest wins.
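The aabab example can be run as written. This is a minimal sketch assuming Python's re module, whose greedy and lazy qualifiers behave as described here.

```python
import re

text = "aabab"

# Greedy: .* takes as much as it can while still ending in b.
greedy = re.search(r"a.*b", text).group(0)

# Lazy: .*? takes as little as it can, so several short matches are found.
lazy_all = re.findall(r"a.*?b", text)

# The earliest possible starting point still wins, so the first lazy match is aab.
first_lazy = re.search(r"a.*?b", text)

print(greedy, lazy_all, first_lazy.span())
```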
Table 5. Lazy qualifiers
Code/syntax | Description
*? | Repeat any number of times, but as few as possible
+? | Repeat one or more times, but as few as possible
?? | Repeat 0 or 1 times, but as few as possible
{n,m}? | Repeat n to m times, but as few as possible
{n,}? | Repeat n or more times, but as few as possible
Processing options
In C#, you can use the Regex(String, RegexOptions) constructor to set the processing options of a regular expression. For example: Regex regex = new Regex(@"\ba\w{6}\b", RegexOptions.IgnoreCase);
Several options have already been mentioned above, such as ignoring case and multiline processing. These options change the way regular expressions are processed. Below are the regular expression options commonly used in .NET:
Table 6. Common processing options
Name | Description
IgnoreCase (ignore case) | Matching is case insensitive.
Multiline (multiline mode) | Changes the meaning of ^ and $ so that they match at the beginning and end of any line, not just the beginning and end of the whole string. (In this mode, the precise meaning of $ is: match the position before \n as well as the position at the end of the string.)
Singleline (single-line mode) | Changes the meaning of . so that it matches every character (including the line break \n).
IgnorePatternWhitespace (ignore whitespace) | Ignores unescaped whitespace in the expression and enables comments marked with #.
RightToLeft (search from right to left) | Matches from right to left rather than from left to right.
ExplicitCapture (explicit capture) | Captures only groups that are explicitly named.
ECMAScript (JavaScript-compatible mode) | Makes the expression behave the same way as it does in JavaScript.
A frequently asked question is: can multiline mode and single-line mode only be used one at a time, never together? The answer is: no. The two options have nothing to do with each other, apart from their (confusingly) similar names.
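The options can be combined freely, as this sketch suggests. It assumes Python's re module, where the counterparts of IgnoreCase, Multiline, and Singleline are re.IGNORECASE, re.MULTILINE, and re.DOTALL; the sample text is invented.

```python
import re

# Rough counterpart of the C# example: 7-character words starting with a or A.
a_words = re.findall(r"\ba\w{6}\b", "Another address ABSOLUTE apple", re.IGNORECASE)

text = "intro\nSTART one\ntwo END\noutro"
pattern = r"^start.*?end$"

# Multiline and single-line (DOTALL) modes used together with ignore-case.
both = re.search(pattern, text, re.IGNORECASE | re.MULTILINE | re.DOTALL)
# Without DOTALL, . cannot cross the line break, so nothing matches.
without_dotall = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
# Without MULTILINE, ^ only matches at the very start of the string.
without_multiline = re.search(pattern, text, re.IGNORECASE | re.DOTALL)

print(a_words, both.group(0) if both else None, without_dotall, without_multiline)
```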
Balanced group/recursive match
The balanced group syntax described here is supported by. NET Framework. Other languages/libraries do not necessarily support this function, or different syntaxes are required to support this function.
Sometimes we need to match a nestable hierarchical structure such as (100 * (50 + 15)). Simply using \(.+\) matches everything between the leftmost left parenthesis and the rightmost right parenthesis (we are discussing greedy mode here; lazy mode has a similar problem). If the left and right parentheses in the original string do not occur the same number of times, as in (5 / (3 + 2))), the two counts in our match result will not be equal either. Is there a way to match the longest properly paired parenthesized content in such a string?
To keep ( and \( from thoroughly confusing your brain, let's use angle brackets instead of parentheses. Now our question becomes: how do we capture the content inside the longest properly paired angle brackets in a string like XX <AA <BBB> AA> YY?
The following syntax structure is required:
- (?'group') names the captured content group and pushes it onto the stack
- (?'-group') pops the most recently pushed capture named group off the stack; if the stack is already empty, the match of this group fails
- (?(group)yes|no) if a capture named group exists on the stack, continues by matching the yes part; otherwise continues with the no part
- (?!) a zero-width negative lookahead assertion; because it has no subexpression, any attempt to match it always fails
If you are not a programmer (or you call yourself a programmer but do not know what a stack is), you can think of the first three syntaxes like this: the first writes a "group" on the blackboard, the second erases one "group" from the blackboard, and the third checks whether any "group" is still written on the blackboard; if so, matching continues with the yes part, otherwise with the no part.
What we need to do is push an "Open" every time we encounter a left bracket and pop one every time we encounter a right bracket, and at the end check whether the stack is empty. If it is not empty, there are more left brackets than right brackets, and the match should fail. The regular expression engine then backtracks (gives up some characters at the beginning or the end) and tries to match the whole expression again.
<                         # The outermost left bracket
    [^<>]*                # Non-bracket content after the outermost left bracket
    (
        (
            (?'Open'<)    # A left bracket: write an "Open" on the blackboard
            [^<>]*        # Match the non-bracket content after the left bracket
        )+
        (
            (?'-Open'>)   # A right bracket: erase one "Open"
            [^<>]*        # Match the non-bracket content after the right bracket
        )+
    )*
    (?(Open)(?!))         # Before the outermost right bracket, check whether any "Open" is left on the blackboard; if so, the match fails
>                         # The outermost right bracket
One of the most common applications of balanced groups is matching HTML. The following example matches nested <div> tags: <div[^>]*>[^<>]*(((?'Open'<div[^>]*>)[^<>]*)+((?'-Open'</div>)[^<>]*)+)*(?(Open)(?!))</div>.
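Balanced groups are specific to .NET, and Python's built-in re module has no equivalent. To make the push/pop idea concrete, the following sketch replaces the regular expression with an explicit counter that plays the role of the "Open" stack. The function name first_balanced_span is hypothetical, and the code imitates the idea for angle brackets only; it is not how the .NET engine is implemented.

```python
from typing import Optional

def first_balanced_span(text: str, open_ch: str = "<", close_ch: str = ">") -> Optional[str]:
    """Return the first substring that starts at an opening bracket and
    ends at the closing bracket that balances it, or None if there is none."""
    for start, ch in enumerate(text):
        if ch != open_ch:
            continue
        depth = 0
        for end in range(start, len(text)):
            if text[end] == open_ch:
                depth += 1          # like (?'Open'<): write an "Open"
            elif text[end] == close_ch:
                depth -= 1          # like (?'-Open'>): erase an "Open"
                if depth == 0:      # blackboard is clean: balanced span found
                    return text[start:end + 1]
        # Text ran out with brackets still open: retry from a later bracket,
        # which loosely mirrors the engine backtracking.
    return None

nested = first_balanced_span("xx <aa <bbb> <bbb> aa> yy")
unbalanced = first_balanced_span("<<a>")
none_found = first_balanced_span("no brackets here")

print(nested, unbalanced, none_found)
```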
What has not been covered
I have described a large number of elements for constructing regular expressions, but there are still some things I have not mentioned. The following is a list of these elements, with their syntax and a brief description. You can find more detailed references on the Internet and learn about them when you need them. If you have installed the MSDN Library, you can also find detailed documentation on regular expressions under .NET there.
Table 7. Syntax not discussed in detail
Code/syntax | Description
\a | Alarm character (printing it makes the computer beep)
\b | Usually a word boundary, but inside a character class it stands for backspace
\t | Tab character
\r | Carriage return
\v | Vertical tab
\f | Form feed
\n | Line break
\e | Escape
\0nn | The character whose ASCII code in octal is nn
\xnn | The character whose ASCII code in hexadecimal is nn
\unnnn | The character whose Unicode code in hexadecimal is nnnn
\cN | ASCII control character; for example, \cC stands for Ctrl+C
\A | Start of the string (similar to ^, but not affected by the multiline option)
\Z | End of the string or end of the line (not affected by the multiline option)
\z | End of the string (similar to $, but not affected by the multiline option)
\G | Start of the current search
\p{name} | The Unicode character class named name, for example \p{IsGreek}
(?>exp) | Greedy (non-backtracking) subexpression
(?<x-y>exp) | Balanced group
(?im-nsx:exp) | Changes the processing options within the subexpression exp
(?im-nsx) | Changes the processing options for the rest of the expression
(?(exp)yes|no) | Treats exp as a zero-width positive lookahead assertion; if it matches at this position, uses yes as the expression for this group, otherwise uses no
(?(exp)yes) | Same as above, with an empty expression as no
(?(name)yes|no) | If the group named name has captured content, uses yes as the expression, otherwise uses no
(?(name)yes) | Same as above, with an empty expression as no