This is my first reasonably systematic study of regular expressions. The article uses PHP as the example language.
Basic concepts
Regular expression = ordinary characters (such as A-Z) + delimiters (forward slash (/), hash sign (#), or tilde (~)) + special characters (called metacharacters).
Matching principle
In brief: the engine usually starts at position 0 of the string and attempts a match. If the match succeeds, the matched substring is stored. If the match fails at a position, the engine moves forward one position and tries again from position 1, and so on. The search ends when a match succeeds, or when the last position has been tried and no matching substring was found.
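A minimal sketch of this left-to-right scan, assuming the built-in preg_match with the PREG_OFFSET_CAPTURE flag to report where the match was found. The sample strings are made up for illustration.

```php
<?php
// The subject does not match at positions 0 or 1, so the engine
// keeps advancing until the pattern succeeds at position 2.
$subject = 'xxabcabc';
$found = preg_match('/abc/', $subject, $m, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE, $m[0] is [matched text, byte offset].
echo $found, ' ', $m[0][0], ' at offset ', $m[0][1], "\n";

// No position in the subject matches this pattern, so the result is 0.
$none = preg_match('/abd/', $subject);
echo $none, "\n";
```

Only the first match is reported, even though "abc" also occurs at offset 5.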
Matching mode
Do the names below remind you of the singleton design pattern?
- Greedy mode
When the engine can either match or skip, it matches first. This is the default behavior of quantifiers such as * and +.
- Lazy mode
When the engine can either match or skip, it skips first. This is enabled by placing ? after a quantifier, as in *? or +?.
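The difference between the two modes can be sketched as follows. The sample markup is hypothetical and only serves to show where each match ends.

```php
<?php
$subject = '<b>one</b><b>two</b>';

// Greedy: .* takes as much as it can, so the match runs to the last </b>.
preg_match('/<b>.*<\/b>/', $subject, $greedy);

// Lazy: .*? takes as little as it can, so the match stops at the first </b>.
preg_match('/<b>.*?<\/b>/', $subject, $lazy);

echo $greedy[0], "\n"; // <b>one</b><b>two</b>
echo $lazy[0], "\n";   // <b>one</b>
```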
Uses
- Find: check whether a string contains substrings that satisfy the regular expression, and collect the set of all such substrings.
- Replace: replace the substrings found in this way with a specified string.
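Both uses can be sketched with two built-in PHP functions, preg_match_all for finding and preg_replace for replacing. The subject string and the replacement word are assumptions chosen for illustration.

```php
<?php
$subject = 'cat bat rat';

// Find: collect every substring that satisfies the pattern.
$count = preg_match_all('/[cb]at/', $subject, $found);

// Replace: swap each matching substring for a fixed string.
$replaced = preg_replace('/[cb]at/', 'dog', $subject);

print_r($found[0]);   // the two matches: cat and bat
echo $replaced, "\n"; // dog dog rat
```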
Regular expression delimiters
Character | Description | Note |
/expression/ | The bounds of a complete regular expression | If the expression contains /, escape it with \, e.g. $p = "/ab\//" |
#expression# | The bounds of a complete regular expression (2) | If the expression contains #, escape it with \, e.g. $p = "#ab/\##" |
~expression~ | The bounds of a complete regular expression (3) | If the expression contains ~, escape it with \, e.g. $p = "~ab/#\~~" |
|expression| | The bounds of a complete regular expression (4) | If the expression contains |, escape it with \, e.g. $p = "|ab/#\~\||" |
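As a small illustration of why the choice of delimiter matters, the sketch below matches the same literal slash with three delimiters. The subject string is hypothetical.

```php
<?php
$subject = 'path: ab/cd';

// With / as the delimiter, the literal slash must be escaped.
$withSlash = preg_match('/ab\/cd/', $subject);

// With # or ~ as the delimiter, the same slash needs no escaping.
$withHash  = preg_match('#ab/cd#', $subject);
$withTilde = preg_match('~ab/cd~', $subject);

var_dump($withSlash, $withHash, $withTilde);
```

Picking a delimiter that does not occur in the expression keeps the pattern easier to read, which is why # or ~ is often preferred for URLs and file paths.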
Metacharacters: boundary anchors
Character | Description | Note |
^ start, $ end | Assertions. Used together, they require the pattern to match the whole target string. ^ matches the position before the first character of the string; $ matches the position after the last character | See code example 1 |
\b | Matches a word boundary, i.e. the position between a word and a space | |
\B | Matches a non-word boundary, i.e. any position that is not a word boundary | |
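The anchors above can be tried out with a few short checks. The words used here are made-up test data.

```php
<?php
// ^ and $ together: the pattern must cover the whole subject.
$whole   = preg_match('/^cat$/', 'cat');
$partial = preg_match('/^cat$/', 'concat');

// \b: "cat" must stand alone as a word.
$word   = preg_match('/\bcat\b/', 'a cat sat');
$inside = preg_match('/\bcat\b/', 'concatenate');

// \B: "cat" must sit inside a longer word.
$nonBoundary = preg_match('/\Bcat\B/', 'concatenate');

var_dump($whole, $partial, $word, $inside, $nonBoundary);
```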
Quantifier metacharacters
Character | Description (all are greedy: each returns the longest string it can obtain) | Note |
* | Matches the preceding sub-expression 0 or more times | Equivalent to {0,} |
? | Matches the preceding sub-expression 0 or 1 times | Equivalent to {0,1}. Lazy mode uses this same character after a quantifier |
+ | Matches the preceding sub-expression 1 or more times | Equivalent to {1,} |
{n} | Matches the preceding sub-expression exactly n times | |
{n,} | Matches the preceding sub-expression n or more times | |
{,n} | Matches the preceding sub-expression at most n times | Older PCRE versions do not treat {,n} as a quantifier; {0,n} is the safe form |
{n,m} | Matches the preceding sub-expression at least n times and at most m times | |
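A short sketch of the quantifiers in action. Each pattern is anchored with ^ and $ so that the count applies to the whole subject, and the test strings are assumptions for illustration.

```php
<?php
$star  = preg_match('/^ab*c$/', 'ac');       // * allows zero b's
$plus  = preg_match('/^ab+c$/', 'ac');       // + needs at least one b
$opt   = preg_match('/^colou?r$/', 'color'); // ? makes the u optional
$exact = preg_match('/^\d{4}$/', '2024');    // exactly four digits
$range = preg_match('/^\d{2,3}$/', '2024');  // two or three digits only
$least = preg_match('/^\d{2,}$/', '2024');   // two or more digits

var_dump($star, $plus, $opt, $exact, $range, $least);
```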
Ordinary metacharacters
Character | Description | Note |
\d | Matches a digit, equivalent to [0-9] | |
\D | Matches a non-digit, equivalent to [^0-9] | |
\w | Matches a letter, digit, or underscore, equivalent to [a-zA-Z0-9_] | |
\W | Matches any character that is not a letter, digit, or underscore, equivalent to [^a-zA-Z0-9_] | |
\s | Matches any whitespace character (including spaces, tabs, and form feeds), equivalent to [ \n\f\t\r\v] | |
\S | Matches any non-whitespace character, equivalent to [^ \n\f\t\r\v] | |
\ | Escape character | |
. | Matches any character other than a line break (\n) | |
[] | Character class: matches one character out of the enclosed set (an "or" among the listed characters) and may be followed by a quantifier. Inside the brackets some characters have special meanings: ^ negates the class, but only when it is the first character; - marks a character range; \ is the escape character | |
() | Encloses a sub-expression; to express "or" inside it, use the vertical bar | |
| | Starts an alternative branch | |
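The classes above can be sketched with a few one-liners. The sample strings are hypothetical, and preg_split is a built-in PHP function not otherwise covered in this article.

```php
<?php
// \d+ collects runs of digits.
preg_match_all('/\d+/', 'a1b22c333', $digits);

// \s+ matches runs of whitespace, so it splits on spaces, tabs and newlines.
$words = preg_split('/\s+/', "one two\tthree\nfour");

// [^aeiou] is a negated class: removing its matches leaves only the vowels.
$vowels = preg_replace('/[^aeiou]/', '', 'regular');

// [0-9a-f] combines two ranges inside one class.
$hex = preg_match('/^[0-9a-f]+$/', 'c0ffee');

// (gray|grey) uses | to choose between two branches.
$alt = preg_match('/^(gray|grey)$/', 'grey');

print_r($digits[0]);
print_r($words);
echo $vowels, "\n";
```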
Grouping metacharacters
Character | Description | Note |
() | Encloses an expression so that it can be combined with other metacharacters | |
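A minimal sketch of what grouping changes when a quantifier follows. The test string is an assumption for illustration.

```php
<?php
// Without parentheses, + applies only to the b.
$plain = preg_match('/^ab+$/', 'ababab');

// With parentheses, + applies to the whole group "ab".
$grouped = preg_match('/^(ab)+$/', 'ababab', $m);

// The group also captures: $m[1] holds the last repetition it matched.
var_dump($plain, $grouped, $m[1]);
```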
References
Character | Description | Note |
& | In the replacement string, & stands for the text matched by the entire regular expression | Only used when replacing; PHP's preg_replace writes this as $0 or \0 |
\n (n = 1, 2, ...) | The escape sequence \n stands for the text matched by the nth pair of parentheses in the regular expression | Usable in both find and replace |
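In PHP these references look as follows. This is a sketch using PHP's own notation ($0, $1, $2 in the replacement string and \1 inside the pattern), with made-up sample strings.

```php
<?php
// \1 inside the pattern must repeat whatever group 1 matched,
// so this finds doubled letters.
preg_match_all('/(\w)\1/', 'bookkeeper', $doubles);

// In the replacement string, $1 and $2 refer to the captured groups.
$swapped = preg_replace('/(\w+) (\w+)/', '$2 $1', 'hello world');

// $0 refers to the text matched by the whole pattern.
$wrapped = preg_replace('/\d+/', '[$0]', 'room 42');

print_r($doubles[0]);
echo $swapped, "\n";
echo $wrapped, "\n";
```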
Precedence of the metacharacters
Priority level | Character | Description |
1 | \ | Escape |
2 | \w, \d, etc. | |
3 | (), [] | |
4 | *, +, ?, {} | Repetition |
5 | ^, $ | Assertion, position |
6 | | | Alternation ("or") |
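The effect of these priorities can be sketched with two patterns that differ only in their parentheses. The subject strings are hypothetical.

```php
<?php
// | has the lowest priority, so this reads as (^ab) or (cd$).
$loose = preg_match('/^ab|cd$/', 'abXYZ');

// Parentheses bind earlier, so this reads as ^, then (ab or cd), then $.
$strict = preg_match('/^(ab|cd)$/', 'abXYZ');

// Escape has the highest priority: \+ is a literal plus, not a quantifier.
$literal = preg_match('/^1\+1$/', '1+1');

var_dump($loose, $strict, $literal);
```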
Some related PHP functions
1. preg_match($pattern, $subject, $matches);
Purpose: returns the number of successful matches, 0 or 1. The search stops after the first match.
Parameters: $pattern is the regular expression; $subject is the string to be searched; $matches is optional and stores an array of matching results, where $matches[0] contains the text matched by the entire pattern, $matches[1] is the text matched by the sub-pattern in the first pair of parentheses, and so on.
Code example:
$subject = "ABCdef";
$pattern = '/A(.*)(\w)d/';
preg_match($pattern, $subject, $matches);
print_r($matches);
Returned result:
Array
(
    [0] => ABCd
    [1] => B
    [2] => C
)
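A short sketch of how the return value might be handled in practice. Besides 0 and 1, preg_match returns false when an error occurs, so a strict comparison is the safer check. The subject string is made up.

```php
<?php
$result = preg_match('/(\d+)-(\d+)/', 'range 10-20', $matches);

if ($result === 1) {
    // $matches[0] is the full match, followed by one entry per group.
    echo "from {$matches[1]} to {$matches[2]}\n";
} elseif ($result === 0) {
    echo "no match\n";
} else {
    // false signals an error, for example an invalid pattern.
    echo "error\n";
}
```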
2. preg_match_all($pattern, $subject, $matches [, int $flags]);
Purpose: finds every match in the subject and collects the results in an array.
Parameters: $pattern is the regular expression; $subject is the string to be searched; $matches is optional and stores a multidimensional array of matching results; $flags can take several values. See http://www.360doc.com/content/12/0828/13/10503611_232787142.shtml
Case flag = 1, PREG_PATTERN_ORDER (the default): $matches[0] is an array of all texts matched by the entire pattern, $matches[1] is an array of the texts matched by the sub-pattern in the first pair of parentheses, and so on.
Case flag = 2, PREG_SET_ORDER: $matches[0] is the array for the first match (the full match followed by its sub-pattern texts), $matches[1] is the array for the second match, and so on.
Case flag = PREG_OFFSET_CAPTURE: every matched text is returned together with its byte offset in the subject, as a [text, offset] pair. $matches[0] still contains the array for the entire pattern, $matches[1] the array for the sub-pattern in the first pair of parentheses, and so on.
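The three flags can be compared side by side in a small sketch. The subject and pattern are assumptions chosen so that there are two matches with two groups each.

```php
<?php
$subject = 'a1 b2';
$pattern = '/([a-z])(\d)/';

// Grouped by pattern part: full matches, then group 1, then group 2.
preg_match_all($pattern, $subject, $byPattern, PREG_PATTERN_ORDER);

// Grouped by match: one array per match.
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);

// Each text is paired with its byte offset in the subject.
preg_match_all($pattern, $subject, $withOffsets, PREG_OFFSET_CAPTURE);

print_r($byPattern);
print_r($bySet);
print_r($withOffsets[0]);
```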
Common examples
$mail = '/[\w.]+@\w+\.\w{3}/'; matches a mailbox address: before the @ there may be dots, letters, digits, or underscores; after it come letters, digits, or underscores, then a dot, then three letters, digits, or underscores.
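A sketch of how such a mailbox pattern might be used. The pattern below is an assumption based on the description above, with ^ and $ added so that the whole string is checked, and the addresses are made up. It is deliberately simple and rejects many valid addresses, for example those with a two-letter domain suffix.

```php
<?php
// Hypothetical pattern: dots or word characters, @, word characters,
// a literal dot, then exactly three word characters.
$mail = '/^[\w.]+@\w+\.\w{3}$/';

$ok      = preg_match($mail, 'john.doe@example.com');
$noDot   = preg_match($mail, 'john.doe@example');
$tooLong = preg_match($mail, 'john.doe@example.info');

var_dump($ok, $noDot, $tooLong);
```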