PCRE
There are two ways to use regular expressions in PHP: the PCRE functions (Perl-compatible syntax, preg_*) and the POSIX functions (POSIX extended syntax, ereg_*). Fortunately, the POSIX family has been deprecated since PHP 5.3.0.
Regular expressions
Delimiters
A pattern must be enclosed in delimiters. A delimiter can be any character that is not alphanumeric, a backslash, or whitespace. Frequently used delimiters are forward slashes (/), hash signs (#), and tildes (~). The following patterns all use valid delimiters:
/foo bar/
#^[^0-9]$#
+php+
%[a-zA-Z0-9_-]%
{this is a pattern}
Pattern modifiers can be added after the closing delimiter.
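As a minimal sketch of how delimiters and a trailing modifier look in practice, the snippet below runs the same match with three different delimiters. The subject strings and variable names are made up for illustration.

```php
<?php
// Hypothetical subject string, used only for illustration.
$subject = 'Learning PHP regex';

// The same pattern written with three different delimiters.
// The "i" modifier follows the closing delimiter in each case.
$slash = preg_match('/php/i', $subject);
$hash  = preg_match('#php#i', $subject);
$tilde = preg_match('~php~i', $subject);

// Picking a delimiter that does not occur in the pattern avoids escaping:
// with "#" as the delimiter, the slashes in a path need no backslashes.
$path = preg_match('#^/usr/local/#', '/usr/local/bin/php');

var_dump($slash, $hash, $tilde, $path); // each is int(1)
```

All three delimiters behave identically; the choice only affects which characters inside the pattern have to be escaped.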
Metacharacters
Some characters carry a special meaning and no longer simply stand for themselves. Characters with such an encoded meaning in a pattern are called metacharacters.
Metacharacter   Description
\               General escape character
^               Asserts the start of the subject (or the start of a line in multiline mode)
$               Asserts the end of the subject (or the end of a line in multiline mode)
.               Matches any character except a newline (by default)
[               Starts a character class definition
]               Ends a character class definition
|               Starts an alternative branch
(               Starts a subpattern
)               Ends a subpattern
?               Quantifier matching 0 or 1 times; placed after another quantifier, it changes that quantifier's greediness (see Quantifiers)
*               Quantifier matching 0 or more times
+               Quantifier matching 1 or more times
{               Starts a min/max quantifier
}               Ends a min/max quantifier
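To illustrate several of these metacharacters together, here is a small sketch that combines anchors, a subpattern with alternation, an escaped dot, and the ? quantifier. The file names are hypothetical sample data.

```php
<?php
// ^ and $ anchor the match, (cat|dog) is a subpattern with two branches,
// \. is an escaped (literal) dot, and e? makes the "e" optional.
$pattern = '/^(cat|dog)\.jpe?g$/';

$r1 = preg_match($pattern, 'cat.jpg');    // 1
$r2 = preg_match($pattern, 'dog.jpeg');   // 1
$r3 = preg_match($pattern, 'catXjpg');    // 0: "\." only matches a literal dot
$r4 = preg_match($pattern, 'my cat.jpg'); // 0: "^" requires the match to start the subject

var_dump($r1, $r2, $r3, $r4);
```

Without the backslash, the dot would match any character and 'catXjpg' would be accepted as well.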
The part of a pattern enclosed in square brackets is called a "character class." Only the following metacharacters are available inside a character class:
Metacharacter   Description
\               General escape character
^               Negates the class, but only if it is the first character (right after the opening square bracket)
-               Indicates a character range
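The three metacharacters that are special inside a character class can be sketched as follows. The sample strings are assumptions chosen for illustration.

```php
<?php
// "-" between two characters marks a range: digits and the letters a to f.
$isHex = preg_match('/^[0-9a-f]+$/', 'c0ffee');

// "^" as the first character negates the class: any character that is not a digit.
$hasNonDigit = preg_match('/[^0-9]/', '2024');

// A backslash removes the special meaning, so this class matches a literal "^" or "-".
$hasLiteral = preg_match('/[\^\-]/', 'a-b');

var_dump($isHex, $hasNonDigit, $hasLiteral); // int(1), int(0), int(1)
```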
Character class
A character class is the content enclosed in square brackets.
PCRE also provides predefined character classes:
Class   Description
\d      Any decimal digit
\D      Any character that is not a decimal digit
\h      Any horizontal whitespace character (since PHP 5.2.4)
\H      Any character that is not horizontal whitespace (since PHP 5.2.4)
\s      Any whitespace character
\S      Any character that is not whitespace
\v      Any vertical whitespace character (since PHP 5.2.4)
\V      Any character that is not vertical whitespace (since PHP 5.2.4)
\w      Any word character
\W      Any non-word character
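A short sketch of the predefined classes \d, \w, \s and \h in use. The input string is hypothetical; it is written in double quotes so that \t becomes a real tab character.

```php
<?php
// Hypothetical input containing a space and a tab.
$text = "Order 66\tshipped";

preg_match('/\d+/', $text, $digits);  // first run of decimal digits
preg_match('/\w+/', $text, $word);    // first run of word characters
$parts  = preg_split('/\s+/', $text); // split on any whitespace
$hasHws = preg_match('/\h/', $text);  // horizontal whitespace (space or tab)

var_dump($digits[0]); // string(2) "66"
var_dump($word[0]);   // string(5) "Order"
var_dump($parts);     // "Order", "66", "shipped"
var_dump($hasHws);    // int(1)
```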
Atoms
Visible atoms
Such as abc
Invisible atoms
Such as \n (newline) or \t (tab)
Quantifiers
Quantifier   Meaning
*            Equivalent to {0,}
+            Equivalent to {1,}
?            Equivalent to {0,1}
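The equivalences can be checked with a small sketch that compares each shorthand quantifier with its braces form over a few sample strings. The samples and variable names are assumptions for illustration only.

```php
<?php
// Hypothetical test inputs: zero to three repetitions of "a".
$samples = array('', 'a', 'aa', 'aaa');

// Each pair holds a shorthand quantifier and its {min,max} equivalent.
$pairs = array(
    array('/^a*$/', '/^a{0,}$/'),
    array('/^a+$/', '/^a{1,}$/'),
    array('/^a?$/', '/^a{0,1}$/'),
);

$allEqual = true;
foreach ($pairs as $pair) {
    list($short, $long) = $pair;
    foreach ($samples as $s) {
        // Both forms must agree on every sample.
        if (preg_match($short, $s) !== preg_match($long, $s)) {
            $allEqual = false;
        }
    }
}

var_dump($allEqual); // bool(true)
```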
Assertions
The simple assertions are \b, \B, \A, \Z, \z, ^ and $.
Lookahead assertions
Test the text ahead of the current position.
(?=pattern)   positive lookahead
(?!pattern)   negative lookahead
\w+(?=;)
Matches a word followed by a semicolon, but the semicolon is not included in the match.
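A minimal sketch of both lookahead forms with preg_match_all. The input string is made up; the \b in the negative example is an addition of this sketch, needed so that the engine cannot backtrack to a shorter word such as "fo".

```php
<?php
// Hypothetical input: two words are followed by a semicolon, one is not.
$code = 'foo; bar baz;';

// Positive lookahead: words followed by ";", without consuming the ";".
preg_match_all('/\w+(?=;)/', $code, $followed);

// Negative lookahead: whole words NOT followed by ";".
preg_match_all('/\w+\b(?!;)/', $code, $notFollowed);

var_dump($followed[0]);    // "foo", "baz"
var_dump($notFollowed[0]); // "bar"
```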
Lookbehind assertions
Test the text behind the current position.
(?<=pattern)   positive lookbehind
(?<!pattern)   negative lookbehind
(?<!foo)bar
Finds any "bar" that is not preceded by "foo".
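As a sketch of how lookbehind behaves, the snippet below looks for "bar" with a negative and a positive lookbehind on "foo" and reports the byte offsets of the matches. The subject string is hypothetical.

```php
<?php
// Hypothetical subject: "bar" occurs at offsets 3, 10 and 14.
$subject = 'foobar bazbar bar';

// Negative lookbehind: "bar" not preceded by "foo".
preg_match_all('/(?<!foo)bar/', $subject, $neg, PREG_OFFSET_CAPTURE);
$negOffsets = array_column($neg[0], 1);

// Positive lookbehind: "bar" preceded by "foo".
preg_match_all('/(?<=foo)bar/', $subject, $pos, PREG_OFFSET_CAPTURE);
$posOffsets = array_column($pos[0], 1);

var_dump($negOffsets); // 10, 14
var_dump($posOffsets); // 3
```

In both cases the matched text is only "bar"; the lookbehind checks the preceding text without including it in the match.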
Pattern modifiers
Modifier   Description
U          Inverts the greediness of quantifiers, so that they are non-greedy by default
i          Case-insensitive matching
x          Whitespace in the pattern is ignored
s          The dot metacharacter matches all characters, including newlines. Without this modifier, the dot does not match a newline
...
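The effect of the i, s, U and x modifiers can be sketched on a small two-line string. The HTML fragment and variable names are assumptions for illustration.

```php
<?php
// Hypothetical two-line input.
$html = "<b>one</b>\n<b>two</b>";

// i: case-insensitive matching.
$ci = preg_match('/ONE/i', $html);

// s: without it the dot stops at the newline, with it the dot crosses lines.
$withoutS = preg_match('/one.*two/', $html);
$withS    = preg_match('/one.*two/s', $html);

// U: quantifiers become non-greedy by default.
preg_match('/<b>.*<\/b>/s', $html, $greedy);  // matches the whole string
preg_match('/<b>.*<\/b>/sU', $html, $lazy);   // stops at the first </b>

// x: whitespace inside the pattern is ignored.
$extended = preg_match('/ \d{4} - \d{2} /x', '2024-05');

var_dump($ci, $withoutS, $withS, $greedy[0], $lazy[0], $extended);
```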
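PCRE functions
preg_filter: Perform a regular expression search and replace
preg_grep: Return array entries that match the pattern
preg_last_error: Return the error code of the last PCRE regex execution
preg_match_all: Perform a global regular expression match
preg_match: Perform a regular expression match
preg_quote: Quote regular expression characters
preg_replace_callback_array: Perform a regular expression search and replace using callbacks
preg_replace_callback: Perform a regular expression search and replace using a callback
preg_replace: Perform a regular expression search and replace
preg_split: Split a string by a regular expression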
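A brief sketch of a few of these functions working together. All input strings and variable names are hypothetical; the calls themselves are standard preg_* functions.

```php
<?php
// preg_split: split on a comma or semicolon followed by optional whitespace.
$items = preg_split('/[,;]\s*/', 'apple, banana;cherry');

// preg_grep: keep the entries that match; original array keys are preserved.
$withAn = preg_grep('/an/', $items);

// preg_replace: replace every digit with "#".
$masked = preg_replace('/\d/', '#', 'tel 555-1234');

// preg_replace_callback: compute each replacement in a callback.
$title = preg_replace_callback('/\b[a-z]/', function ($m) {
    return strtoupper($m[0]);
}, 'hello world');

// preg_quote: escape user-supplied text so it is matched literally.
$literal = '1+1=2?';
$exact   = preg_match('/^' . preg_quote($literal, '/') . '$/', $literal);

// preg_last_error: PREG_NO_ERROR means the last call ran without an error.
$noError = (preg_last_error() === PREG_NO_ERROR);

var_dump($items, $withAn, $masked, $title, $exact, $noError);
```

Passing the delimiter as the second argument to preg_quote ensures that it is escaped too if it appears in the input.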