Since PHP 4.4.0 and 5.1.0, three additional escape sequences for matching generic character types are available when UTF-8 mode is selected. They are:
\p{xx}
A character with the xx property
\P{xx}
A character without the xx property
\X
An extended Unicode sequence
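As a minimal sketch of the two property escapes, assuming a PHP 7 or later build whose PCRE library has Unicode property support: the patterns use the u modifier to select UTF-8 mode, and the sample strings and variable names are purely illustrative.

```php
<?php
// "\u{00C9}" is LATIN CAPITAL LETTER E WITH ACUTE (general category Lu).
$eAcute = "\u{00C9}";

// \p{Lu} matches a character that has the "uppercase letter" property.
$isUpper = preg_match('/^\p{Lu}$/u', $eAcute);

// \P{Lu} matches a character that does not have that property.
$noUpperInside = preg_match('/^\P{Lu}+$/u', 'abc1');
$upperInside   = preg_match('/^\P{Lu}+$/u', 'abC');

var_dump($isUpper, $noUpperInside, $upperInside);
```

Without the u modifier the subject is treated as single bytes rather than UTF-8 characters, so the first check would not behave as intended.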
The property names represented by xx above are limited to the Unicode general category properties. Each character has exactly one such property, specified by a two-letter abbreviation. For compatibility with Perl, negation can be specified by including a circumflex (^) between the opening brace and the property name. For example, \p{^Lu} is the same as \P{Lu}.
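The Perl-compatible negation can be checked with a short sketch like the following. It assumes PHP 7 or later and UTF-8 mode (the u modifier); the list of sample characters is an arbitrary choice for illustration.

```php
<?php
// Sample characters: Lu, Ll, Nd, and a non-ASCII Lu ("\u{00C9}").
$subjects = ['A', 'a', '1', "\u{00C9}"];

$negated    = [];  // results for \p{^Lu}
$complement = [];  // results for \P{Lu}

foreach ($subjects as $s) {
    $negated[]    = preg_match('/^\p{^Lu}$/u', $s);
    $complement[] = preg_match('/^\P{Lu}$/u', $s);
}

// Both forms should agree on every subject.
var_dump($negated === $complement);
```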
If only one letter is specified with \p or \P, it includes all the properties that start with that letter. In this case, in the absence of negation, the curly braces in the escape sequence are optional; these two examples have the same effect:
\p{L}
\pL
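A small sketch of the single-letter form, assuming PHP 7 or later and the u modifier. The sample string is hypothetical and mixes a lowercase letter (Ll), an uppercase letter (Lu), a CJK ideograph (Lo), a digit (Nd) and an underscore (Pc).

```php
<?php
// "a" (Ll), "B" (Lu), U+4E2D (Lo), "9" (Nd), "_" (Pc)
$text = "aB\u{4E2D}9_";

// With braces: any character whose property starts with "L".
$bracedCount = preg_match_all('/\p{L}/u', $text, $braced);

// Without braces: intended to be the same pattern.
$bareCount = preg_match_all('/\pL/u', $text, $bare);

var_dump($bracedCount, $bareCount, $braced === $bare);
```

Only the three letters should be matched by either form; the digit and the underscore belong to other categories.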
Specifying caseless matching does not affect these escape sequences. For example, \p{Lu} always matches only uppercase letters.
Sets of Unicode characters are defined as belonging to certain scripts. A character from one of these sets can be matched using a script name. For example:
\p{Greek}
\p{Han}
Characters that are not part of an identified script are lumped together as Common.
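Script matching can be sketched as follows, assuming PHP 7 or later, the u modifier, and a PCRE build that knows these script names. The sample characters are illustrative.

```php
<?php
$alpha = "\u{03B1}";  // GREEK SMALL LETTER ALPHA
$zhong = "\u{4E2D}";  // a CJK (Han) ideograph

$alphaIsGreek = preg_match('/^\p{Greek}$/u', $alpha);
$zhongIsHan   = preg_match('/^\p{Han}$/u', $zhong);

// A Latin letter belongs to neither script.
$latinIsGreek = preg_match('/^\p{Greek}$/u', 'a');

// ASCII digits are shared across scripts and fall under Common.
$digitIsCommon = preg_match('/^\p{Common}$/u', '1');

var_dump($alphaIsGreek, $zhongIsHan, $latinIsGreek, $digitIsCommon);
```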
The \X escape matches any number of Unicode characters that form an extended Unicode sequence. \X is equivalent to (?>\PM\pM*)
That is, it matches a character without the "mark" property, followed by zero or more characters with the "mark" property, and treats the sequence as an atomic group (see below). Characters with the "mark" property are typically accents that affect the preceding character.
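The equivalence can be illustrated on a simple input, assuming PHP 7 or later and the u modifier. The subject is a hypothetical string made of "e" followed by U+0300 COMBINING GRAVE ACCENT and then a plain "a". Later PCRE releases may define \X more broadly than the atomic group shown, so this sketch only compares the two patterns on this one input.

```php
<?php
// "e" + COMBINING GRAVE ACCENT (a "mark" character), then "a".
$text = "e\u{0300}a";

// \X: one extended Unicode sequence per match.
$countX = preg_match_all('/\X/u', $text, $viaX);

// The atomic group described above: a non-mark, then any marks.
$countGroup = preg_match_all('/(?>\PM\pM*)/u', $text, $viaGroup);

// Each pattern should split the text into "e + accent" and "a".
var_dump($countX, $countGroup, $viaX === $viaGroup);
```

Counting \X matches in this way is a common approach to counting user-perceived characters rather than code points.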
Matching characters by Unicode property is not fast, because PCRE has to search a structure that contains data for over fifteen thousand characters. That is why the traditional escape sequences such as \d and \w do not use Unicode properties in PCRE.