Delimiters:
A variety of characters can be used as delimiters; the most common is the forward slash, as in /.../
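As a minimal sketch of the point above, the same pattern can be written with different delimiters. PHP's preg functions accept any delimiter that is not alphanumeric, a backslash, or whitespace; the sample string and variable names here are only illustrative.

```php
<?php
// The same pattern written with three different delimiters.
$subject = 'path/to/file';

// With "/" as the delimiter, a literal "/" inside the pattern must be escaped.
$withSlash = preg_match('/to\/file/', $subject);
// With "#" or "~" as the delimiter, the "/" needs no escaping.
$withHash  = preg_match('#to/file#', $subject);
$withTilde = preg_match('~to/file~', $subject);

// preg_match() returns 1 when the pattern matches.
var_dump($withSlash, $withHash, $withTilde);
```

Choosing a delimiter that does not occur in the pattern keeps the pattern free of extra backslashes.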
Atoms:
The smallest matching unit (placed inside the delimiters). A regular expression must contain at least one atom.
1. Printable characters (A-Z a-z 0-9 !@#$%^&*()_+ ...) and non-printing characters
2. Shorthand classes, each representing a class of characters
\d: any one digit [0-9]
\D: any one non-digit [^0-9]
\w: any one word character (a-z, A-Z, 0-9, _) [a-zA-Z0-9_]
\W: any one non-word character [^a-zA-Z0-9_]
\s: any one whitespace character [ \t\n\r\f\v]
\S: any one non-whitespace character [^ \t\n\r\f\v]
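The shorthand classes can be tried out with a short sketch like the following. The sample text is made up for illustration.

```php
<?php
// Double quotes so that \t becomes a real tab character in the sample text.
$text = "Room 42,\tfloor_3";

// \d+ collects runs of digits.
preg_match_all('/\d+/', $text, $digits);
// \w+ collects runs of word characters (letters, digits, underscore).
preg_match_all('/\w+/', $text, $words);
// \s+ matches the space and the tab, so preg_split() cuts the text there.
$parts = preg_split('/\s+/', $text);

print_r($digits[0]);
print_r($words[0]);
print_r($parts);
```

The uppercase forms \D, \W and \S select exactly the characters that the lowercase forms reject.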
3. Custom character classes (atom tables)
[fws3]: any one of the characters f, w, s, 3
[^1-9a-z]: any one character other than 1-9 and a-z
[2-9x]: any one character that is 2-9 or x
4. The dot (.) matches any one character except a newline
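A small sketch of custom classes and the dot follows. The patterns are anchored with ^ and $ (an addition made for this example) so that each test looks at exactly one character.

```php
<?php
// [2-9x] matches one character that is a digit from 2 to 9, or the letter x.
$inClass    = preg_match('/^[2-9x]$/', 'x');
$notInClass = preg_match('/^[2-9x]$/', '1');

// [^1-9a-z] matches one character outside both ranges, so "0" qualifies and "5" does not.
$negatedHit  = preg_match('/^[^1-9a-z]$/', '0');
$negatedMiss = preg_match('/^[^1-9a-z]$/', '5');

// The dot matches any single character, but not a newline by default.
$dotLetter  = preg_match('/^a.c$/', 'abc');
$dotNewline = preg_match('/^a.c$/', "a\nc");

var_dump($inClass, $notInClass, $negatedHit, $negatedMiss, $dotLetter, $dotNewline);
```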
Metacharacters:
Not used on their own; they extend and qualify atoms (written inside the delimiters)
* The atom before it may appear 0 or more times (any number of times), same as {0,}
+ The atom before it must appear 1 or more times, same as {1,}
? The atom before it may appear 0 or 1 times, same as {0,1}
{n} The atom before it must appear exactly n times
{n,m} The atom before it must appear n to m times, including n and m
{n,} The atom before it must appear at least n times, including n
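The quantifiers can be compared side by side with a sketch like this one. The helper matching() and the candidate words are hypothetical, introduced only for the illustration.

```php
<?php
$candidates = ['ggle', 'gogle', 'google', 'gooogle', 'goooogle'];

// Hypothetical helper: returns the candidates that the pattern matches in full.
function matching(string $pattern, array $candidates): array
{
    // preg_grep() keeps the original keys, so reindex the result.
    return array_values(preg_grep($pattern, $candidates));
}

$star  = matching('/^go*gle$/', $candidates);      // * : zero or more "o"
$plus  = matching('/^go+gle$/', $candidates);      // + : one or more "o"
$opt   = matching('/^go?gle$/', $candidates);      // ? : zero or one "o"
$exact = matching('/^go{2}gle$/', $candidates);    // {2}: exactly two "o"
$range = matching('/^go{2,3}gle$/', $candidates);  // {2,3}: two or three "o"
$least = matching('/^go{3,}gle$/', $candidates);   // {3,}: three or more "o"

print_r(compact('star', 'plus', 'opt', 'exact', 'range', 'least'));
```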
| Alternation ("or"): a match needs only one of the atoms on either side of it; | has the lowest priority of all
^ or \A The match must start here; written at the very front of the regular expression
$ or \z The match must end here; written at the very end of the regular expression ($ also matches just before a final newline)
\b A word boundary
\B Not a word boundary
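The following sketch shows the low priority of |, and how the anchors and word boundaries behave. The sample strings are made up.

```php
<?php
// | has the lowest priority: this reads as "^cat" OR "dog$", not "^(cat|dog)$".
$loose  = preg_match('/^cat|dog$/', 'catalog');    // matches: the subject starts with "cat"
$strict = preg_match('/^(cat|dog)$/', 'catalog');  // no match: the subject is neither word

// \b requires a word boundary; \B requires that there is none.
$wholeWord = preg_match('/\bcat\b/', 'a cat sat');
$inside    = preg_match('/\bcat\b/', 'concatenate');
$notEdge   = preg_match('/\Bcat\B/', 'concatenate');

// $ tolerates one trailing newline; \z matches only at the very end of the subject.
$dollar = preg_match('/end$/', "the end\n");
$slashZ = preg_match('/end\z/', "the end\n");

var_dump($loose, $strict, $wholeWord, $inside, $notEdge, $dollar, $slashZ);
```

Wrapping the alternatives in parentheses is the usual way to limit how far | reaches.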
Pattern modifiers (single characters):
1. A pattern modifier is written outside the delimiters, to the right of the closing one: "/go*gle/i"
2. Each modifier is a single character with a single function, and modifiers can be combined
Role:
Adjust how the regular expression is interpreted, or extend its functionality
i: case-insensitive matching
s: the atom . also matches a line break (\n)
x: whitespace in the regular expression is ignored
U: quantifiers become non-greedy by default (putting ? after a quantifier also cancels greedy mode, i.e. .*? or .+?)
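A minimal sketch of the four modifiers in action. The HTML-like sample string is an assumption made for the example.

```php
<?php
$html = "<b>one</b>\n<B>two\nlines</B>";

// i: case is ignored, so both <b> and <B> are counted.
$countI = preg_match_all('/<b>/i', $html);

// s: without it the dot stops at "\n"; with it the dot crosses line breaks.
$withoutS = preg_match('/<B>.*<\/B>/', $html);
$withS    = preg_match('/<B>.*<\/B>/s', $html);

// x: whitespace inside the pattern is ignored, so the pattern can be spaced out.
$withX = preg_match('/ \d{4} - \d{2} /x', '2014-03');

// U: quantifiers become non-greedy, the same effect as writing .*? without U.
preg_match('/<b>.*<\/b>/isU', $html, $lazyU);
preg_match('/<b>.*?<\/b>/is', $html, $lazyQ);
// Without U or ?, the greedy .* runs on to the last closing tag.
preg_match('/<b>.*<\/b>/is', $html, $greedy);

var_dump($countI, $withoutS, $withS, $withX, $lazyU[0], $lazyQ[0], $greedy[0]);
```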
Other symbols:
() (parentheses)
1. Change the priority
2. Turn several small atoms into one large atom
3. Sub-patterns: the entire expression is one large pattern, and each pair of parentheses is an independent sub-pattern
4. Back-references
$text = "2014-03-22";
$reg = '/\d{4}(-|\/)\d{2}\1\d{2}/';
\1 refers to whatever the first pair of parentheses matched, that is, the first sub-pattern
(?:xxx) makes the parentheses non-capturing, so they lose functions 3 and 4
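The back-reference can be checked with a sketch like the one below. It uses the date pattern written without spaces and adds ^ and $ anchors (an assumption for this example) so that the whole string must be a date.

```php
<?php
// \1 must repeat exactly what the first parentheses matched.
$reg = '/^\d{4}(-|\/)\d{2}\1\d{2}$/';

$dashes  = preg_match($reg, '2014-03-22', $m);  // both separators are "-"
$slashes = preg_match($reg, '2014/03/22');      // both separators are "/"
$mixed   = preg_match($reg, '2014-03/22');      // separators differ, so \1 fails

// The sub-pattern is also available to PHP: $m[1] holds the captured separator.
var_dump($dashes, $slashes, $mixed, $m[1]);

// (?:...) still groups, but captures nothing, so the next parentheses become group 1.
preg_match('/(?:-|\/)(\d{2})/', '2014-03-22', $n);
var_dump($n);
```

Non-capturing groups are useful when parentheses are needed only for priority and the captured text is not wanted.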
\ (escape character)
1. Turns characters with a special meaning into plain literal atoms: \^ \. \+ \" \?
2. Turns plain characters into atoms with a special meaning: \t \cx \f \r \v
3. In front of a character that has no meaning either way, adding \ or not makes no difference: \_ (letters such as \q behaved this way in older PCRE; PHP 7.3 and later reject unknown letter escapes)
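A short sketch of both directions of escaping. It also uses preg_quote(), a standard PHP function not mentioned in the notes above, which escapes every special character in a string for you.

```php
<?php
// Unescaped, the dot matches any character; escaped, it matches only a literal dot.
$anyChar    = preg_match('/^a.c$/', 'abc');
$literalNo  = preg_match('/^a\.c$/', 'abc');
$literalYes = preg_match('/^a\.c$/', 'a.c');

// \t gives the ordinary letter t a special meaning: it matches a tab character.
$tab = preg_match('/a\tb/', "a\tb");

// preg_quote() adds the backslashes; the second argument names the delimiter in use.
$quoted = preg_quote('1+1=2?', '/');
$exact  = preg_match('/^' . $quoted . '$/', '1+1=2?');

var_dump($anyChar, $literalNo, $literalYes, $tab, $quoted, $exact);
```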
Examples:
Match URLs
$reg = '/(https?|ftps?)\:\/\/(www|mail|bbs)\.(.+?)\.(com|cn|net)([\w\-\/\.\=\?\&\%]*)?/i';
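A usage sketch for the URL pattern, written here without spaces inside it, in lowercase, and in single quotes. The sample sentence and the example.com address are made up; the sub-patterns come back as numbered entries in $m.

```php
<?php
$reg = '/(https?|ftps?)\:\/\/(www|mail|bbs)\.(.+?)\.(com|cn|net)([\w\-\/\.\=\?\&\%]*)?/i';
$text = 'See http://www.example.com/index.php?id=5 for details';

if (preg_match($reg, $text, $m)) {
    // $m[0] is the whole match; $m[1] to $m[5] are the sub-patterns in order.
    echo "URL:    {$m[0]}\n";
    echo "Scheme: {$m[1]}\n";
    echo "Host:   {$m[2]}.{$m[3]}.{$m[4]}\n";
    echo "Path:   {$m[5]}\n";
}
```

Note that the pattern only accepts hosts that begin with www, mail or bbs and end in com, cn or net, so it is a teaching example rather than a general URL validator.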
Match an email address
$reg = '/\w+([-+.]\w+)*@\w+([-.]\w+)*/';
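A usage sketch for the email pattern, written here without spaces inside it and with the hyphen placed first in the class so that it is not read as a range. The address and surrounding sentence are invented for the example.

```php
<?php
$reg = '/\w+([-+.]\w+)*@\w+([-.]\w+)*/';
$text = 'Contact: first.last+tag@mail.example.org, thanks';

// The pattern is not anchored, so it finds the address inside a longer text.
$found = preg_match($reg, $text, $m);

if ($found) {
    echo "Address: {$m[0]}\n";
}
```

Because the pattern has no ^ and $ anchors it is suited to finding addresses in text; validating a single input field would need anchors and a stricter pattern.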