Regular expressions
A regular expression is a tool for searching and matching strings.
The regular expression functions commonly used in PHP are as follows:
preg_match($pattern, $subject) form validation, etc.
preg_match_all($pattern, $subject, array &$matches)
preg_replace($pattern, $replacement, $subject) filtering illegal words, etc.
preg_filter($pattern, $replacement, $subject)
preg_grep($pattern, array $input)
preg_split($pattern, $subject)
preg_quote($str)
$pattern = the regular expression
$subject = the target data to match against
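The calls listed above can be tried side by side. This is only a minimal sketch: the sample strings and variable names are made up for illustration, and the optional third argument of preg_match is used to see what was matched.

```php
<?php
// Hypothetical sample data, chosen only for illustration.
$subject = 'Order 66 shipped on 2024-05-17';

// preg_match stops at the first match and returns 1 (found) or 0 (not found).
$found = preg_match('/[0-9]+/', $subject, $first);    // $first[0] is '66'

// preg_match_all collects every match and returns how many there were.
$count = preg_match_all('/[0-9]+/', $subject, $all);  // $all[0] is ['66', '2024', '05', '17']

// preg_replace substitutes every match, e.g. to mask unwanted text.
$masked = preg_replace('/[0-9]/', '*', $subject);

// preg_split cuts the subject wherever the pattern matches.
$parts = preg_split('/-/', '2024-05-17');

// preg_grep keeps only the array elements that match (original keys are preserved).
$numeric = preg_grep('/^[0-9]+$/', ['12', 'ab', '7']);

// preg_quote escapes characters that have a special meaning in a pattern.
$quoted = preg_quote('a.b*c');
```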
Basic regular expression syntax: delimiters
$pattern = '/[0-9]/';
#[0-9]#
{[0-9]}
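As a quick check, the three delimiter styles shown above behave the same way. The sketch below assumes a made-up subject string; the last line shows why a delimiter other than / can be convenient.

```php
<?php
// The same pattern written with three different delimiter pairs.
$patterns = ['/[0-9]/', '#[0-9]#', '{[0-9]}'];

$results = [];
foreach ($patterns as $pattern) {
    // Each call returns 1 because 'abc5' contains a digit.
    $results[] = preg_match($pattern, 'abc5');
}

// With # as the delimiter, slashes inside the pattern need no escaping.
$hasScheme = preg_match('#^https?://#', 'http://example.com');
```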
Atoms
Visible characters: characters in the Unicode table that can be typed on a keyboard and seen with the naked eye. They include:
--Punctuation: " _ ? . and so on
--English letters and digits: A-Z, a-z, 0-9
--Chinese, Japanese, Arabic and other language characters
--Physics formula symbols
--Other visible characters
Invisible characters:
--Line feed \n
--Carriage return \r
--Tab \t
--Space
--Other invisible symbols
Quantifiers
{n} indicates that the atom in front of it appears exactly n times
{n,} indicates that the atom in front of it appears at least n times
{n,m} indicates that the atom in front of it appears at least n times and at most m times
* matches the atom in front of it 0, 1, or more times, i.e. {0,}
+ matches the atom in front of it 1 or more times, i.e. {1,}
? matches the atom in front of it 0 or 1 times, i.e. {0,1}
Boundary control
^ matches where the string starts
$ matches the position of the end of the string
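The quantifiers and the two boundary markers are easiest to see together, because ^ and $ force the pattern to cover the whole string. The test strings below are hypothetical.

```php
<?php
// Each entry is 1 when the whole string matches and 0 when it does not.
$checks = [
    'exactly3'   => preg_match('/^a{3}$/', 'aaa'),       // 1: exactly three a's
    'atLeast2'   => preg_match('/^a{2,}$/', 'a'),        // 0: fewer than two a's
    'between2_4' => preg_match('/^a{2,4}$/', 'aaaaa'),   // 0: more than four a's
    'star'       => preg_match('/^ab*$/', 'a'),          // 1: b may appear zero times
    'plus'       => preg_match('/^ab+$/', 'a'),          // 0: b must appear at least once
    'optional'   => preg_match('/^colou?r$/', 'color'),  // 1: u is optional
];

// Without ^ and $ the pattern may match anywhere inside the subject.
$unanchored = preg_match('/a{3}/', 'xxaaaxx');           // 1
```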
Pattern units
() treats everything inside the parentheses as a single atom
Pattern modifiers: matching is greedy by default; appending an uppercase U after the closing delimiter switches to lazy matching, e.g. /123456/U
Common pattern modifiers
U: lazy match
(no U): greedy match, the default
i: ignore the case of English letters
x: ignore whitespace in the pattern (line breaks, spaces, etc.)
s: let the metacharacter . match all characters, including line breaks
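A short sketch of how each modifier changes the result. The HTML fragment and the other subject strings are invented for the example.

```php
<?php
$html = '<b>one</b><b>two</b>';

// Greedy (default): .+ runs on to the last </b>.
preg_match('/<b>.+<\/b>/', $html, $greedy);    // $greedy[0] is the whole string

// U flips greediness, so .+ stops at the first </b>.
preg_match('/<b>.+<\/b>/U', $html, $lazy);     // $lazy[0] is '<b>one</b>'

// i ignores letter case.
$caseInsensitive = preg_match('/hello/i', 'HeLLo');    // 1

// x ignores whitespace written inside the pattern itself.
$extended = preg_match('/ \d+ - \d+ /x', '10-20');     // 1

// s lets . match line breaks as well.
$withoutS = preg_match('/a.b/', "a\nb");     // 0
$withS    = preg_match('/a.b/s', "a\nb");    // 1
```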
RegexPal: a regular expression debugging tool
To avoid mismatches caused by different encodings, it is recommended to convert Chinese text to its Unicode encoding first.
Metacharacters
Ways of selecting atoms
| matches any one of two or more alternatives
[] matches any one of the atoms in the square brackets
[^] matches any character except the atoms in the square brackets
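The three selection forms above, sketched with made-up inputs:

```php
<?php
// | picks one of several alternatives.
$alternation = preg_match('/^(cat|dog)$/', 'dog');   // 1

// [] matches any single character listed in the brackets.
$inSet = preg_match('/^[abc]$/', 'b');               // 1

// [^] matches any single character that is NOT listed.
$notInSet = preg_match('/^[^abc]$/', 'b');           // 0

// A character class combined with preg_replace: strip the vowels.
$vowelsRemoved = preg_replace('/[aeiou]/', '', 'regular');   // 'rglr'
```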
Metacharacters
Atom sets
. matches any character other than a line break
\d matches any decimal digit, i.e. [0-9]
\D matches any non-digit, i.e. [^0-9]
\s matches an invisible (whitespace) atom, i.e. [ \f\n\r\t\v]
\S matches a visible (non-whitespace) atom, i.e. [^ \f\n\r\t\v]
\w matches any word character (digit, letter, or underscore), i.e. [0-9a-zA-Z_]; when the "word" characters are taken from the Unicode character set it is similar but not equivalent to that class.
\W matches any non-word character. Equivalent to "[^A-Za-z0-9_]".
In Python, re.sub(r'[_|\W]', '', text) removes every non-word character and every underscore "_".
Likewise, re.sub(r'[_|\w]', 'x', text) replaces every "_" and every word character with "x".
In both patterns the | sits inside square brackets, so it is a literal character and is matched as well.
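The same clean-up can be sketched in PHP with preg_replace. The sample text is hypothetical, and the | is left out of the character class here because it would only add a literal pipe to the set.

```php
<?php
// Hypothetical input containing a tab and a punctuation mark.
$text = "id_42\tok!";

// \D removes everything that is not a digit.
$digitsOnly = preg_replace('/\D/', '', $text);           // '42'

// \s removes whitespace such as the tab.
$noWhitespace = preg_replace('/\s/', '', $text);         // 'id_42ok!'

// \W removes non-word characters but keeps the underscore.
$wordCharsOnly = preg_replace('/\W/', '', $text);        // 'id_42ok'

// [_\W] removes non-word characters and the underscore as well.
$lettersAndDigits = preg_replace('/[_\W]/', '', $text);  // 'id42ok'
```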
Common Regular Expressions
.+ non-empty
1(3|4|5|7|8)\d{9} matches a mobile number in mainland China
^\w+(\.\w+)*@\w+(\.\w+)+$ validates an email address
^(https?://)?(\w+\.)+[a-zA-Z]+$ validates a URL
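These patterns can be wrapped in small validation helpers. The function names below are hypothetical, the mobile pattern is anchored with ^ and $ on the assumption that the whole input must be a number, and # is used as the URL delimiter to avoid escaping the slashes.

```php
<?php
// Hypothetical helper: true when $s is an 11-digit mainland China mobile number.
function isMainlandMobile(string $s): bool
{
    return preg_match('/^1(3|4|5|7|8)\d{9}$/', $s) === 1;
}

// Hypothetical helper: true when $s looks like an email address.
function isEmail(string $s): bool
{
    return preg_match('/^\w+(\.\w+)*@\w+(\.\w+)+$/', $s) === 1;
}

// Hypothetical helper: true when $s looks like a URL with an optional scheme.
function isUrl(string $s): bool
{
    return preg_match('#^(https?://)?(\w+\.)+[a-zA-Z]+$#', $s) === 1;
}
```

These patterns are deliberately simple: for example, the URL pattern requires at least one dot, so a host such as localhost is rejected.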
Developing an imitation Smarty template engine
How the template engine works
Get template source Files
Compile template (regular replacement)
Output to User
This is a simple template engine that imitates Smarty. In a typical PHP system a template engine is essential: it separates the work of front-end and back-end engineers, so front-end engineers do not need to understand the back-end code.
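The three steps can be sketched as follows. This is not how Smarty itself is implemented: the function name renderTemplate and the {$name} placeholder handling are assumptions for illustration, and the placeholders are substituted directly rather than compiled into a cached PHP file.

```php
<?php
// Hypothetical minimal engine: {$name} placeholders are looked up in $vars.
function renderTemplate(string $source, array $vars): string
{
    return preg_replace_callback(
        '/\{\$(\w+)\}/',
        function (array $m) use ($vars): string {
            // Known names are HTML-escaped; unknown placeholders are left untouched.
            return array_key_exists($m[1], $vars)
                ? htmlspecialchars((string) $vars[$m[1]])
                : $m[0];
        },
        $source
    );
}

// Step 1: get the template source (inlined here instead of read from a file).
$source = '<h1>{$title}</h1><p>Hello, {$user}!</p>';

// Step 2: compile the template, i.e. replace the placeholders with a regular expression.
$output = renderTemplate($source, ['title' => 'Demo', 'user' => 'Ann']);

// Step 3: output to the user.
echo $output . PHP_EOL;
```

Because the template only contains placeholders, the front-end author never has to touch the PHP that supplies the values.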
Online Debugging Tools
http://cs.smu.ca/~porter/csc/355/regexpal/
?: is placed at the start of a group, as in (?:...), when the group's match should not be captured; this can improve program execution speed
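The effect on the matches array is easy to see in a small sketch (the date string is made up):

```php
<?php
// Two capturing groups: both parts are stored in the matches array.
preg_match('/(\d{4})-(\d{2})/', '2024-05', $captured);
// $captured is ['2024-05', '2024', '05']

// The first group is non-capturing, so only the month is stored.
preg_match('/(?:\d{4})-(\d{2})/', '2024-05', $partial);
// $partial is ['2024-05', '05']
```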
Regular Expression Basics Tutorial (PHP)