There are two sets of regular expression libraries in PHP that are similar in functionality, but have slightly different execution efficiencies:
One set is provided by the PCRE (Perl Compatible Regular Expressions) library. Its functions are named with the prefix "preg_";
The other set is the POSIX (Portable Operating System Interface of Unix) extension, which used to be a PHP default but was deprecated in PHP 5.3 and removed in PHP 7. Its functions are named with the prefix "ereg_";
In PHP, regular expressions have three uses:
Matching, which is often used to extract information from a string.
Replacing the matched text with new text.
Splitting a string into a set of smaller pieces of information.
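A minimal sketch of the three uses, written with the preg_ functions. The sample text and variable names are made up for illustration and are not part of the original article.

```php
<?php
// Illustrative sample text (an assumption, not from the article).
$text = "Order 66 shipped on 2024-01-15";

// 1. Match: extract the date parts from the string.
//    $m[0] is the whole match, $m[1]..$m[3] are the captured groups.
preg_match('/(\d{4})-(\d{2})-(\d{2})/', $text, $m);

// 2. Replace: swap the matched text for new text.
$masked = preg_replace('/\d{4}-\d{2}-\d{2}/', '[date]', $text);

// 3. Split: break the string into smaller pieces on whitespace.
$words = preg_split('/\s+/', $text);

print_r($m);
echo $masked, "\n";
print_r($words);
```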
A regular expression contains at least one atom. It is built from:
Atoms (ordinary characters, such as English letters)
Metacharacters (characters with special functions)
Pattern modifiers (which adjust the semantics of the regular expression)
Atom
A single character or digit, such as a~z, A~Z, 0~9.
A pattern unit, such as (ABC), which can be understood as one large atom composed of several atoms.
A character class (atom table), such as [ABC].
A back-reference to an earlier pattern unit, such as \\1.
Ordinary escape sequences, such as \d, \D, \w.
Escaped metacharacters, such as \*, \.
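As a rough illustration, each pattern in the sketch below is built around one kind of atom from the list. The sample strings and the $checks array are assumptions chosen for the example.

```php
<?php
// Each entry uses one kind of atom; preg_match() returns 1 on a match.
$checks = [
    'single character' => preg_match('/a/', 'cat'),
    'pattern unit'     => preg_match('/(abc)+/', 'abcabc'),
    'character class'  => preg_match('/[abc]/', 'xbz'),
    'back-reference'   => preg_match('/(\w)\1/', 'book'),  // the doubled "o"
    'escape sequence'  => preg_match('/\d/', 'room 7'),
    'escaped metachar' => preg_match('/\*/', '2*3'),       // a literal "*"
];

foreach ($checks as $kind => $matched) {
    echo $kind, ': ', $matched ? 'match' : 'no match', "\n";
}
```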
POSIX Regular Expressions
POSIX stands for Portable Operating System Interface of Unix, that is, the portable operating system interface for UNIX.
A POSIX regular expression is constructed the same way as a mathematical expression: small expressions are combined with metacharacters and operators to create larger expressions.
Metacharacters (Meta-character)
A metacharacter is a character with a special meaning that is used to construct a regular expression. To match a metacharacter literally in a regular expression, precede it with "\" to escape it.
Metacharacter   Description
*               Matches the preceding atom 0, 1, or more times
+               Matches the preceding atom 1 or more times
?               Matches the preceding atom 0 or 1 times
|               Matches one of two or more alternatives; for example, [1-9]|[a-b]|[A-Z] is true if any one of them matches
^               Matches at the start of the string; for example, ^A matches "ABSCD"
$               Matches at the end of the string; for example, v$ matches "dasdsv"
[]              Matches any one atom in the square brackets; for example, [dsadas] matches "s"
[^]             Matches any one character except the atoms in the square brackets; for example, [^dddd] matches each "a" in "aaaaa"
{m}             The preceding atom appears exactly m times
{m,n}           The preceding atom appears at least m times and at most n times (n > m)
{m,}            The preceding atom appears at least m times
()              Groups the enclosed expression into a single atom
.               Matches any character other than a newline
^ $             Used together, these two metacharacters anchor the pattern to the whole string; for example, ^abc$ matches only the string "abc"
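The quantifiers and anchors in the table can be tried out with a short sketch. It uses preg_match() because the ereg functions are no longer available in PHP 7 and later; these particular metacharacters are written the same way in both syntaxes. The helper name matches() is hypothetical.

```php
<?php
// Hypothetical helper for this sketch: true when $pattern matches $subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/^ab*c$/', 'ac'));      // * allows zero "b"
var_dump(matches('/^ab+c$/', 'ac'));      // + needs at least one "b"
var_dump(matches('/^ab?c$/', 'abbc'));    // ? allows at most one "b"
var_dump(matches('/^a{2,3}$/', 'aaa'));   // between 2 and 3 "a"
var_dump(matches('/^[^0-9]+$/', 'abc'));  // only non-digit characters
var_dump(matches('/^abc$/', 'xabc'));     // ^ and $ anchor the whole string
```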
Order of precedence in pattern matching
Order   Metacharacter   Description
1       ()              Pattern unit
2       ? * + {}        Repetition
3       ^ $             Boundary anchors
4       |               Alternation
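The effect of this ordering shows up when anchors, repetition, and alternation are mixed. The following sketch uses made-up sample strings and preg_match(), and compares patterns with and without explicit grouping.

```php
<?php
// Alternation binds last, so this reads as (^cat) OR (dog$).
$loose = preg_match('/^cat|dog$/', 'catalog');

// Grouping first makes both anchors apply to the whole alternation.
$strict = preg_match('/^(cat|dog)$/', 'catalog');

// Repetition binds tighter than concatenation: + applies to "b" only.
$single = preg_match('/^ab+$/', 'abab');

// With a pattern unit, + repeats the whole "ab".
$grouped = preg_match('/^(ab)+$/', 'abab');

var_dump($loose, $strict, $single, $grouped);
```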
POSIX regular expression functions
ereg() and eregi()
ereg_replace() and eregi_replace()
split() and spliti()
ereg() and eregi(): ereg() is the string matching function; eregi() is the case-insensitive version of ereg()
Syntax format:
if (!ereg('^[^./][^/]*$', $userfile)) // stop if the name does not match the required format
{
    die('This is an illegal filename!');
}
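Since ereg() is not available in PHP 7 and later, the same check can be sketched with preg_match(). The function name is_legal_filename() and the sample names are hypothetical.

```php
<?php
// Hypothetical helper: applies the same rule as the ereg() check above.
// The name must not start with "." or "/" and must not contain "/".
function is_legal_filename(string $name): bool
{
    // "#" is used as the delimiter so that "/" needs no escaping.
    return preg_match('#^[^./][^/]*$#', $name) === 1;
}

foreach (['report.txt', '.htaccess', 'dir/file.txt'] as $userfile) {
    echo $userfile, ': ', is_legal_filename($userfile) ? 'ok' : 'illegal', "\n";
}
```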
ereg_replace() and eregi_replace() (ignores case): replacement
string eregi_replace("regular expression", "replacement string", "subject string")
Syntax format:
$string = "This is a test";
echo str_replace(" is", " was", $string);
echo ereg_replace("( )is", "\\1was", $string); // \\1 refers to the first group
echo ereg_replace("(( )is)", "\\2was", $string); // \\2 refers to the second group
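On PHP 7 and later the same replacements can be sketched with preg_replace(), which accepts $1 and $2 as back-references in the replacement string. The variable names $first and $second are illustrative.

```php
<?php
$string = "This is a test";

// ( ) captures the space; $1 puts it back in front of "was".
$first = preg_replace('/( )is/', '$1was', $string);

// Groups are numbered by their opening parenthesis:
// group 1 is " is", group 2 is the inner space.
$second = preg_replace('/(( )is)/', '$2was', $string);

echo $first, "\n", $second, "\n";
```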
split() and spliti() (ignores case): split a string into an array with a regular expression
list(): assigns the values in an array to variables
Syntax format:
$date = "04/30/1973";
list($month, $day, $year) = split('[/.-]', $date); // list() receives the three pieces; the pattern says which characters to split on
echo "Month: $month; Day: $day; Year: $year<br />\n";
Output: Month: 04; Day: 30; Year: 1973
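A possible preg_split() version of the same example, for PHP versions where split() is no longer available. Only the function name and the pattern delimiters differ.

```php
<?php
$date = "04/30/1973";

// Split on "/", "." or "-". The "-" comes last in the class so it is literal,
// and "#" is the delimiter so "/" needs no escaping.
list($month, $day, $year) = preg_split('#[/.-]#', $date);

echo "Month: $month; Day: $day; Year: $year\n";
```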
Multiple-line matching
$rows = file('php.ini'); // read the php.ini file into an array of lines
// loop over the lines
foreach ($rows as $line)
{
    if (trim($line))
    {
        // write the settings that match successfully into the array
        if (eregi("^([a-z0-9_.]*) *=(.*)", $line, $matches)) // matched once per line in the loop
        {
            $options[$matches[1]] = trim($matches[2]);
        }
        unset($matches);
    }
}
// print the resulting settings
print_r($options);
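The same loop can be sketched with preg_match() and the i modifier in place of eregi(). To keep the example self-contained, a small inline array of sample lines stands in for file('php.ini'); those lines are assumptions. The key group also uses + instead of * so that an empty key is never stored.

```php
<?php
// Inline sample lines stand in for file('php.ini') (illustrative data).
$rows = [
    "; a comment line\n",
    "\n",
    "memory_limit = 128M\n",
    "display_errors=Off\n",
];

$options = [];
foreach ($rows as $line) {
    if (trim($line) === '') {
        continue; // skip blank lines
    }
    // Group 1 is the setting name, group 2 is everything after "=".
    if (preg_match('/^([a-z0-9_.]+) *=(.*)/i', $line, $matches)) {
        $options[$matches[1]] = trim($matches[2]);
    }
}

print_r($options);
```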
PCRE Regular Expressions
PCRE stands for Perl Compatible Regular Expressions.
In PCRE, a pattern (that is, a regular expression) is typically enclosed between two forward slashes "/", such as "/apple/".
Metacharacters (Meta-character)
Metacharacter   Description
\A              Matches at the start of the string
\Z              Matches at the end of the string
\b              Matches a word boundary: /\bis/ matches "is" at the start of a word, /is\b/ matches "is" at the end of a word, and /\bis\b/ matches "is" as a whole word
\B              Matches any position that is not a word boundary: /\Bis/ matches the "is" in the word "This"
\d              Matches a digit; equivalent to [0-9]
\D              Matches any character except a digit; equivalent to [^0-9]
\w              Matches an English letter, digit, or underscore; equivalent to [0-9a-zA-Z_]
\W              Matches any character except English letters, digits, and underscores; equivalent to [^0-9a-zA-Z_]
\s              Matches a whitespace character; equivalent to [ \f\n\r\t\v]
\S              Matches any character except whitespace; equivalent to [^ \f\n\r\t\v]
\f              Matches a form feed; equivalent to \x0c or \cL
\n              Matches a newline; equivalent to \x0a or \cJ
\r              Matches a carriage return; equivalent to \x0d or \cM
\t              Matches a tab; equivalent to \x09 or \cI
\v              Matches a vertical tab; equivalent to \x0b or \cK
\oNN            Matches a character given as an octal number
\xNN            Matches a character given as a hexadecimal number
\cC             Matches a control character
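A short sketch of \b, \B, and \D in action. The sample sentence and phone number are made up for the example.

```php
<?php
$text = "This is his list";

// \bis\b matches "is" only as a whole word.
preg_match_all('/\bis\b/', $text, $whole);

// \Bis matches "is" that does not start a word ("This", "his", "list").
preg_match_all('/\Bis/', $text, $inner);

// \D matches every non-digit, so removing it keeps only the digits.
$digits = preg_replace('/\D/', '', 'Tel: 555-0199');

echo count($whole[0]), " whole-word, ", count($inner[0]), " inner\n";
echo $digits, "\n";
```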
Pattern modifiers
i - Matches uppercase and lowercase letters alike
m - Treats the string as multiple lines, so "^" and "$" also match at the start and end of each line
s - Treats the string as a single line; the newline is treated as an ordinary character, so "." matches any character
x - Whitespace in the pattern is ignored
U - Ungreedy: matches the shortest possible string
e - Evaluates the replacement string as a PHP expression (preg_replace() only; deprecated in PHP 5.5 and removed in PHP 7)
Format: /apple/i matches "Apple", "apple", and so on, ignoring case.
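The modifiers can be compared side by side in a small sketch. The sample strings are assumptions, and the e modifier is left out because it no longer exists in PHP 7 and later.

```php
<?php
$text = "Apple\napple pie";

$i   = preg_match_all('/apple/i', $text);     // i: both "Apple" and "apple"
$m   = preg_match_all('/^apple/im', $text);   // m: ^ matches at each line start
$noS = preg_match('/Apple.apple/', $text);    // "." does not match the newline
$s   = preg_match('/Apple.apple/s', $text);   // s: "." also matches "\n"
$x   = preg_match('/ app le /x', 'apple');    // x: spaces in the pattern are ignored

preg_match('/<.+>/', '<b>bold</b>', $greedy); // default: longest match
preg_match('/<.+>/U', '<b>bold</b>', $lazy);  // U: shortest match

var_dump($i, $m, $noS, $s, $x, $greedy[0], $lazy[0]);
```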
PCRE pattern units
1 Capturing and reusing the first pattern unit
/^\d{2}([\W])\d{2}\\1\d{4}$/ matches strings such as "12-31-2006", "09/27/1996", and "86 01 4321". However, this regular expression does not match the format "12/34-5678". This is because the result "/" of the pattern "[\W]" has been stored, so when "\1" is referenced at the later position, it must also match the character "/".
Use a non-capturing pattern unit "(?:)" when you do not need to store the matching result.
For example, /(?:a|b|c)(D|E|F)\\1g/ will match "aEEg". In some regular expressions it is necessary to use a non-capturing pattern unit; otherwise the numbering of the later back-references has to change. The above example can also be written /(a|b|c)(D|E|F)\2g/.
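The difference between capturing and non-capturing units can be checked with a short sketch. The variable names are illustrative, and the patterns are the ones discussed above written as single-quoted PHP strings.

```php
<?php
// The separator captured by (\W) must repeat at \1.
$date = '/^\d{2}(\W)\d{2}\1\d{4}$/';
$same  = preg_match($date, '12-31-2006');   // same separator twice
$mixed = preg_match($date, '12/34-5678');   // "/" then "-"

// Non-capturing (?:...) leaves (D|E|F) as group 1.
preg_match('/(?:a|b|c)(D|E|F)\1g/', 'aEEg', $nonCapturing);

// With a capturing first group, (D|E|F) becomes group 2.
preg_match('/(a|b|c)(D|E|F)\2g/', 'aEEg', $capturing);

var_dump($same, $mixed);
print_r($nonCapturing);
print_r($capturing);
```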
PCRE regular expression functions
preg_match() and preg_match_all()
preg_quote()
preg_split()
preg_grep()
preg_replace()
preg_match() and preg_match_all(): regular expression matching
Syntax format:
if (preg_match("/php/i", "PHP is the Web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
preg_quote(): escapes regular expression characters
Syntax format:
$keywords = '$ for a g3/400';
$keywords = preg_quote($keywords, "/"); // the second argument is the delimiter, which is escaped as well
echo $keywords;
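A typical reason to call preg_quote() is to search for arbitrary text literally. In this sketch the $needle and $haystack values are made up for the example.

```php
<?php
// Text that contains regex metacharacters and the "/" delimiter (illustrative).
$needle = 'g3/400 (new)';
$haystack = 'Price for a g3/400 (new) unit';

// Escape the metacharacters and the delimiter before building the pattern.
$quoted = preg_quote($needle, '/');
$pattern = '/' . $quoted . '/';

echo $quoted, "\n";
var_dump(preg_match($pattern, $haystack));
```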
preg_split(): splits a string with a regular expression
preg_split() works the same way as the split() function.
Syntax format:
$keywords = preg_split("/[\s,]+/", "Hypertext language, Programming");
print_r($keywords);
preg_grep(): returns the array elements that match the pattern
Syntax format:
$fl_array = preg_grep("/^(\d+)\.\d+$/", $array);
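A runnable sketch of the call above with a made-up input array. Note that preg_grep() keeps the original array keys of the matching elements.

```php
<?php
// Illustrative input: only "1.5" and "3.14" look like digits.digits.
$array = ['1.5', '42', '3.14', 'abc', '10.'];

$fl_array = preg_grep('/^(\d+)\.\d+$/', $array);

// The matching elements keep their original keys (0 and 2 here).
print_r($fl_array);
```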
preg_replace(): performs a regular expression search and replace
Syntax format:
$string = "April 15, 2003";
$pattern = "/(\w+) (\d+), (\d+)/i";
$replacement = "\${1}1,\$3"; // ${1}1 is group 1 followed by a literal "1"
print preg_replace($pattern, $replacement, $string);
preg_match_all(): performs a global regular expression match
Syntax format:
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=left>this is a test</div>",
    $out, PREG_PATTERN_ORDER);
print $out[0][0] . ", " . $out[0][1] . "\n";
print $out[1][0] . ", " . $out[1][1] . "\n";
Output:
<b>example: </b>, <div align=left>this is a test</div>
example: , this is a test
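For comparison, the same pattern can be run with the PREG_SET_ORDER flag, which groups the results per match instead of per pattern unit. This is a sketch with illustrative variable names.

```php
<?php
$html = '<b>example: </b><div align=left>this is a test</div>';

// PREG_SET_ORDER: $sets[0] holds everything for the first match,
// $sets[1] everything for the second match, and so on.
preg_match_all('|<[^>]+>(.*)</[^>]+>|U', $html, $sets, PREG_SET_ORDER);

foreach ($sets as $set) {
    // $set[0] is the full match, $set[1] is the text between the tags.
    echo $set[0], ' => ', $set[1], "\n";
}
```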