PHP regular expression function library (two sets)

Source: Internet
Author: User

PHP has two sets of regular expression functions. They offer similar functionality but differ slightly in execution efficiency:
One set is provided by the PCRE (Perl Compatible Regular Expressions) library. Its functions use the prefix "preg_".
The other set is provided by the POSIX (Portable Operating System Interface of Unix) extended regular expression extension, which older PHP versions enabled by default. Its functions use the prefix "ereg". The ereg functions (including split()) were deprecated in PHP 5.3.0 and removed in PHP 7.0.0, so code for current PHP versions should use the preg_ functions.
In PHP, regular expressions serve three purposes:
Matching, which is often used to extract information from strings.
Replacing the matched text with new text.
Splitting a string into an array of smaller pieces.
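
The three uses listed above can be sketched with the preg_ functions. This is a minimal illustration; the sample string and the variable names are assumptions made for the example and do not come from the original article.

```php
<?php
// Assumed sample input, chosen only for illustration.
$text = 'Order 66 shipped on 2024-05-01';

// 1. Matching: extract the parts of the date.
//    $m[0] holds the whole match, $m[1]..$m[3] the captured groups.
$found = preg_match('/(\d{4})-(\d{2})-(\d{2})/', $text, $m);

// 2. Replacing: mask every run of digits with "#".
$masked = preg_replace('/\d+/', '#', $text);

// 3. Splitting: break the string into words on whitespace.
$words = preg_split('/\s+/', $text);

var_dump($found, $m, $masked, $words);
```

preg_match() returns 1 on a match and 0 otherwise, so it can be used directly in an if condition.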

A regular expression contains at least one atom. It is built from three kinds of elements:
Atoms (ordinary characters, such as English letters)
Metacharacters (characters with a special meaning)
Pattern modifiers (which adjust the semantics of the regular expression)

Atoms
A single letter or digit, such as a-z, A-Z, 0-9.
A pattern unit, such as (ABC), which can be understood as one large atom composed of several atoms.
An atom table (character class), such as [ABC].
A back-reference to an earlier pattern unit, such as \1.
Common escape sequences, such as \d, \D, \w.
Escaped metacharacters, such as \*, \.

POSIX Regular Expression
POSIX stands for Portable Operating System Interface of Unix.

A POSIX regular expression is constructed in the same way as a mathematical expression: metacharacters and operators combine small expressions into larger ones.

Metacharacters (Meta-character)
Metacharacters are special characters used to build regular expressions. To match a metacharacter literally, precede it with "\" to escape it.
Metacharacter   Description
*       Matches the preceding atom zero, one, or more times
+       Matches the preceding atom one or more times
?       Matches the preceding atom zero or one time
|       Matches any one of two or more alternatives; for example, [1-9]|[a-b]|[A-Z] succeeds if any one of the three matches
^       Matches at the beginning of a string; for example, ^a matches the start of "abscd"
$       Matches at the end of a string; for example, v$ matches the end of "dasdsv"
[]      Matches any one atom listed in the square brackets; for example, [dsadas] matches "s"
[^]     Matches any character not listed in the square brackets; for example, [^dddd] matches each "a" in "aaaaa"
{m}     The preceding atom appears exactly m times
{m,n}   The preceding atom appears at least m times and at most n times (n > m)
{m,}    The preceding atom appears at least m times
()      Groups the enclosed expression into a single atom (a pattern unit)
.       Matches any character except a line break

The two metacharacters ^ and $ are collectively referred to as anchors (boundary delimiters).
For example, ^abc$ matches only the string "abc".
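
The metacharacters above can be checked with a few small tests. The sketch below uses preg_match() because the ereg functions no longer exist in current PHP; for these particular patterns the PCRE and POSIX syntax is the same. The patterns and subjects are illustrative assumptions.

```php
<?php
// Each row: pattern, subject, and what the row demonstrates.
$cases = [
    ['/^ab*c$/',   'ac'],     // * allows zero occurrences of "b"
    ['/^ab+c$/',   'ac'],     // + requires at least one "b"
    ['/^ab?c$/',   'abbc'],   // ? allows at most one "b"
    ['/^a{2,3}$/', 'aaa'],    // {m,n}: between 2 and 3 repetitions
    ['/^[^d]+$/',  'aaaaa'],  // [^d]: any character except "d"
    ['/^abc$/',    'abcd'],   // ^...$ must cover the whole string
    ['/^(ab)+$/',  'ababab'], // () turns "ab" into a single atom
];

$results = [];
foreach ($cases as [$pattern, $subject]) {
    // preg_match() returns 1 for a match and 0 for no match.
    $results[] = preg_match($pattern, $subject);
    printf("%-12s %-8s %s\n", $pattern, $subject, end($results) ? 'match' : 'no match');
}
```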

Order of pattern matching (highest precedence first)
Order   Metacharacters   Description
1       ()               Pattern unit
2       ? * + {}         Repetition
3       ^ $              Boundary anchors
4       |                Alternation
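
The effect of this precedence order can be seen in a short sketch, again written with preg_match(). The words used as subjects are arbitrary examples, not taken from the original article.

```php
<?php
// Alternation binds most loosely, so /^cat|dog$/ means (^cat) OR (dog$).
$loose = preg_match('/^cat|dog$/', 'catalog');    // "catalog" starts with "cat"

// Grouping is applied first, so the anchors now cover both alternatives.
$strict = preg_match('/^(cat|dog)$/', 'catalog'); // the whole string must be "cat" or "dog"

// Repetition applies to the single preceding atom: only "b" is repeated here.
$repeatB = preg_match('/^ab+$/', 'abab');

// With a pattern unit, the whole "ab" is repeated.
$repeatAb = preg_match('/^(ab)+$/', 'abab');

var_dump($loose, $strict, $repeatB, $repeatAb);
```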

POSIX Regular Expression Functions
ereg() and eregi()
ereg_replace() and eregi_replace()
split() and spliti()

ereg() and eregi(): ereg() is the string matching function; eregi() is the case-insensitive version of ereg().
Syntax format:
if (!ereg('^[^./][^/]*$', $userfile)) // stop with an error if the name does not match the expected format
{
    die('This is an invalid file name!');
}

ereg_replace() and eregi_replace() (case-insensitive): replacement
string eregi_replace(string pattern, string replacement, string subject)
Syntax format:
$string = "This is a test";
echo str_replace(" is", " was", $string);
echo ereg_replace("( )is", "\\1was", $string);   // \\1 refers to the text matched by the first parenthesized unit
echo ereg_replace("(( )is)", "\\2was", $string); // \\2 refers to the text matched by the second parenthesized unit

split() and spliti() (case-insensitive): split a string into an array using a regular expression
list(): assigns the values of an array to a list of variables.
Syntax format:
$date = "04/30/1973";
// The separator may be a slash, a dot, or a hyphen; the three pieces are assigned to the three variables.
list($month, $day, $year) = split('[/.-]', $date);
echo "Month: $month; Day: $day; Year: $year<br />\n";
Output result: Month: 04; Day: 30; Year: 1973
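
Because split() is not available in PHP 7.0 and later, the same task can be written with preg_split(). The following is a sketch of one possible equivalent; the choice of "#" as the pattern delimiter is an assumption made so that "/" does not need escaping.

```php
<?php
$date = '04/30/1973';

// "-" is placed last inside the brackets so it is read as a literal
// hyphen rather than as a range operator.
list($month, $day, $year) = preg_split('#[/.-]#', $date);

echo "Month: $month; Day: $day; Year: $year\n";
```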

Multi-line matching

$rows = file('php.ini'); // read the php.ini file into an array, one line per element

// Loop over the lines
foreach ($rows as $line)
{
    if (trim($line))
    {
        // Store each matched "name = value" setting in the array
        if (eregi("^([a-z0-9_.]*) *=(.*)", $line, $matches)) // applied to every line in turn
        {
            $options[$matches[1]] = trim($matches[2]);
        }
        unset($matches);
    }
}

// Output the collected settings
print_r($options);
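
On current PHP versions the same loop can be written with preg_match() and the "i" modifier in place of eregi(). The sketch below uses a small in-memory array instead of reading php.ini, so it runs anywhere; the sample lines are hypothetical.

```php
<?php
// Hypothetical sample lines standing in for file('php.ini').
$rows = [
    "; a comment line\n",
    "\n",
    "memory_limit = 128M\n",
    "Display_Errors=Off\n",
];

$options = [];
foreach ($rows as $line) {
    if (trim($line) === '') {
        continue; // skip blank lines
    }
    // Same pattern as the eregi() call, with delimiters and the "i" modifier added.
    if (preg_match('/^([a-z0-9_.]*) *=(.*)/i', $line, $matches)) {
        $options[$matches[1]] = trim($matches[2]);
    }
}

print_r($options);
```

The comment line is skipped because it contains no "=", so the pattern cannot match it.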

PCRE Regular Expression
PCRE stands for Perl Compatible Regular Expressions.
In PCRE, a pattern (that is, a regular expression) is usually enclosed between two forward slashes "/", such as "/apple/".

Metacharacters (Meta-character)
Metacharacter   Description
\A      Matches at the beginning of a string
\Z      Matches at the end of a string
\b      Matches a word boundary: /\bis/ matches "is" at the start of a word, /is\b/ matches "is" at the end of a word, /\bis\b/ matches "is" as a whole word
\B      Matches any position that is not a word boundary: /\Bis/ matches the "is" in the word "This"

\d      Matches a digit. Equivalent to [0-9]
\D      Matches any character except a digit. Equivalent to [^0-9]
\w      Matches an English letter, digit, or underscore. Equivalent to [0-9a-zA-Z_]
\W      Matches any character except English letters, digits, and underscores. Equivalent to [^0-9a-zA-Z_]
\s      Matches a whitespace character. Equivalent to [ \f\n\r\t\v]
\S      Matches any character except whitespace. Equivalent to [^ \f\n\r\t\v]
\f      Matches a form feed. Equivalent to \x0c or \cL
\n      Matches a line feed. Equivalent to \x0a or \cJ
\r      Matches a carriage return. Equivalent to \x0d or \cM
\t      Matches a tab. Equivalent to \x09 or \cI
\v      Matches a vertical tab. Equivalent to \x0b or \cK
\oNN    Matches the character with the given octal code
\xNN    Matches the character with the given hexadecimal code
\cC     Matches a control character
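
The difference between \b and \B is easy to get wrong, so here is a small sketch that counts matches in one sentence. The sentence is an assumed example that extends the "This" case from the table.

```php
<?php
$subject = 'This is an island';

// \bis\b: "is" only as a whole word (the second word of the sentence).
$whole = preg_match_all('/\bis\b/', $subject);

// \bis: "is" at the start of a word ("is" and "island").
$start = preg_match_all('/\bis/', $subject);

// \Bis: "is" that does not start at a word boundary (inside "This").
// PREG_OFFSET_CAPTURE also records the byte offset of each match.
$inner = preg_match_all('/\Bis/', $subject, $m, PREG_OFFSET_CAPTURE);
$innerOffset = $m[0][0][1];

var_dump($whole, $start, $inner, $innerOffset);
```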

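Pattern Modifiers
i - matches uppercase and lowercase letters alike (case-insensitive)
m - treats the string as multiple lines, so ^ and $ also match at the start and end of each line
s - treats the string as a single line; line breaks are treated as ordinary characters, so "." matches any character
x - whitespace in the pattern is ignored
U - ungreedy: matches the shortest possible string
e - evaluates the replacement string as a PHP expression (preg_replace() only; deprecated in PHP 5.5.0 and removed in PHP 7.0.0)
Format: /apple/i matches "apple", "Apple", and so on; the trailing i makes the match case-insensitive.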
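
The following sketch shows the effect of several modifiers side by side. The sample strings are assumptions chosen so that each modifier changes the result.

```php
<?php
$text = "Apple\napple pie";

// i: case-insensitive, so both "Apple" and "apple" are counted.
$i = preg_match_all('/apple/i', $text);

// m: ^ matches at the start of every line, not only at the start of the string.
$withM    = preg_match_all('/^apple/im', $text);
$withoutM = preg_match_all('/^apple/i', $text);

// s: "." also matches the line break between the two lines.
$withS    = preg_match('/Apple.apple/s', $text);
$withoutS = preg_match('/Apple.apple/', $text);

// x: whitespace inside the pattern is ignored.
$x = preg_match('/a p p l e/x', $text);

// U: quantifiers become ungreedy and stop at the first possible ">".
preg_match('/<.+>/', '<b>x</b>', $greedy);
preg_match('/<.+>/U', '<b>x</b>', $ungreedy);

var_dump($i, $withM, $withoutM, $withS, $withoutS, $x, $greedy[0], $ungreedy[0]);
```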

PCRE Pattern Unit
\1 refers back to the text captured by the first pattern unit.
/^\d{2}([\W])\d{2}\1\d{4}$/ matches strings such as "12-31-2006", "09/27/1996", and "86 01 4321". However, it does not match "12/34-5678". This is because the result "/" of the unit "([\W])" has been stored, so when "\1" refers to it at the next position, only the same character "/" is accepted there.

When the matched text does not need to be stored, use the non-capturing pattern unit "(?:)".
For example, /(?:a|b|c)(D|E|F)\1g/ matches "aEEg". In some regular expressions non-capturing units are necessary; otherwise the numbers of the later back-references have to change. The preceding example can also be written as /(a|b|c)(D|E|F)\2g/.
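
The claims in this section can be verified with a short script. It is a sketch that reuses the patterns and sample strings from the text; only the variable names are new.

```php
<?php
// \1 must repeat whatever separator the first unit ([\W]) captured.
$datePattern = '/^\d{2}([\W])\d{2}\1\d{4}$/';
$results = [];
foreach (['12-31-2006', '09/27/1996', '86 01 4321', '12/34-5678'] as $candidate) {
    $results[$candidate] = preg_match($datePattern, $candidate);
}
print_r($results);

// With (?:...) the first parentheses are not stored, so (D|E|F) is unit 1.
preg_match('/(?:a|b|c)(D|E|F)\1g/', 'aEEg', $nonCapturing);

// Without (?:...) both units are stored, so (D|E|F) becomes unit 2.
preg_match('/(a|b|c)(D|E|F)\2g/', 'aEEg', $capturing);

print_r($nonCapturing);
print_r($capturing);
```

The non-capturing version returns one fewer element in its match array, which is the practical difference between the two forms.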

PCRE Regular Expression Functions
preg_match() and preg_match_all()
preg_quote()
preg_split()
preg_grep()
preg_replace()

preg_match() and preg_match_all(): regular expression matching
Syntax format:
if (preg_match("/php/i", "PHP is the web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}

preg_quote(): escape regular expression characters
Syntax format:
$keywords = '$40 for a g3/400';
$keywords = preg_quote($keywords, "/"); // escape the special characters; the second argument names the delimiter "/" to escape as well
echo $keywords;
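
A typical reason to call preg_quote() is to search for literal text that happens to contain metacharacters. The sketch below builds a pattern from the quoted string; the sentence being searched is a made-up example.

```php
<?php
$keywords = '$40 for a g3/400';

// Escape regex metacharacters and the "/" delimiter.
$quoted = preg_quote($keywords, '/');
echo $quoted, "\n"; // prints: \$40 for a g3\/400

// The quoted text can now be embedded safely between "/" delimiters.
$haystack = 'Special offer: $40 for a g3/400 today only'; // hypothetical text
$hit = preg_match('/' . $quoted . '/', $haystack);

var_dump($hit);
```

Without the quoting step, the "$" would act as an end-of-string anchor and the "/" would end the pattern early.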

preg_split(): split a string using a regular expression
preg_split() behaves like split(), but takes a PCRE pattern.
Syntax format:
$keywords = preg_split("/[\s,]+/", "hypertext language, programming");
print_r($keywords);

preg_grep(): returns the array elements that match the pattern
Syntax format:
$fl_array = preg_grep("/^(\d+)?\.\d+$/", $array); // keep only the elements that are floating point numbers
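
Here is a runnable sketch of preg_grep() with the same pattern. The input array is an assumption for illustration. Note that preg_grep() preserves the original keys of the matching elements.

```php
<?php
// Hypothetical input values.
$array = ['1.5', 'abc', '.75', '42', '3.14'];

// Keep the elements that look like floating point numbers.
$fl_array = preg_grep('/^(\d+)?\.\d+$/', $array);

// PREG_GREP_INVERT returns the elements that do NOT match instead.
$others = preg_grep('/^(\d+)?\.\d+$/', $array, PREG_GREP_INVERT);

print_r($fl_array);
print_r($others);
```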

preg_replace(): search and replace using a regular expression
Syntax format:
$string = "April 15, 2003";
$pattern = "/(\w+) (\d+), (\d+)/i";
$replacement = "\${1}1,\$3"; // ${1} followed by a literal "1"; the braces keep it from being read as $11
print preg_replace($pattern, $replacement, $string);
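
When the replacement has to be computed rather than written as a fixed string, preg_replace_callback() can be used; it is also the usual substitute for the "e" modifier, which no longer exists in PHP 7.0 and later. The sketch below is one possible use, and the transformation it applies (uppercasing the month and adding one to the day) is an arbitrary assumption.

```php
<?php
$string = 'April 15, 2003';

// The callback receives the match array: $m[0] is the whole match,
// $m[1]..$m[3] are the three captured units.
$result = preg_replace_callback(
    '/(\w+) (\d+), (\d+)/',
    function (array $m): string {
        return strtoupper($m[1]) . ' ' . ((int) $m[2] + 1) . ', ' . $m[3];
    },
    $string
);

echo $result, "\n";
```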

preg_match_all(): global regular expression matching
Syntax format:
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>example: </b><div align=left>this is a test</div>",
    $out, PREG_PATTERN_ORDER);
print $out[0][0] . ", " . $out[0][1] . "\n";
print $out[1][0] . ", " . $out[1][1] . "\n";

Output result:
<b>example: </b>, <div align=left>this is a test</div>
example: , this is a test
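
With PREG_PATTERN_ORDER, $out[0] holds all full matches and $out[1] holds all texts captured by the first unit. The alternative flag PREG_SET_ORDER groups the results per match instead, which is often easier to loop over. A minimal sketch using the same pattern and input:

```php
<?php
$html = '<b>example: </b><div align=left>this is a test</div>';

// Each element of $sets is one match: [0] is the full match, [1] the first unit.
$count = preg_match_all('|<[^>]+>(.*)</[^>]+>|U', $html, $sets, PREG_SET_ORDER);

foreach ($sets as $set) {
    echo $set[0], ' => ', $set[1], "\n";
}
```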
