PHP regular expression syntax summary

Source: Internet
Author: User
Tags character classes email string ereg php regular expression preg
This article summarizes PHP regular expression syntax.

Used well, regular expressions in PHP often get twice the result with half the effort. Read on for more about regular expressions.
First, let's take a look at two special characters: '^' and '$'. They match the start and the end of a string, respectively.
"^The": matches strings starting with "The";
"of despair$": matches strings ending with "of despair";
"^abc$": matches strings that start and end with "abc". In fact, only "abc" itself matches;
"notice": matches any string containing "notice".
As the last example shows, if you use neither of the two characters, the pattern (regular expression) can appear anywhere in the string being tested; you have not anchored it to either end.
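The anchoring rules above can be tried out with a short sketch. It uses preg_match() with "/" delimiters rather than ereg(), because the ereg functions were removed in PHP 7; the sample strings and the helper name matching() are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical sample strings, chosen only to exercise the four patterns.
$samples = ['The end', 'In the end', 'a cry of despair', 'abc', 'abcabc', 'take notice now'];
$patterns = ['/^The/', '/of despair$/', '/^abc$/', '/notice/'];

// Returns the subjects that the pattern matches (preg_match returns 1 on a match).
function matching(string $pattern, array $subjects): array
{
    return array_values(array_filter(
        $subjects,
        fn (string $s): bool => preg_match($pattern, $s) === 1
    ));
}

foreach ($patterns as $p) {
    echo $p, ' => ', implode(' | ', matching($p, $samples)), "\n";
}
```

Note that "abcabc" starts and ends with "abc" as text, yet "^abc$" rejects it, because the pattern describes the whole string.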
Next come three characters, '*', '+', and '?', which indicate how many times a character or sequence of characters may occur. They mean "zero or more", "one or more", and "zero or one", respectively. Here are some examples:
"ab*": matches a string that has an a followed by zero or more b ("a", "ab", "abbb", etc.);
"ab+": same as above, but with at least one b ("ab", "abbb", etc.);
"ab?": an a followed by zero or one b;
"a?b+$": matches a string ending with zero or one a followed by one or more b.
You can also limit the number of occurrences with braces, such as:
"ab{2}": an a followed by exactly two b ("abb");
"ab{2,}": at least two b ("abb", "abbbb", etc.);
"ab{3,5}": three to five b ("abbb", "abbbb", or "abbbbb").
Note that you must always specify the first number of a range (i.e., "{0,2}", not "{,2}"). Also note that '*', '+', and '?' are equivalent to the bounds "{0,}", "{1,}", and "{0,1}", respectively.
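As a minimal sketch of the quantifiers above, the following table checks each pattern against a few hand-picked strings. The patterns are wrapped in "^...$" (an assumption made here so that each one must describe the whole string), and preg_match() stands in for ereg(), which no longer exists in PHP 7 and later.

```php
<?php
// pattern => [subject => whether a match is expected]
$checks = [
    '/^ab*$/'     => ['a' => true,    'abbb' => true,  'b' => false],
    '/^ab+$/'     => ['a' => false,   'ab' => true,    'abbb' => true],
    '/^ab?$/'     => ['a' => true,    'ab' => true,    'abb' => false],
    '/^ab{2}$/'   => ['ab' => false,  'abb' => true,   'abbb' => false],
    '/^ab{2,}$/'  => ['ab' => false,  'abb' => true,   'abbbb' => true],
    '/^ab{3,5}$/' => ['abb' => false, 'abbb' => true,  'abbbbbb' => false],
];

$allAsExpected = true;
foreach ($checks as $pattern => $cases) {
    foreach ($cases as $subject => $expected) {
        $actual = preg_match($pattern, $subject) === 1;
        $allAsExpected = $allAsExpected && ($actual === $expected);
        printf("%-12s %-8s %s\n", $pattern, $subject, $actual ? 'match' : 'no match');
    }
}
```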
To quantify a sequence of characters, put it in parentheses, for example:
"a(bc)*": matches an a followed by zero or more "bc";
"a(bc){1,5}": one to five "bc".
There is also the character '|', which works as an OR operator:
"hi|hello": matches a string containing "hi" or "hello";
"(b|cd)ef": matches a string containing "bef" or "cdef";
"(a|b)*c": matches a string containing any number (including zero) of a or b, followed by a c.
A period ('.') stands for any single character:
"a.[0-9]": a string containing an a followed by any one character and a digit (any string containing such a sequence matches; this qualification is omitted from here on);
"^.{3}$": a string of exactly three characters.
A bracket expression matches a single character from the set it encloses:
"[ab]": matches a or b (same as "a|b");
"[a-d]": matches a single character from 'a' to 'd' (same effect as "a|b|c|d" and "[abcd]");
"^[a-zA-Z]": matches a string starting with a letter;
"[0-9]%": matches a string containing a digit followed by a percent sign;
",[a-zA-Z0-9]$": matches a string ending with a comma followed by a digit or letter.
You can also list the characters you don't want inside the brackets; just put '^' as the first character inside them (i.e., "%[^a-zA-Z]%" matches a string with a non-letter character between two percent signs).
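Grouping, alternation, the period, and bracket expressions can be checked the same way. The sketch below is only an illustration: the subjects are hypothetical, some patterns are anchored with "^...$" to make the expected result unambiguous, and preg_match() is used because ereg() is not available in current PHP.

```php
<?php
// Each entry: [pattern, subject, expected preg_match() result].
$cases = [
    ['/^a(bc)*$/',      'abcbc',       1], // a followed by zero or more "bc"
    ['/^a(bc){1,5}$/',  'a',           0], // needs at least one "bc"
    ['/hi|hello/',      'say hello',   1], // either alternative may appear
    ['/(b|cd)ef/',      'abcdef',      1], // contains "cdef"
    ['/^(a|b)*c$/',     'abbac',       1], // any mix of a and b, then c
    ['/^.{3}$/',        'abcd',        0], // four characters, not three
    ['/^[a-zA-Z]/',     '9 lives',     0], // starts with a digit
    ['/[0-9]%/',        'up 5% today', 1], // digit followed by a percent sign
    ['/,[a-zA-Z0-9]$/', 'a,b',         1], // ends in comma plus alphanumeric
    ['/%[^a-zA-Z]%/',   '%7%',         1], // non-letter between percent signs
    ['/%[^a-zA-Z]%/',   '%x%',         0], // a letter is rejected
];

$results = [];
foreach ($cases as [$pattern, $subject, $expected]) {
    $result = preg_match($pattern, $subject);
    $results[] = $result;
    printf("%-18s %-12s %d\n", $pattern, $subject, $result);
}
```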
To be taken literally, the characters "^.[$()|*+?{\" must be preceded by a backslash ('\'), because they have special meaning. On top of that, in PHP 3 you must also escape the backslash itself inside the string, so, for example, the regular expression "(\$|¥)[0-9]+" is called as ereg("(\\$|¥)[0-9]+", $str) (I don't know whether PHP 4 is the same).
Do not forget that bracket expressions are an exception to this rule: inside brackets, all special characters, including the backslash ('\'), lose their special meaning (i.e., "[*\+?{}.]" matches a string containing any of those characters). Also, as the regex manual tells us: "If the list contains ']', it is best to make it the first character in the list (possibly following '^'). If it contains '-', it is best to put it at the beginning or the end, or as the second endpoint of a range" (i.e., the middle '-' in "[a-d-0-9]" is taken literally). This describes the POSIX syntax used by ereg(); in the preg functions the backslash remains an escape character inside brackets.
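Escaping by hand is easy to get wrong, so here is a small sketch of both approaches with the preg functions. preg_quote() is a standard PCRE helper that the article does not mention; the price string is a hypothetical example.

```php
<?php
// Manual escaping: "$" and "." are special, so both get a backslash.
$manual = preg_match('/\$5\.00/', 'Price: $5.00');

// Without escaping, "." matches any character, so "5x00" also matches.
$unescaped = preg_match('/5.00/', '5x00');

// preg_quote() escapes every special character in a literal string;
// the second argument also escapes the chosen delimiter.
$literal = '$5.00 (approx)';
$quoted  = preg_quote($literal, '/');
$found   = preg_match('/' . $quoted . '/', 'Price: $5.00 (approx)');

echo $quoted, "\n";
```

Single-quoted PHP strings are used here so that the backslashes reach the regex engine unchanged.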
For completeness, I should cover collating sequences, character classes, and equivalence classes, but I do not want to go into detail on them, and this article does not need them. You can find more information in the regex man pages.
How to build a pattern to match currency input
Now let's use what we have learned to do something useful: build a pattern that checks whether the input is a number representing an amount of money. We assume there are four ways to write such an amount: "10000.00" and "10,000.00", or, without the decimal part, "10000" and "10,000". Let's start building the pattern:
^[1-9][0-9]*$
This requires the value to start with a digit other than 0, which also means that a single "0" fails the test. Here is a solution:
^(0|[1-9][0-9]*)$
This says "only 0, or a number that does not start with 0, matches". We can also allow a minus sign in front of the number:
^(0|-?[1-9][0-9]*)$
This says "0, or a number not starting with 0 that may have a minus sign in front". Now let's be less strict and allow a leading 0. Let's also drop the minus sign, because we don't need it to represent money. We now extend the pattern to match the fractional part:
^[0-9]+(\.[0-9]+)?$
This implies that the matched string must start with at least one digit. Note that with this pattern "10." does not match; only "10" and "10.2" do. (Do you know why?)
^[0-9]+(\.[0-9]{2})?$
Here we require exactly two decimal places. If you think this is too strict, you can change it to:
^[0-9]+(\.[0-9]{1,2})?$
This allows one or two decimal places. Now we add commas (every three digits) to improve readability, which can be expressed as follows:
^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{1,2})?$
Do not forget that the '+' can be replaced by '*' if you also want to accept empty strings (why?). Also, do not forget to escape the backslash ('\') in PHP strings (a common mistake). Once the string is validated, we can remove all the commas with str_replace(",", "", $money), cast the result to double, and use it in calculations.
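The validate-then-convert steps above can be sketched as one function. Two things here are assumptions rather than part of the article: the helper name parseMoney(), and the extra alternative "[0-9]+" in the pattern, which is added because the comma pattern on its own rejects comma-free input such as "10000". preg_match() replaces ereg(), which was removed in PHP 7.

```php
<?php
// Hypothetical helper: returns the amount as a float, or null if the
// input is not a valid money string.
function parseMoney(string $money): ?float
{
    // Either plain digits, or groups of three separated by commas,
    // followed by an optional fraction of one or two digits.
    $pattern = '/^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?$/';
    if (preg_match($pattern, $money) !== 1) {
        return null;
    }
    // Strip the commas, then cast so the value can be used in arithmetic.
    return (float) str_replace(',', '', $money);
}

foreach (['10000', '10,000', '10,000.00', '10.', '1,00'] as $input) {
    var_dump(parseMoney($input));
}
```

The single-quoted pattern keeps "\." intact, which avoids the backslash mistake mentioned above.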
Constructing an email regular expression
Okay, let's move on to verifying an email address. A complete email address has three parts: the POP3 user name (everything to the left of the '@'), the '@', and the server name (the rest). The user name can contain uppercase and lowercase letters, digits, periods ('.'), minus signs ('-'), and underscores ('_'). The server name follows the same rule, except that underscores are not allowed.
The user name cannot start or end with a period, and the same is true for the server name. Two periods cannot be consecutive either; there must be at least one character between them. Now let's see how to write a pattern for the user name:
^[_a-zA-Z0-9-]+$
This does not allow a period yet. We can add it as follows:
^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*$
This means: "at least one valid character (anything except a period), followed by zero or more groups that start with a period."
To simplify it, we can use eregi() instead of ereg(). eregi() is case-insensitive, so we don't need to specify both ranges "a-z" and "A-Z"; one is enough:
^[_a-z0-9-]+(\.[_a-z0-9-]+)*$
The server name is the same, but without the underscore:
^[a-z0-9-]+(\.[a-z0-9-]+)*$
Done. Now we only need to join the two parts with '@':
^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$
This is the complete email validation pattern. You only need to call
eregi('^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$', $email)
to check whether a string is an email address.
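On PHP 7 and later, eregi() is no longer available, so the same pattern can be tried with preg_match() and the "i" modifier. This is a minimal sketch; the function name looksLikeEmail() and the sample addresses are hypothetical.

```php
<?php
// Hypothetical helper wrapping the pattern built above.
function looksLikeEmail(string $email): bool
{
    // "/" delimiters plus the "i" modifier replace eregi()'s
    // built-in case insensitivity.
    $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$/i';
    return preg_match($pattern, $email) === 1;
}

foreach (['John.Doe@example.com', '.john@example.com', 'john..doe@example.com', 'john@'] as $candidate) {
    printf("%-24s %s\n", $candidate, looksLikeEmail($candidate) ? 'ok' : 'rejected');
}
```

The pattern checks only the shape of the address. It also accepts a server name without any period, such as "john@example", so it is a format check rather than proof that the address exists.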
Other uses of regular expressions
Extracting strings
ereg() and eregi() have a feature that lets you extract parts of a string with a regular expression (see the manual for details). For example, to extract the file name from a path or URL, the following code is what you need:
ereg("([^\\/]*)$", $pathOrUrl, $regs);
echo $regs[1];
Advanced replacement
ereg_replace() and eregi_replace() are also very useful. Suppose we want to replace every run of whitespace between words with a comma:
ereg_replace("[ \n\r\t]+", ",", trim($str));
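Both techniques carry over to the preg functions, which are the ones available in current PHP. The sketch below assumes a hypothetical path and a hypothetical input string; "#" is chosen as the delimiter so that "/" needs no escaping.

```php
<?php
// Extraction: capture everything after the last "/" of a path or URL.
$pathOrUrl = '/var/www/html/index.php';   // hypothetical path
preg_match('#([^/]*)$#', $pathOrUrl, $regs);
$fileName = $regs[1];
echo $fileName, "\n";

// Replacement: turn each run of whitespace between words into one comma.
$str = "  red green\tblue\nblack ";       // hypothetical input
$csv = preg_replace('/[ \n\r\t]+/', ',', trim($str));
echo $csv, "\n";
```

trim() runs first so that leading and trailing whitespace does not produce commas at the ends.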
PHP is widely used for back-end web (CGI) development. It usually produces some result after receiving user data. However, if the input is wrong, problems follow, for example, someone's birthday entered as "February 30"! How can we check whether such input is valid? PHP supports regular expressions so that data can be matched easily.
2. What is a regular expression:
To put it simply, a regular expression is a powerful tool for pattern matching and replacement. You can find regular expressions in scripting languages such as Perl and PHP. Client-side JavaScript also supports them. Regular expressions have become a common concept and tool, widely used by technical people of all kinds.
On a Linux website there is this saying: "If you ask Linux fans what they like most, they may answer regular expressions; if you ask them what they fear most, they will definitely say regular expressions."
As this suggests, regular expressions look complicated and intimidating, and most PHP beginners skip them and move on. That is a pity, because regular expressions in PHP let you find strings that match a pattern, check whether a string meets a condition, replace matching strings with a specified string, and more.
3. Basic syntax of regular expressions:
A regular expression is divided into three parts: delimiter, expression, and modifier.
The delimiter can be any non-alphanumeric, non-whitespace character other than the backslash (such as "/" or "!"); the most commonly used delimiter is "/". The expression is composed of special characters (see below) and ordinary strings; for example, "[a-z0-9_-]+@[a-z0-9_.-]+" matches a simple email string. Modifiers turn a feature or mode on or off. The following is a complete regular expression:
/hello.+?hello/is
In the regular expression above, the "/" characters are the delimiters, the text between the two "/" is the expression, and the string "is" after the second "/" is the modifier.
If the expression contains the delimiter, you need to escape it with "\", as in "/hello.+?\/hello/is". Besides delimiters, the escape character is also used for special characters: every special character made of a letter needs "\" in front of it; for example, "\d" represents any digit.
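The three parts can be seen at work in a short sketch. The subject strings are hypothetical; the point is only to show what the "is" modifiers, an escaped delimiter, and an alternative delimiter change.

```php
<?php
$text = "HELLO\nworld, hello again";   // hypothetical subject

// "i" ignores case and "s" lets "." cross the line break.
$withModifiers = preg_match('/hello.+?hello/is', $text, $m);

// Without the modifiers there is only one lowercase "hello", so no match.
$withoutModifiers = preg_match('/hello.+?hello/', $text);

// A "/" inside the expression must be escaped, or another delimiter chosen.
$escapedDelimiter = preg_match('/a\/b/', 'a/b');
$otherDelimiter   = preg_match('#a/b#', 'a/b');

// "\d" is a letter-based special character and needs the backslash.
preg_match('/\d+/', 'room 101', $d);

var_dump($withModifiers, $withoutModifiers, $d[0]);
```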
4. Special characters in regular expressions:
Special characters in regular expressions can be divided into metacharacters and positioning characters.
A metacharacter is a special character that describes how its leading character (that is, the character before the metacharacter) appears in the matched object. Metacharacters are single characters, but different or identical metacharacters can be combined to form larger ones.
Braces: braces specify exactly how many times the preceding character occurs. For example, "/pre{1,5}/" matches "pre", "pree", and "preeeee", that is, one to five "e" after "pr". To allow zero to five "e", write "/pre{0,5}/" and state the lower bound explicitly, because not every PCRE version treats "{,5}" as a quantifier.
Plus sign: "+" matches the preceding character one or more times. For example, "/ac+/" matches "act", "account", and "acccc", that is, strings with one or more "c" after "a". "+" is equivalent to "{1,}".
Asterisk: "*" matches the preceding character zero or more times. For example, "/ac*/" matches "app", "acp", and "accp", that is, strings with zero or more "c" after "a". "*" is equivalent to "{0,}".
Question mark: "?" matches the preceding character zero or one time. For example, "/ac?/" matches "a", "acp", and "acwp", that is, strings with zero or one "c" after "a". "?" also has another very important role in regular expressions: controlling "greedy mode".
Two more important special characters are "[]". They match any one of the characters inside the brackets; for example, "/[az]/" matches a single "a" or "z", and if you change the expression to "/[a-z]/", it matches any single lowercase letter, such as "a" or "b".
If "^" appears first inside "[]", the expression matches any character not listed in the brackets; for example, "/[^a-z]/" matches any character that is not a lowercase letter. Regular expressions also provide the following predefined classes, which are used inside brackets, as in "/[[:digit:]]/":
[:alpha:]: matches any letter
[:alnum:]: matches any letter or digit
[:digit:]: matches any digit
[:space:]: matches any whitespace character
[:upper:]: matches any uppercase letter
[:lower:]: matches any lowercase letter
[:punct:]: matches any punctuation mark
[:xdigit:]: matches any hexadecimal digit
In addition, the following special characters are written with the escape character "\":
\s: matches a single whitespace character.
\S: matches any single character that is not whitespace.
\d: matches a digit from 0 to 9, equivalent to "/[0-9]/".
\w: matches a letter, digit, or underscore, equivalent to "/[a-zA-Z0-9_]/".
\W: matches any character not matched by \w, equivalent to "/[^a-zA-Z0-9_]/".
\D: matches any character that is not a decimal digit.
.: matches any character except a line break; with the modifier "s", "." matches any character at all.
With the special characters above you can express some tedious patterns concisely. For example, the regular expression "/\d0000/" matches a digit followed by four zeros, such as "10000" or "90000".
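A few of these escape sequences can be tried on a sample line. The input string is hypothetical, and preg_match_all(), a standard PCRE function not covered in the article, is used here to collect every match at once.

```php
<?php
$line = "Order 66\tshipped_to: Bob!";   // hypothetical input

preg_match_all('/\d/', $line, $digits);      // every single digit
preg_match_all('/\w+/', $line, $words);      // runs of letters, digits, underscore
$noSpace = preg_replace('/\s/', '', $line);  // drop spaces and tabs
$nonWord = preg_replace('/\w/', '', 'a_1-b'); // keep only what \W would match

// "." stops at a line break unless the "s" modifier is given.
$dot  = preg_match('/a.b/', "a\nb");
$dotS = preg_match('/a.b/s', "a\nb");

print_r($words[0]);
```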
Positioning characters:
Positioning characters are another kind of special character in regular expressions. They describe where in the matched object a match must occur.
^: the pattern must appear at the beginning of the matched object (this differs from its meaning inside "[]")
$: the pattern must appear at the end of the matched object
\b: the pattern must appear at a word boundary, that is, at the beginning or end of a word
"/^he/": matches strings starting with "he", such as "hello" and "height";
"/he$/": matches strings ending with "he", such as "she";
"/\bhe/": boundary at the start; like ^, but it matches "he" at the beginning of any word in the string;
"/he\b/": boundary at the end; like $, but it matches "he" at the end of any word in the string;
"/^he$/": matches only the string "he".
Besides matching, regular expressions can use parentheses "()" to record the information you need, store it, and make it available to later expressions. For example:
/^([a-zA-Z0-9_-]+)@([a-zA-Z0-9_-]+)(\.[a-zA-Z0-9_-]+)$/
This records the user name of an email address and the server address of the email address. To read a recorded string later, use "escape character + record number". For example, "\1" stands for the text matched by the first "([a-zA-Z0-9_-]+)", "\2" for the second "([a-zA-Z0-9_-]+)", and "\3" for the third group, "(\.[a-zA-Z0-9_-]+)". However, in PHP strings "\" is itself a special character and needs to be escaped, so in PHP code "\1" should be written as "\\1".
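Recording with parentheses can be sketched as follows. The address user@example.com is a placeholder, and the third group is written here as "(\.[a-zA-Z0-9_-]+)", with an escaped period and a "+" quantifier, which is an assumption made so that a suffix such as ".com" can match.

```php
<?php
$pattern = '/^([a-zA-Z0-9_-]+)@([a-zA-Z0-9_-]+)(\.[a-zA-Z0-9_-]+)$/';

// $m[0] is the whole match; $m[1], $m[2], $m[3] are the recorded groups.
preg_match($pattern, 'user@example.com', $m);

// In the replacement string, "\\1" (that is, \1) reads back group 1.
$swapped = preg_replace($pattern, '\\2\\3 <- \\1', 'user@example.com');

// A recorded group can also be reused inside the pattern itself:
// (\w)\1 finds a character that is immediately repeated.
$doubled = preg_match('/(\w)\1/', 'hello', $d);

print_r($m);
echo $swapped, "\n";
```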
Other special symbols:
"|": the or symbol "|" works like "or" in PHP, but it is a single "|" rather than two "||". Each alternative can be a single character or a string; for example, "/abcd|dcba/" matches "abcd" or "dcba".
5. Greedy mode:
As mentioned under metacharacters, "?" has another important role: controlling "greedy mode". What is "greedy mode"?
Suppose we want to match a string that starts with the letter "a" and ends with the letter "b", but the string being matched contains many "b" after the "a", for example "a bbbbbbbbbbbbbbbbb". Will the regular expression stop at the first "b" or at the last "b"? In greedy mode it matches up to the last "b"; otherwise it matches only up to the first "b".
Greedy mode is the default, as in:
/a.+b/
Greedy mode is turned off as follows:
/a.+?b/
/a.+b/U
The second form uses the modifier U; for details, see the following section.
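The difference is easiest to see by printing what each form actually matches. This sketch reuses the example string from the text; the variable names are illustrative.

```php
<?php
$subject = 'a bbbbbbbbbbbbbbbbb';

preg_match('/a.+b/', $subject, $greedy);     // default: runs to the last "b"
preg_match('/a.+?b/', $subject, $lazy);      // "?" after "+": stops at the first "b"
preg_match('/a.+b/U', $subject, $ungreedy);  // U modifier: same effect as "+?"

var_dump($greedy[0], $lazy[0], $ungreedy[0]);
```

The lazy forms still consume the space, because ".+" needs at least one character before the closing "b".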
6. Modifiers:
Modifiers change many characteristics of a regular expression so that it better fits your needs (note: modifiers are case sensitive, so "e" is not the same as "E"). The modifiers are as follows:
i: the expression becomes case insensitive, that is, "a" and "A" are the same.
m: by default "^" and "$" match the start and end of the whole string; with "m", they match the start and end of each line of the string.
s: by default "." matches any character except a line break; with "s", it matches any character, including a line break.
x: whitespace in the expression is ignored unless it is escaped.
e: only meaningful for replacement; the replacement string is evaluated as PHP code. This modifier was deprecated in PHP 5.5 and removed in PHP 7.0; use preg_replace_callback() instead.
A: the expression must match at the start of the string. For example, "/a/A" matches "abcd".
D: the opposite of "m". With this modifier, "$" matches only at the very end of the string, not before a trailing line break. It is not enabled by default.
U: similar to the question mark; it inverts "greedy mode".
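Most of these modifiers can be demonstrated in one or two lines each. The sketch below uses a hypothetical two-line string; the "e" modifier is left out because it no longer exists in PHP 7 and later, and the "D" lines assume the PCRE meaning of that modifier (dollar matches at the very end only).

```php
<?php
$text = "first line\nSecond line";   // hypothetical two-line subject

$i = preg_match('/second/i', $text);               // case-insensitive
$m = preg_match_all('/^\w+/m', $text, $starts);    // "^" matches at each line start
$s = preg_match('/first.+Second/s', $text);        // "." crosses the line break
$x = preg_match('/ first \s line /x', $text);      // pattern whitespace is ignored
$A = preg_match('/line/A', $text);                 // must match at the very start

// Without D, "$" also matches just before a trailing line break.
$withoutD = preg_match('/line$/', "line\n");
$withD    = preg_match('/line$/D', "line\n");

var_dump($i, $m, $s, $x, $A, $withoutD, $withD);
```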
7. PCRE-related regular expression functions:
PHP's Perl-compatible regular expression extension provides several functions for pattern matching, replacement, splitting, and filtering:
1. preg_match:
Function format: int preg_match(string pattern, string subject, array [matches]);
This function searches the string subject for the expression pattern. If [matches] is given, the matched text is recorded in it: [matches][0] holds the text matched by the whole pattern, [matches][1] the first string recorded with parentheses "()", [matches][2] the second, and so on. The function returns 1 if the pattern is found in the string and 0 otherwise (false on error).
2. preg_replace:
Function format: mixed preg_replace(mixed pattern, mixed replacement, mixed subject);
This function replaces every part of subject that matches the expression pattern with replacement. If replacement needs to include text matched by pattern, record it with "()" and refer to it in replacement as "\1" (written "\\1" in a PHP string).
3. preg_split:
Function format: array preg_split(string pattern, string subject, int [limit]);
This function works like split(). The difference is that split() accepts only simple regular expressions, while preg_split() uses fully Perl-compatible regular expressions. The third parameter, limit, is the maximum number of pieces returned.
4. preg_grep:
Function format: array preg_grep(string pattern, array input);
This function works much like preg_match, except that preg_grep tests every element of the array input and returns a new array containing the elements that match.
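preg_split() and preg_grep() are the two functions least obvious from their descriptions, so here is a brief sketch of each. The input values are hypothetical.

```php
<?php
// Split on any run of whitespace and/or commas.
$parts = preg_split('/[\s,]+/', 'red, green  blue,black');

// With limit = 2, at most two pieces are returned; the rest stays unsplit.
$limited = preg_split('/[\s,]+/', 'red, green  blue,black', 2);

// Keep only the array elements that consist entirely of digits.
// preg_grep() preserves the original array keys.
$numbers = preg_grep('/^\d+$/', ['10', 'abc', '42', '4x']);

print_r($parts);
print_r($limited);
print_r($numbers);
```

Because the keys are preserved, wrap the result in array_values() if a re-indexed list is needed.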
The following is an example that checks whether an email address is well formed:
The code is as follows:
function emailIsRight($email) {
    // "/" delimiters are required by preg_match; the pattern needs them to run.
    if (preg_match("/^[_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}$/", $email)) {
        return 1;
    }
    return 0;
}
// The address in the first call was cut off in the source; example.com is a placeholder.
if (emailIsRight('y10k@example.com')) echo "correct\n";
if (!emailIsRight('y10k@fffff')) echo "incorrect\n";
The above program will output:
correct
incorrect
8. Differences between PHP's Perl-compatible regular expressions and Perl/ereg regular expressions:
Although they are called "Perl-compatible regular expressions", PHP's differ somewhat from Perl's. For example, the modifier "g" means match all occurrences in Perl, but it is not supported as a modifier in PHP.
There are also differences from the ereg family of functions. ereg is also a set of regular expression functions provided by PHP, but it is much weaker than preg.
1. ereg neither needs nor can use delimiters and modifiers, so it is much less capable than preg.
2. About ".": in regular expressions "." generally matches any character except a line break, but in ereg "." matches any character, including line breaks. If you want "." to include line breaks in preg, add the modifier "s".
3. ereg is greedy by default and this cannot be changed, which causes a lot of trouble for replacement and matching.
4. Speed: this may concern many people. Does preg pay for its power with speed? Don't worry, preg is much faster than ereg. I ran a test program:
Time test:
PHP code:
The code is as follows:
// time() has one-second resolution, so each result is a whole number of seconds.
echo "Preg_replace used time:";
$start = time();
for ($i = 1; $i <= 100000; $i++) {
    $str = "ssssssssssssssssssssssssssssss";
    preg_replace("/s/", "", $str);
}
$ended = time() - $start;
echo $ended;
// ereg_replace() was removed in PHP 7.0; this part runs only on older versions.
echo "\nEreg_replace used time:";
$start = time();
for ($i = 1; $i <= 100000; $i++) {
    $str = "ssssssssssssssssssssssssssssss";
    ereg_replace("s", "", $str);
}
$ended = time() - $start;
echo $ended;
echo "\nStr_replace used time:";
$start = time();
for ($i = 1; $i <= 100000; $i++) {
    $str = "sssssssssssssssssssssssssssssssss";
    str_replace("s", "", $str);
}
$ended = time() - $start;
echo $ended;
Results:
Preg_replace used time: 5
Ereg_replace used time: 15
Str_replace used time: 2
str_replace is the fastest of the three because it does not perform any pattern matching, and preg_replace is in turn much faster than ereg_replace.
9. PHP 3.0 support for preg:
preg support is included by default in PHP 4.0 but not in 3.0. To use the preg functions in 3.0, you must load the php3_pcre.dll file: add "extension=php3_pcre.dll" to the extensions section of php.ini and restart PHP.
In fact, regular expressions are also commonly used to implement UBB code. Many PHP forums use this approach, but the code is too long to show here.
