PHP Learning Notes (5): Regular Expressions

Last Update:2016-08-08 Source: Internet

Author: User

Tags php tutorial


What are regular expressions

A regular expression is a formula for operating on strings: a combination of specific characters forms a rule string, called a match pattern, which is then applied to the text.

$p = '/apple/';
$str = "apple banna";
if (preg_match($p, $str)) {
    echo 'matched';
}

Here the string '/apple/' is a regular expression that checks whether the string apple occurs in the source string. Matching is case-sensitive by default.
PHP uses the PCRE library functions for regular expression matching. The preg_match function used in the example above performs one match and is commonly used to test whether a given character pattern is present.
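The return value of preg_match can be inspected more closely. The following is a minimal sketch, not part of the original notes; describeMatch is a hypothetical helper name introduced only for illustration.

```php
<?php
// Hypothetical helper: turns preg_match()'s return value into a label.
// preg_match() returns 1 on a match, 0 on no match, and false on error.
function describeMatch(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);
    if ($result === false) {
        return 'error';
    }
    return $result === 1 ? 'matched' : 'no match';
}

echo describeMatch('/apple/', 'apple banana'), "\n"; // matched
echo describeMatch('/apple/', 'Apple banana'), "\n"; // no match (case-sensitive)
```

Comparing with === keeps a failed call (false) distinct from a pattern that simply did not match (0).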

Basic syntax for regular expressions

In the PCRE functions, a match pattern consists of a pair of delimiters enclosing the expression itself, which is built from ordinary characters and metacharacters. A delimiter can be any character that is not alphanumeric, not a backslash, and not whitespace. Frequently used delimiters are forward slashes (/), hash signs (#), and tildes (~), for example:

/foo bar/
#^[^0-9]$#
~php~

If the pattern contains delimiters, the delimiter needs to be escaped with a backslash (\).

/http:\/\//

If the pattern contains many delimiter characters, it is better to choose a different character as the delimiter, or to escape them with preg_quote.

$p = 'http://';
$p = '/' . preg_quote($p, '/') . '/';
echo $p;
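The three ways of handling a delimiter that appears inside the pattern can be compared side by side. This is a sketch under the assumption that the target is the URL used elsewhere in these notes; the variable names are illustrative.

```php
<?php
$url = 'http://www.imooc.com/';

// Three ways to write a pattern that contains slashes.
$escaped    = '/http:\/\//';                           // escape each slash by hand
$otherDelim = '#http://#';                             // choose a delimiter not used in the pattern
$quoted     = '/' . preg_quote('http://', '/') . '/';  // let preg_quote() do the escaping

var_dump(preg_match($escaped, $url));    // int(1)
var_dump(preg_match($otherDelim, $url)); // int(1)
var_dump(preg_match($quoted, $url));     // int(1)
echo $quoted, "\n";                      // /http\:\/\//
```

preg_quote also escapes the colon, because it escapes every character with a special meaning in regular expression syntax, not only the delimiter passed as the second argument.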

Pattern modifiers can be placed after the closing delimiter. They include i, m, s, x, and others. For example, the i modifier makes the match case-insensitive:

$str = "Http://www.imooc.com/";
if (preg_match('/http/i', $str)) {
    echo 'matches successfully';
}
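As a small sketch of how modifiers change the result, the same anchored pattern can be run with and without the i and m modifiers. The second URL in the sample text is an assumption added for illustration.

```php
<?php
// Two lines of text; the second URL is a made-up example.
$text = "Http://www.imooc.com/\nhttp://example.com/";

// Without modifiers: ^ anchors to the start of the whole subject, and "Http" != "http".
$caseSensitive = preg_match('/^http/', $text);

// i: ignore case, so the first line now matches.
$caseInsensitive = preg_match('/^http/i', $text);

// m: ^ also matches at the start of each line, so both lines are found.
$lineCount = preg_match_all('/^http/im', $text, $m);

var_dump($caseSensitive, $caseInsensitive, $lineCount); // int(0) int(1) int(2)
```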

Metacharacters and escape

Characters with a special meaning in regular expressions are called metacharacters. The commonly used ones are:

\    general escape character
^    asserts the start of the subject (or the start of a line in multiline mode)
$    asserts the end of the subject (or the end of a line in multiline mode)
.    matches any character except newline (by default)
[    starts a character class definition
]    ends a character class definition
|    starts an alternative branch
(    starts a subgroup
)    ends a subgroup
?    as a quantifier, 0 or 1 matches; placed after another quantifier, it switches that quantifier from greedy to lazy
*    quantifier, 0 or more matches
+    quantifier, 1 or more matches
{    starts a custom quantifier
}    ends a custom quantifier

// \s matches any whitespace character, including spaces, tabs and newlines.
// [^\s] matches a non-whitespace character.
// [^\s]+ matches one or more non-whitespace characters.
$p = '/^I [^\s]+ (apples|bananas)$/';
$str = "I like apples";
if (preg_match($p, $str)) {
    echo 'matches successfully';
}

Metacharacters are used in two contexts: some are recognized anywhere outside square brackets, others only inside square brackets (a character class). Inside square brackets the metacharacters are:

\    escape character
^    negates the character class, but only when it is the first character inside the brackets
-    marks a character range

Outside square brackets ^ asserts the start of the subject, while inside square brackets it negates the character class. The minus sign inside square brackets marks a range of characters, such as 0-9 for all digits from 0 to 9.

// \w below matches a letter, a digit or an underscore.
$p = '/[\w\.\-]+@[a-z0-9\-]+\.(com|cn)/';
$str = "My mailbox is marchalex@163.com";
preg_match($p, $str, $match);
echo $match[0];
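The two roles of ^ and the range meaning of - can be checked with a short sketch. The sample strings are made up for illustration.

```php
<?php
// ^ outside brackets: anchor to the start of the subject.
preg_match('/^\d+/', '2016 notes', $atStart);

// ^ as the first character inside brackets: negate the class (here: anything but a digit).
preg_match('/[^\d]+/', '2016 notes', $nonDigits);

// - inside brackets: a range, here the letters a through c.
preg_match('/[a-c]+/', 'xxabcabd', $range);

echo $atStart[0], "\n";   // 2016
echo $nonDigits[0], "\n"; // " notes" (with the leading space)
echo $range[0], "\n";     // abcab
```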

Greedy mode and lazy mode

By default, each metacharacter such as \d or . matches a single character. With a quantifier such as + or * it becomes greedy and matches as many characters as possible. Adding a question mark after the quantifier (for example +? or *?) makes it lazy, so it matches as few characters as possible.
Greedy mode: when the quantifier could either take another character or stop, it takes the character first.

// \d below matches a digit
$p = '/\d+\-\d+/';
$str = "My phone is 010-12345678";
preg_match($p, $str, $match);
echo $match[0]; // The result is: 010-12345678

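Lazy mode: when the quantifier could either take another character or stop, it stops first.

$p = '/\d+?\-\d+?/';
$str = "My phone is 010-12345678";
preg_match($p, $str, $match);
echo $match[0]; // The result is: 010-1

Note that a ? placed directly after \d is the 0-or-1 quantifier, not the lazy marker. With the pattern '/\d?\-\d?/' the same string yields 0-1, because each \d? can match at most one digit.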
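The difference between greedy and lazy quantifiers is easiest to see when the same input is matched both ways. The following sketch uses a made-up HTML fragment as input.

```php
<?php
$html = '<b>bold</b>';

// Greedy: .+ takes as much as it can, so the match runs to the last ">".
preg_match('/<.+>/', $html, $greedy);

// Lazy: .+? takes as little as it can, so the match stops at the first ">".
preg_match('/<.+?>/', $html, $lazy);

echo $greedy[0], "\n"; // <b>bold</b>
echo $lazy[0], "\n";   // <b>
```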

When the exact length of the match is known, {} can be used to specify the number of characters to match:

$p = '/\d{3}\-\d{8}/';
$str = "My phone is 010-12345678";
preg_match($p, $str, $match);
echo $match[0]; // The result is: 010-12345678
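Besides an exact count, braces also accept a range. The sketch below shows the {n}, {n,m} and {n,} forms; the phone number format with a 3 or 4 digit area code is an assumption made for the example.

```php
<?php
// {3}: exactly three digits.
$exact   = preg_match('/^\d{3}$/', '010');   // 1
$tooMany = preg_match('/^\d{3}$/', '0101');  // 0

// {3,4} and {7,8}: a range (assumed format: 3-4 digit area code, 7-8 digit number).
$ranged = preg_match('/^\d{3,4}-\d{7,8}$/', '0755-1234567'); // 1

// {2,}: two or more digits.
$atLeast = preg_match('/^\d{2,}$/', '7');    // 0

var_dump($exact, $tooMany, $ranged, $atLeast);
```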

Example: use greedy mode to match the name in a string. (Hint: \w matches a letter, a digit or an underscore; \s matches any whitespace character, including spaces, tabs and newlines.)

$p = '/name:([\w\s]+)/';
$str = "name:steven jobs";
preg_match($p, $str, $match);
echo $match[1]; // The result is: steven jobs

Using regular expressions to match

Regular expressions offer a more flexible way of working with text than the plain string functions. Like those functions, they are mainly used to test whether a substring exists, to replace parts of a string, to split a string, and to extract substrings that fit a pattern.
PHP uses the PCRE library functions for this: you define a pattern and then call the relevant function to obtain the result.
preg_match performs a single match. It can be used simply to test whether the pattern matches, or to retrieve the matched text. Its return value is the number of matches, which is 0 or 1, because the search stops after the first match.

$subject = "abcdef";
$pattern = '/def/';
preg_match($pattern, $subject, $matches);
print_r($matches); // The result is: Array ( [0] => def )

The code above only tests whether the literal def is present. The real strength of regular expressions is pattern matching, so more often a pattern with a subgroup is used:

$subject = "abcdef";
$pattern = '/a(.*?)d/';
preg_match($pattern, $subject, $matches);
print_r($matches); // The result is: Array ( [0] => abcd [1] => bc )
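With several subgroups, each one lands in its own slot of $matches, in the order of its opening parenthesis. A minimal sketch, assuming a date string in the YYYY-MM-DD form as input:

```php
<?php
$subject = 'Last Update:2016-08-08';

// Three subgroups: year, month, day. $m[0] holds the whole match.
if (preg_match('/(\d{4})-(\d{2})-(\d{2})/', $subject, $m)) {
    list($whole, $year, $month, $day) = $m;
    echo "$year / $month / $day\n"; // 2016 / 08 / 08
}
```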

Matching against a pattern in this way yields more useful data than a plain substring test.
Example: write code that uses preg_match to find an email address in a string and print it.

$subject = "My email is spark@imooc.com";
$pattern = '/[\w\-]+@\w+\.\w+/';
preg_match($pattern, $subject, $matches);
echo $matches[0];

Find all matching results

preg_match returns only the first match, but often all matches are needed. preg_match_all searches the whole string and collects every match in the $matches array.

$p = "|<[^>]+>(.*?)</[^>]+>|i";
$str = "<b>example: </b><div align=left>this is a test</div>";
preg_match_all($p, $str, $matches);
print_r($matches);

You can use preg_match_all to extract the data from a table:

$p = "/<tr><td>(.*?)<\/td>\s*<td>(.*?)<\/td>\s*<\/tr>/i";
$str = "<table>
    <tr><td>Alex</td><td>25</td></tr>
    <tr><td>John</td><td>26</td></tr>
</table>";
preg_match_all($p, $str, $matches);
print_r($matches);

The results in $matches are ordered so that $matches[0] holds all matches of the full pattern, $matches[1] holds all matches of the first subgroup, and so on.
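The default layout groups results by subgroup. preg_match_all also accepts the PREG_SET_ORDER flag, which groups them by match instead. The following sketch compares the two on a made-up table fragment.

```php
<?php
$html = '<tr><td>Alex</td><td>25</td></tr><tr><td>John</td><td>26</td></tr>';
$p = '/<td>(.*?)<\/td><td>(.*?)<\/td>/i';

// Default (PREG_PATTERN_ORDER): one array per subgroup.
preg_match_all($p, $html, $byPattern);
print_r($byPattern[1]); // Array ( [0] => Alex [1] => John )
print_r($byPattern[2]); // Array ( [0] => 25 [1] => 26 )

// PREG_SET_ORDER: one array per match, convenient for looping over rows.
preg_match_all($p, $html, $bySet, PREG_SET_ORDER);
foreach ($bySet as $row) {
    echo $row[1], ' is ', $row[2], "\n";
}
```

Which layout is more convenient depends on whether the code works column by column or row by row.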
Example: use preg_match_all to match the data in all li tags.

$str = "<ul>
    <li>Item 1</li>
    <li>Item 2</li>
</ul>";
// Match the data in every li tag
$p = "/<li>(.*?)<\/li>/i";
// Explanation of this pattern:
// The trailing i makes the match case-insensitive.
// In <li>(.*?)<\/li>, the parentheses capture the content of the li tag;
// . stands for any single character, and *? for zero or more of them, as few as possible.
preg_match_all($p, $str, $matches);
print_r($matches[1]);

Search and replace of regular expressions

Search and replace with regular expressions is useful in many situations, such as adjusting the format of the target string or changing the order of the matched parts within it.
For example, we can rearrange the date format of a string:

$string = 'April 15, 2014';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '$3, ${1} $2';
echo preg_replace($pattern, $replacement, $string); // The result is: 2014, April 15

Here ${1} is equivalent to $1 and refers to the text matched by the first subgroup; $2 refers to the second subgroup, and so on.
With more complex patterns, we can replace parts of the target string more precisely.

$patterns = array('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/',
                  '/^\s*{(\w+)}\s*=/');
$replace = array('\3/\4/\1\2', '$\1 =');
// \3 is equivalent to $3, \4 is equivalent to $4, and so on
echo preg_replace($patterns, $replace, '{startDate} = 1999-5-27'); // The result is: $startDate = 5/27/1999
// Explanation: (19|20) matches either 19 or 20, (\d{2}) matches two digits,
// and each (\d{1,2}) matches one or two digits.
// ^\s*{(\w+)}\s*= matches optional leading whitespace, a word enclosed in {},
// optional whitespace, and then the = sign.
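The braces in ${1} matter when the reference is followed directly by a digit. A minimal sketch with a made-up input string:

```php
<?php
$pattern = '/([a-z]+)(\d)/';

// ${1}0 means "group 1, then a literal 0".
$padded = preg_replace($pattern, '${1}0$2', 'item7');

// $10 is read as "group 10", which does not exist and is replaced by an empty string.
$mistaken = preg_replace($pattern, '$10$2', 'item7');

echo $padded, "\n";   // item07
echo $mistaken, "\n"; // 7
```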

Regular expression replacement can also be used to collapse extra whitespace:

$str = 'one     two';
$str = preg_replace('/\s+/', ' ', $str);
echo $str; // The result is: 'one two'

Example: wrap each file name in the target string $str in an em tag, so that index.php becomes <em>index.php</em>.

$str = 'Mainly have the following files: index.php, style.css, common.js';
// Wrap each file name in the target string $str in an em tag
$p = '/\w+\.\w+/i';
$str = preg_replace($p, '<em>$0</em>', $str);
echo $str;

Regular matching common cases

Regular expressions are commonly used in form validation. Some fields have format requirements: a user name usually has to consist of letters, digits or underscores, and email addresses, phone numbers and so on each have their own rules. Regular expressions are well suited to validating such fields.
Let's take the fields of a typical user registration page and see how to verify them.

$user = array(
    'name'   => 'marchalex',
    'email'  => 'marchalex@163.com',
    'mobile' => '13312345678'
);
// Perform general validation
if (empty($user)) {
    die('user information cannot be empty');
}
if (strlen($user['name']) < 6) {
    die('user name must be at least 6 characters long');
}
// The user name must consist of letters, digits and underscores
if (!preg_match('/^\w+$/i', $user['name'])) {
    die('user name is not valid');
}
// Verify that the email format is correct
if (!preg_match('/^[\w\.]+@\w+\.\w+$/i', $user['email'])) {
    die('email is not valid');
}
// The mobile number must be 11 digits and start with 1
if (!preg_match('/^1\d{10}$/i', $user['mobile'])) {
    die('mobile number is not valid');
}
echo 'User information verified successfully';
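The same checks can be collected into a function that reports every failing field instead of stopping at the first one. This is only a sketch under the rules stated above; validateUser is a hypothetical name, not something defined in the original notes.

```php
<?php
// Hypothetical helper: returns the names of the fields that fail validation
// (an empty array means every field passed).
function validateUser(array $user): array
{
    $errors = [];

    // Name: at least 6 characters, letters, digits and underscores only.
    $name = $user['name'] ?? '';
    if (strlen($name) < 6 || !preg_match('/^\w+$/', $name)) {
        $errors[] = 'name';
    }

    // Email: the same simple pattern used in the notes above.
    if (!preg_match('/^[\w.]+@\w+\.\w+$/', $user['email'] ?? '')) {
        $errors[] = 'email';
    }

    // Mobile: 11 digits, starting with 1.
    if (!preg_match('/^1\d{10}$/', $user['mobile'] ?? '')) {
        $errors[] = 'mobile';
    }

    return $errors;
}

print_r(validateUser([
    'name'   => 'marchalex',
    'email'  => 'marchalex@163.com',
    'mobile' => '13312345678',
])); // Array ( )
```

Returning the list of failing fields lets a form show all problems at once, whereas die() ends the script on the first error.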




This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
