Chapter 4 Data processing - PHP regular expressions - Zheng Aqi (continued)

Source: Internet
Author: User
Tags ereg php regular expression

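1. Basic knowledge of regular expressions
Meaning: a string pattern made up of ordinary characters (such as a-z) and special characters.
Functions:
● Validate input.
● Replace text.
● Extract a substring from a string.
Classification: POSIX and Perl-compatible.
POSIX regular expressions are easier to master, but the POSIX functions are not binary-safe. Perl-compatible regular expressions are relatively complex. Note that the POSIX functions (ereg(), eregi(), ereg_replace(), split(), and related functions) were deprecated in PHP 5.3.0 and removed in PHP 7.0.0, so the POSIX examples in this chapter run only on older PHP versions.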
2. POSIX regular expressions
1. Writing regular expressions
Table 4.3 POSIX regular expression syntax

Character     Description
\             Escape character, used to escape special characters. For example, '.' matches any single character, while '\.' matches a literal dot. '\-' matches the hyphen '-', and '\\' matches the backslash '\'.
^             Matches the start of the input string. For example, '^he' matches a string starting with "he".
$             Matches the end of the input string. For example, 'OK$' matches a string ending with "OK".
*             Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". '*' is equivalent to '{0,}'.
+             Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". '+' is equivalent to '{1,}'.
?             Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" and "does". '?' is equivalent to '{0,1}'.
{n}           n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two 'o's in "food".
{n,}          n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the 'o's in "foooood". 'o{1,}' is equivalent to 'o+', and 'o{0,}' is equivalent to 'o*'.
{n,m}         m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, 'o{1,3}' matches the first three 'o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
?             When this character follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. Non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much as possible. For example, for the string "oooo", 'o+?' matches a single "o", and 'o+' matches all the 'o's.
.             Matches any single character except "\n". To match any character including "\n", use a pattern such as '(.|\n)'.
(pattern)     Matches pattern and captures the match. The captured match is stored in the corresponding array. To match literal parentheses, use '\(' or '\)'.
(?:pattern)   Matches pattern but does not capture the match, so the result is not stored. This is useful when the "or" character '|' is used to combine parts of a pattern. For example, 'industr(?:y|ies)' is a more concise expression than 'industry|industries'.
(?=pattern)   Positive lookahead: matches the search string at any position where a string matching pattern begins. This is a non-capturing match, so the match is not stored for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000", but not "Windows" in "Windows 3.1". Lookahead does not consume characters: after a match, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)   Negative lookahead: matches the search string at any position where a string that does not match pattern begins. This is a non-capturing match, so the match is not stored for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1", but not "Windows" in "Windows 2000". Lookahead does not consume characters: after a match, the search for the next match starts immediately after the last match, not after the characters that made up the lookahead.
x|y           Matches x or y. For example, 'z|food' matches "z" or "food", and '(z|f)ood' matches "zood" or "food".
[xyz]         Character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]        Negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z]         Character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from 'a' to 'z'.
[^a-z]        Negated character range. Matches any character that is not in the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'.
Note: non-greedy quantifiers and the '(?:pattern)', '(?=pattern)', and '(?!pattern)' groups are Perl-style extensions. The POSIX functions (ereg() and related functions) do not support them; they work with the preg_ functions described later in this chapter.
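The difference between greedy and non-greedy matching, and the effect of a lookahead, is easier to see in a short run. The following is a minimal sketch that uses preg_match() rather than the POSIX functions, since those constructs are Perl-style; the lookahead pattern is adapted slightly from the table so that the space before the version number sits inside the lookahead.

```php
<?php
// Greedy: 'o+' takes as many 'o' characters as it can.
preg_match('/o+/', 'oooo', $greedy);
// Non-greedy: 'o+?' stops after the first 'o'.
preg_match('/o+?/', 'oooo', $lazy);

// Positive lookahead: match "Windows" only when one of the listed versions follows.
$versionAhead = '/Windows(?= (?:95|98|NT|2000))/';
$hit  = preg_match($versionAhead, 'Windows 2000', $ahead);
$miss = preg_match($versionAhead, 'Windows 3.1');

echo $greedy[0], "\n";        // oooo
echo $lazy[0], "\n";          // o
echo $ahead[0], "\n";         // Windows (the lookahead text is not part of the match)
echo $hit, ' ', $miss, "\n";  // 1 0
```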

Here are some examples of simple regular expressions:
● '[A-Za-z0-9]': any single uppercase letter, lowercase letter, or digit from 0 to 9.
● '^hello': a string starting with "hello".
● 'world$': a string ending with "world".
● '.at': any single character except "\n" followed by "at", such as "cat" and "nat".
● '^[a-zA-Z]': a string starting with a letter.
● 'hi{2}': the letter h followed by two i's, that is, "hii".
● '(go)+': a string containing at least one "go", such as "gogogo".
An ID card number generally consists of 18 digits, or of 17 digits followed by the letter X or Y. To match an ID card number, you can write:
^[0-9]{17}([0-9]|X|Y)$
The regular expression for an email address can be written as follows:
^[a-zA-Z0-9\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$
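As a quick check of the two patterns above, here is a minimal sketch that runs them through preg_match(). The '/' delimiters are required by the preg_ functions, and the sample values are hypothetical, made up only for illustration.

```php
<?php
// Patterns from the text, wrapped in '/' delimiters as preg_match() requires.
$idPattern    = '/^[0-9]{17}([0-9]|X|Y)$/';
$emailPattern = '/^[a-zA-Z0-9\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/';

// Hypothetical sample values, not taken from the original text.
$samples = [
    ['12345678901234567X', $idPattern],    // 17 digits followed by X
    ['1234567890',         $idPattern],    // too short
    ['user@example.com',   $emailPattern],
    ['user@example',       $emailPattern], // no dot after the @ part
];

$results = [];
foreach ($samples as [$value, $pattern]) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[] = preg_match($pattern, $value);
    echo $value, ' => ', end($results) ? 'valid' : 'invalid', "\n";
}
```

These patterns only check the overall shape of the input; they do not prove that an ID number or an email address actually exists.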
2. String matching
The ereg() and eregi() functions
The ereg() function searches a string for a match to a regular expression. It returns the length of the matched string, or FALSE if there is no match, and, when the optional $regs parameter is given, stores the matched substrings in that array. The eregi() function works the same way but ignores case. The syntax format is as follows:
int ereg(string $pattern, string $string [, array &$regs])
The code is as follows:
<?php
/* This example checks whether the string is a date in ISO format (YYYY-MM-DD).
   ereg() requires a PHP version earlier than 7.0. */
$date = "1988-08-09";
$len = ereg('([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})', $date, $regs); // the date format is YYYY-MM-DD
if ($len)
{
    echo "$regs[3].$regs[2].$regs[1]" . "<br>"; // outputs "09.08.1988"
    echo $regs[0] . "<br>"; // outputs "1988-08-09"
    echo $len; // outputs 10
}
else
{
    echo "incorrect date format: $date";
}
?>
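
Because ereg() is not available in PHP 7.0 and later, the same date check can be sketched with preg_match(). This is an illustrative equivalent, not part of the original example; the '^' and '$' anchors are an added assumption so that the whole string must be a date.

```php
<?php
$date = "1988-08-09";
// Same pattern as the ereg() example, anchored and wrapped in '/' delimiters.
$found = preg_match('/^([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})$/', $date, $regs);
if ($found) {
    $reordered = "$regs[3].$regs[2].$regs[1]";
    echo $reordered, "\n";        // 09.08.1988
    echo $regs[0], "\n";          // 1988-08-09
    echo strlen($regs[0]), "\n";  // 10, the length that ereg() would have returned
} else {
    echo "incorrect date format: $date\n";
}
```

Unlike ereg(), preg_match() returns 1 rather than the match length, so the length is taken from $regs[0] here.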

3. String replacement
The syntax of the ereg_replace() function is as follows:
string ereg_replace(string $pattern, string $replacement, string $string)
Note: The function replaces the parts of $string that match $pattern with $replacement and returns the resulting string. If no match is found, $string is returned unchanged.
The code is as follows:
<?php
// ereg_replace() requires a PHP version earlier than 7.0.
$str = "hello world";
echo ereg_replace('[aeo]', 'x', $str) . "<br>"; // outputs 'hxllx wxrld'
$res = '<a href="hello.php">hello</a>';
echo ereg_replace('hello', $res, $str); // replaces 'hello' with a hyperlink
?>

4. Splitting a string into an array

The split() function does the same job as explode(), except that the delimiter is a regular expression. It splits the string and returns an array. The syntax format is as follows:

array split(string $pattern, string $string [, int $limit])

5. Generate a regular expression

3. Perl-compatible regular expressions

1. Writing regular expressions

Table 4.4 Extended syntax of Perl-compatible regular expressions

Character     Description
\b            Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never", but not the 'er' in "verb".
\B            Matches a non-word boundary. 'er\B' matches the 'er' in "verb", but not the 'er' in "never".
\cx           Matches the control character specified by x. For example, '\cM' matches a Control-M, or carriage return, character. The value of x must be in the range A-Z or a-z. Otherwise, 'c' is treated as a literal 'c' character.
\d            Matches a digit character. Equivalent to '[0-9]'.
\D            Matches a non-digit character. Equivalent to '[^0-9]'.
\f            Matches a form feed. Equivalent to '\x0c' and '\cL'.
\n            Matches a line feed. Equivalent to '\x0a' and '\cJ'.
\r            Matches a carriage return. Equivalent to '\x0d' and '\cM'.
\s            Matches any whitespace character, including spaces, tabs, and form feeds. Equivalent to '[ \f\n\r\t\v]'.
\S            Matches any non-whitespace character. Equivalent to '[^ \f\n\r\t\v]'.
\t            Matches a tab. Equivalent to '\x09' and '\cI'.
\v            Matches a vertical tab. Equivalent to '\x0b' and '\cK'.
\w            Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W            Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn           Matches n, where n is a hexadecimal escape value. The hexadecimal escape value must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num          Matches num, where num is a positive integer. This is a reference back to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n            Where n is a digit: identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm           Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition is met and n and m are both octal digits (0-7), \nm matches the octal escape value nm.
\nml          If n is an octal digit (0-3) and m and l are both octal digits (0-7), matches the octal escape value nml.
\un           Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, '\u00A9' matches the copyright symbol (©). Note that PHP's PCRE functions do not accept the '\u' form; write '\x{00A9}' with the u modifier instead.
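
A few of the escapes in Table 4.4 can be tried directly with preg_match(). The following is a small sketch; the sample strings other than "never", "verb", and "A" are made up for illustration.

```php
<?php
// \b: 'er' at a word boundary matches in "never" but not in "verb".
$boundaryNever = preg_match('/er\b/', 'never'); // 1
$boundaryVerb  = preg_match('/er\b/', 'verb');  // 0

// \B: 'er' not at a word boundary matches in "verb" but not in "never".
$innerVerb  = preg_match('/er\B/', 'verb');     // 1
$innerNever = preg_match('/er\B/', 'never');    // 0

// \d: one or more digit characters.
preg_match('/\d+/', 'order 66 shipped', $digits); // $digits[0] is "66"

// \x41 is the hexadecimal escape for "A".
$hexA = preg_match('/\x41/', 'A');              // 1

// \1 refers back to the first captured group: two identical characters in a row.
preg_match('/(.)\1/', 'hello', $repeat);        // $repeat[0] is "ll"

echo $boundaryNever, $boundaryVerb, $innerVerb, $innerNever, "\n";
echo $digits[0], ' ', $repeat[0], "\n";
```

The patterns are written in single-quoted PHP strings so that the backslashes reach the regular expression engine unchanged.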

2. String matching
The preg_match() function searches a string for a match. The syntax format is as follows:
int preg_match(string $pattern, string $subject [, array &$matches [, int $flags]])
Note: This function is similar in structure to ereg(). It searches the $subject string for content that matches the regular expression given by $pattern. Unlike the POSIX functions, the pattern must be enclosed in delimiters, for example '/pattern/'.
preg_match() returns the number of times $pattern matched. This is either 0 (no match) or 1, because preg_match() stops searching after the first match.
The related function preg_match_all() continues searching from the end of each match until the whole string has been searched.
The $flags parameter of preg_match_all() can take the following values:
● PREG_PATTERN_ORDER. This is the default. $matches[0] is an array of all full-pattern matches, $matches[1] is an array of the strings matched by the first parenthesized subpattern, and so on.
● PREG_SET_ORDER. If this flag is set, $matches[0] is an array of the first set of matches, $matches[1] is an array of the second set of matches, and so on.
● PREG_OFFSET_CAPTURE. This flag can be combined with either of the other two. If it is set, the offset of each match within the subject string is returned along with the match.
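The three flags are easiest to compare on the same input. The following is a minimal sketch with a made-up subject string and pattern.

```php
<?php
$subject = 'a1 b2 c3';
$pattern = '/([a-z])(\d)/';

// Default PREG_PATTERN_ORDER: results are grouped by subpattern.
$count = preg_match_all($pattern, $subject, $byPattern);
// $byPattern[0] is ['a1', 'b2', 'c3'], [1] is ['a', 'b', 'c'], [2] is ['1', '2', '3']

// PREG_SET_ORDER: results are grouped by match.
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);
// $bySet[0] is ['a1', 'a', '1'], $bySet[1] is ['b2', 'b', '2'], and so on

// PREG_OFFSET_CAPTURE: every entry becomes a [text, offset] pair.
preg_match_all($pattern, $subject, $withOffsets, PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);
// $withOffsets[0][1] is ['b2', 3]

echo $count, "\n"; // 3
print_r($bySet[1]);
print_r($withOffsets[0][1]);
```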
3. String replacement
The preg_replace() function does the same job as ereg_replace(): it searches a string for substrings that match a pattern and replaces them with the specified string.
The syntax format is as follows:
mixed preg_replace(mixed $pattern, mixed $replacement, mixed $subject [, int $limit])
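As a rough illustration of the parameters, the sketch below reuses the "hello world" string from the ereg_replace() example. The array form and the '$1' reference are standard preg_replace() features; the replacement words are invented for the example.

```php
<?php
$str = "hello world";

// Replace every a, e, or o with 'x'.
$masked = preg_replace('/[aeo]/', 'x', $str);   // hxllx wxrld

// With $limit = 1, only the first match is replaced.
$first = preg_replace('/[aeo]/', 'x', $str, 1); // hxllo world

// $pattern and $replacement may be arrays; pairs are applied in order.
$swapped = preg_replace(['/hello/', '/world/'], ['goodbye', 'moon'], $str); // goodbye moon

// $1 in the replacement refers to the first captured group.
$link = preg_replace('/(hello)/', '<a href="hello.php">$1</a>', $str);

echo $masked, "\n", $first, "\n", $swapped, "\n", $link, "\n";
```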
4. String splitting
The preg_split() function uses a regular expression as the boundary to split a string and returns the substrings in an array. This function is similar to split().
The syntax format is as follows:
array preg_split(string $pattern, string $subject [, int $limit [, int $flags]])
Note: Matching is case-sensitive unless the i modifier is used. The function returns an array containing the substrings of $subject separated by the boundaries that match $pattern.
$limit is an optional parameter. If it is specified, at most $limit substrings are returned. If it is omitted or set to -1, there is no limit.
$flags can be any combination of the following three flags:
● PREG_SPLIT_NO_EMPTY. If this flag is set, the function returns only non-empty strings.
● PREG_SPLIT_DELIM_CAPTURE. If this flag is set, text matched by parenthesized expressions in the delimiter pattern is also captured and returned.
● PREG_SPLIT_OFFSET_CAPTURE. If this flag is set, the offset of each returned substring within the subject string is returned along with it.
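The effect of $limit and of each flag can be sketched as follows; the input strings are invented for the example.

```php
<?php
$subject = 'one, two,,three';

// Split on a comma followed by optional whitespace.
$plain = preg_split('/,\s*/', $subject);
// ['one', 'two', '', 'three']

// PREG_SPLIT_NO_EMPTY drops the empty piece between the two commas.
$noEmpty = preg_split('/,\s*/', $subject, -1, PREG_SPLIT_NO_EMPTY);
// ['one', 'two', 'three']

// $limit = 2 stops after the first split.
$limited = preg_split('/,\s*/', $subject, 2);
// ['one', 'two,,three']

// PREG_SPLIT_DELIM_CAPTURE also returns the parenthesized part of the delimiter.
$withDelims = preg_split('/(,)\s*/', 'a, b', -1, PREG_SPLIT_DELIM_CAPTURE);
// ['a', ',', 'b']

// PREG_SPLIT_OFFSET_CAPTURE returns [text, offset] pairs.
$withOffsets = preg_split('/,\s*/', 'a, b', -1, PREG_SPLIT_OFFSET_CAPTURE);
// [['a', 0], ['b', 3]]

print_r($noEmpty);
```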
4.3 Example: validating form content
[Example 4.4] Use regular expressions to verify whether the content the user entered in a form meets the format requirements.
Create the EX4_4_Hpage.php file and enter the following code.
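
The markup of the form page is not listed in this text. A minimal sketch of what EX4_4_Hpage.php might contain is given below, assuming a plain POST form. Only the four field names (id, pwd, phone, email) and the POST method come from the processing script in this example; the function name renderRegistrationForm(), the labels, and the form action are hypothetical.

```php
<?php
// Hypothetical sketch of EX4_4_Hpage.php; the original form markup is not shown in the text.
// Only the field names and the POST method are taken from the processing script.
function renderRegistrationForm(string $action = 'EX4_4_Ppage.php'): string
{
    // Escape the action URL before placing it in an HTML attribute.
    $action = htmlspecialchars($action, ENT_QUOTES);
    return <<<HTML
<form method="post" action="{$action}">
  User ID: <input type="text" name="id"><br>
  Password: <input type="password" name="pwd"><br>
  Phone: <input type="text" name="phone"><br>
  Email: <input type="text" name="email"><br>
  <input type="submit" value="Register">
</form>
HTML;
}

// Output the form, so that a script that includes this file displays it.
echo renderRegistrationForm();
```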

Create the EX4_4_Ppage.php file and enter the following code:
The code is as follows:
<?php
include 'EX4_4_Hpage.php'; // include the form page EX4_4_Hpage.php
// Read the submitted fields; a missing field is treated as an empty string
$id = $_POST['id'] ?? '';
$pwd = $_POST['pwd'] ?? '';
$phone = $_POST['phone'] ?? '';
$email = $_POST['email'] ?? '';
$checkid = preg_match('/^\w{1,10}$/', $id); // check that the ID is 1 to 10 word characters
$checkpwd = preg_match('/^\d{4,14}$/', $pwd); // check that the password is 4 to 14 digits
$checkphone = preg_match('/^1\d{10}$/', $phone); // check that the phone number is 11 digits starting with 1
// Check the validity of the email address
$checkEmail = preg_match('/^[a-zA-Z0-9_\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/', $email);
if ($checkid && $checkpwd && $checkphone && $checkEmail) // registration succeeds only if all four checks return 1
    echo "registration successful!";
else
    echo "registration failed, incorrect format";
?>
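
The same checks can also be collected into a single function, which makes them easier to test without a form submission. This is an illustrative sketch rather than part of the original example: the function name findInvalidFields() is hypothetical, and the {1,10} and {4,14} bounds are taken from the comments in the script (up to 10 characters for the ID, 4 to 14 digits for the password).

```php
<?php
// Hypothetical helper, not part of the original example:
// returns the names of the fields that fail their format check.
function findInvalidFields(array $input): array
{
    // One pattern per field; the bounds follow the comments in the example script.
    $rules = [
        'id'    => '/^\w{1,10}$/',
        'pwd'   => '/^\d{4,14}$/',
        'phone' => '/^1\d{10}$/',
        'email' => '/^[a-zA-Z0-9_\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/',
    ];
    $invalid = [];
    foreach ($rules as $field => $pattern) {
        // A missing field is treated as an empty string, which fails every rule.
        $value = isset($input[$field]) ? (string) $input[$field] : '';
        if (preg_match($pattern, $value) !== 1) {
            $invalid[] = $field;
        }
    }
    return $invalid;
}

// Made-up sample submissions.
$ok  = findInvalidFields(['id' => 'user_01', 'pwd' => '123456', 'phone' => '13800000000', 'email' => 'user@example.com']);
$bad = findInvalidFields(['id' => 'user_01', 'pwd' => 'abc', 'phone' => '13800000000']);

print_r($ok);  // empty array: every field passed
print_r($bad); // pwd and email failed
```

Returning the list of failing fields, instead of a single yes or no, lets the page tell the user which input to correct.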
