4. Data Processing: PHP Regular Expressions (Zheng Achi, cont.)

Last Update:2016-07-21 Source: Internet

Author: User

1. Basic knowledge of regular expressions
Meaning: A string pattern consisting of ordinary characters (such as a-z) and some special characters.
Function: Validate data.
Replace text.
Extract a substring from a string.
Category: POSIX and Perl.
POSIX-style expressions are easier to master but cannot be used in binary mode; Perl-style expressions are relatively complex.
2. POSIX-style regular expressions
1. Writing regular expressions
Table 4.3 POSIX regular expression syntax

Characters	Description
\	An escape character used to escape a special character. For example, '.' matches any single character, while '\.' matches a period, '\-' matches the hyphen '-', and '\\' matches the backslash '\'.
^	Matches the start of the input string. For example, '^he' matches a string that begins with 'he'.
$	Matches the end of the input string. For example, 'ok$' matches a string that ends with 'ok'.
*	Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". '*' is equivalent to {0,}.
+	Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". '+' is equivalent to {1,}.
?	Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". '?' is equivalent to {0,1}.
{n}	n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in 'Bob', but matches the two 'o's in 'food'.
{n,}	n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in 'Bob', but matches all the 'o's in 'foooood'. 'o{1,}' is equivalent to 'o+', and 'o{0,}' is equivalent to 'o*'.
{n,m}	Both m and n are non-negative integers, where n≤m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three 'o's in 'fooooood'. 'o{0,1}' is equivalent to 'o?'. Note that there can be no spaces between the comma and the two numbers.
?	When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much as possible. For example, for the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the 'o's.
.	Matches any single character except "\n". To match any character including '\n', use a pattern such as '(.|\n)'.
(pattern)	Matches pattern and captures the match. The captured match is saved to the corresponding array. To match literal parentheses, use '\(' or '\)'.
(?:pattern)	Matches pattern but does not capture the match; that is, it is a non-capturing match and is not stored for later use. This is useful for combining parts of a pattern with the "or" character '|'. For example, 'industr(?:y|ies)' is a more concise expression than 'industry|industries'.
(?=pattern)	Positive lookahead. Matches the search string at any position where a string matching pattern begins. This is a non-capturing match, so the match is not stored for later use. For example, 'Windows (?=95|98|NT|2000)' matches 'Windows' in 'Windows 2000', but not 'Windows' in 'Windows 3.1'. A lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)	Negative lookahead. Matches the search string at any position where a string not matching pattern begins. This is a non-capturing match, so the match is not stored for later use. For example, 'Windows (?!95|98|NT|2000)' matches 'Windows' in 'Windows 3.1', but not 'Windows' in 'Windows 2000'. A lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y	Matches x or y. For example, 'z|food' matches 'z' or 'food'; '(z|f)ood' matches 'zood' or 'food'.
[xyz]	A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in 'plain'.
[^xyz]	A negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the 'p' in 'plain'.
[a-z]	A character range. Matches any character within the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' to 'z'.
[^a-z]	A negated character range. Matches any character that is not in the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'.
Note: the non-greedy '?', '(?:pattern)', '(?=pattern)' and '(?!pattern)' are Perl-compatible extensions. The POSIX functions (ereg() and related) do not support them, so use them with the preg_ functions.
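A few rows of the table can be checked with a short script. This is only a sketch: it uses preg_match() because the non-greedy and lookahead rows need PHP's Perl-compatible engine, and the sample strings and variable names are made up for illustration.

```php
<?php
// Greedy vs. non-greedy: 'o+' takes every 'o', 'o+?' takes as few as possible.
preg_match('/o+/', 'oooo', $greedy);
preg_match('/o+?/', 'oooo', $lazy);
echo $greedy[0], "\n"; // oooo
echo $lazy[0], "\n";   // o

// Non-capturing group: the alternative is matched but not stored,
// so $m holds only the whole match.
$plural = preg_match('/^industr(?:y|ies)$/', 'industries', $m);
echo count($m), "\n"; // 1

// Positive lookahead: 'Windows ' matches only when a listed version follows,
// and the version itself is not part of the match.
$lookahead = '/Windows (?=95|98|NT|2000)/';
$ok  = preg_match($lookahead, 'Windows 2000', $w);
$bad = preg_match($lookahead, 'Windows 3.1');
echo $ok, ' ', $bad, "\n"; // 1 0
```

Note that $w[0] is 'Windows ' (with the trailing space) and does not include '2000', which is what "does not consume characters" means in practice.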

Here are some examples of simple regular expressions:
'[a-zA-Z0-9]': Matches any uppercase letter, lowercase letter, or digit from 0 to 9.
'^hello': Matches a string starting with hello.
'world$': Matches a string ending in world.
'.at': Matches any single character except "\n" followed by "at", such as "cat", "nat", and so on.
'^[a-zA-Z]': Matches a string that begins with a letter.
'hi{2}': Matches the letter h followed by two i's, that is, hii.
'(go)+': Matches a string that contains at least one 'go', such as 'gogo'.
An ID number usually consists of 18 digits, or 17 digits followed by the letter X or Y. To match an ID number, you can write:
^[0-9]{17}([0-9]|X|Y)$
The regular expression for an email address can be written:
^[a-zA-Z0-9\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$
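The ID-number and email patterns can be tried out as follows. This is a minimal sketch that wraps them in '/' delimiters for preg_match(); the sample inputs are hypothetical and chosen only to show one match and one non-match for each pattern.

```php
<?php
// The two patterns from the text, wrapped in PCRE delimiters.
$idPattern    = '/^[0-9]{17}([0-9]|X|Y)$/';
$emailPattern = '/^[a-zA-Z0-9\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/';

// Made-up sample inputs: [input, pattern to test it against].
$samples = [
    ['12345678901234567X', $idPattern],    // 17 digits followed by X
    ['1234567890',         $idPattern],    // too short
    ['user@example.com',   $emailPattern],
    ['user@example',       $emailPattern], // no dot after the host name
];

$results = [];
foreach ($samples as [$input, $pattern]) {
    $matched = preg_match($pattern, $input) === 1;
    $results[] = $matched;
    echo $input, ': ', $matched ? 'match' : 'no match', "\n";
}
```

The email pattern is deliberately simple: it does not accept a dot or underscore before the '@', for example, so it is a format check rather than a full address validator.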
2. Matching of strings
ereg() and eregi() functions
You can use the ereg() function to search a string for a match to a pattern. It returns the length of the matched string and, when the optional third parameter is supplied, stores the matched substrings in an array. The eregi() function works the same way but ignores case. The syntax format is as follows:
int ereg(string $pattern, string $string [, array $regs])
The code is as follows:
<?php
/* This example checks whether the string is an ISO-format date (YYYY-MM-DD) */
$date = "1988-08-09";
$len = ereg('([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})', $date, $regs); // date format is YYYY-MM-DD
if ($len)
{
    echo "$regs[3].$regs[2].$regs[1]" . "<br>"; // Output "09.08.1988"
    echo $regs[0] . "<br>"; // Output "1988-08-09"
    echo $len; // Output 10
}
else
{
    echo "Wrong date format: $date";
}
?>
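The ereg family was deprecated in PHP 5.3 and removed in PHP 7.0, so the listing above only runs on older interpreters. As a sketch of how the same check might look today, the version below uses preg_match(); the variable names $found and $reordered are introduced here for illustration.

```php
<?php
// Same ISO-date check, written with preg_match() instead of ereg().
$date = "1988-08-09";
$found = preg_match('/([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})/', $date, $regs);
if ($found) {
    $reordered = "$regs[3].$regs[2].$regs[1]";
    echo $reordered, "\n";       // 09.08.1988
    echo $regs[0], "\n";         // 1988-08-09
    echo strlen($regs[0]), "\n"; // 10, the length ereg() would have returned
} else {
    echo "Wrong date format: $date\n";
}
```

Unlike ereg(), preg_match() returns 1 or 0 rather than the match length, so the length has to be taken from $regs[0] when it is needed.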

3. Substitution of strings
The syntax format for the ereg_replace() function is as follows:
string ereg_replace(string $pattern, string $replacement, string $string)
Description: The function replaces the parts of $string that match $pattern with $replacement and returns the resulting string. If no match is found, $string is returned unchanged.
The code is as follows:
<?php
$str = "Hello World";
echo ereg_replace('[aeo]', 'x', $str) . "<br>"; // Output 'Hxllx Wxrld'
$res = '<a href="#">Hello</a>'; // the original link target is not shown; "#" is a placeholder
echo ereg_replace('Hello', $res, $str); // Replace 'Hello' with a hyperlink
?>
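Because ereg_replace() is no longer available in PHP 7 and later, the same two replacements can be sketched with preg_replace(). The "#" link target below is a placeholder, not something taken from the article.

```php
<?php
$str = "Hello World";

// Replace every 'a', 'e' or 'o' with 'x'.
$masked = preg_replace('/[aeo]/', 'x', $str);
echo $masked, "\n"; // Hxllx Wxrld

// Wrap 'Hello' in a link (placeholder target).
$linked = preg_replace('/Hello/', '<a href="#">Hello</a>', $str);
echo $linked, "\n"; // <a href="#">Hello</a> World
```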

4. Splitting a string into an array

You can use the split() function to do the same job as the explode() function, except that it splits the string according to a regular expression. It returns an array. The syntax format is as follows:

array split(string $pattern, string $string [, int $limit])
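split() was removed in PHP 7.0 together with the other POSIX regex functions. A rough equivalent with preg_split() might look like this; the sample strings are invented for the example.

```php
<?php
// A date whose parts may be separated by '/', '.' or '-'.
$date = "04/30/1973";
list($month, $day, $year) = preg_split('#[/.-]#', $date);
echo "Month: $month; Day: $day; Year: $year\n";

// The optional limit caps the number of pieces returned;
// the last piece keeps the rest of the string.
$parts = preg_split('/:/', 'a:b:c:d', 2);
print_r($parts);
```

The '#' delimiter is used in the first pattern so that the '/' inside the character class does not need escaping.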

5. Generating Regular Expressions

3. Perl-compatible regular expressions

1. Writing Regular Expressions

Table 4.4 Syntax format for Perl-compatible regular expression extensions

Characters	Description
\b	Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in 'never', but not the 'er' in 'verb'.
\B	Matches a non-word boundary. 'er\B' matches the 'er' in 'verb', but not the 'er' in 'never'.
\cx	Matches the control character indicated by x. For example, '\cM' matches Control-M, or carriage return. The value of x must be in the range A-Z or a-z. Otherwise, 'c' is treated as a literal 'c' character.
\d	Matches a digit. Equivalent to '[0-9]'.
\D	Matches a non-digit character. Equivalent to '[^0-9]'.
\f	Matches a form feed. Equivalent to '\x0c' and '\cL'.
\n	Matches a newline. Equivalent to '\x0a' and '\cJ'.
\r	Matches a carriage return. Equivalent to '\x0d' and '\cM'.
\s	Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to '[ \f\n\r\t\v]'.
\S	Matches any non-whitespace character. Equivalent to '[^ \f\n\r\t\v]'.
\t	Matches a tab character. Equivalent to '\x09' and '\cI'.
\v	Matches a vertical tab. Equivalent to '\x0b' and '\cK'.
\w	Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W	Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn	Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches 'A'. '\x041' is equivalent to '\x04' followed by '1'. This allows ASCII codes to be used in regular expressions.
\num	Matches num, where num is a positive integer. A back reference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n	Identifies an octal escape value or a back reference. If \n is preceded by at least n captured subexpressions, n is a back reference. Otherwise, n is an octal escape value if n is an octal digit (0-7).
\nm	Identifies an octal escape value or a back reference. If \nm is preceded by at least nm captured subexpressions, nm is a back reference. If \nm is preceded by at least n captures, n is a back reference followed by the literal m. If neither condition holds and both n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml	If n is an octal digit (0-3) and both m and l are octal digits (0-7), matches the octal escape value nml.
\un	Matches n, where n is a Unicode character represented by four hexadecimal digits. For example, '\u00A9' matches the copyright symbol (©). PCRE itself does not accept '\u'; with the preg_ functions, write '\x{00A9}' and add the u modifier instead.
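Several of these escapes can be exercised directly with the preg_ functions. The snippet below is an illustrative sketch; the test strings and variable names are made up.

```php
<?php
// \b and \B: 'er' at the end of a word vs. inside a word.
$endNever = preg_match('/er\b/', 'never'); // 1: 'er' ends the word
$endVerb  = preg_match('/er\b/', 'verb');  // 0: 'er' is followed by 'b'
$midVerb  = preg_match('/er\B/', 'verb');  // 1
$midNever = preg_match('/er\B/', 'never'); // 0
echo "$endNever $endVerb $midVerb $midNever\n"; // 1 0 1 0

// \d and \w: collect runs of digits and runs of word characters.
preg_match_all('/\d+/', 'Room 12, floor 3', $digits);
preg_match_all('/\w+/', 'snake_case, two words', $words);
echo implode(',', $digits[0]), "\n"; // 12,3
echo implode(',', $words[0]), "\n";  // snake_case,two,words

// \1 back reference: the first pair of identical consecutive characters.
preg_match('/(.)\1/', 'bookkeeper', $double);
echo $double[0], "\n"; // oo

// \x41 is the hexadecimal escape for 'A'.
$hex = preg_match('/\x41/', 'A');
echo $hex, "\n"; // 1
```

The patterns are written in single-quoted PHP strings so that the backslashes reach the regex engine unchanged.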

2. String matching
The preg_match() function searches a string, using the following syntax format:
int preg_match(string $pattern, string $subject [, array $matches [, int $flags]])
Description: The function is structured like the ereg() function and searches the $subject string for content that matches the regular expression given by $pattern.
The preg_match() function returns the number of times $pattern matches. This is either 0 (no match) or 1, because preg_match() stops searching after the first match.
The related function preg_match_all() continues the search from the end of the first match until the entire string has been searched.
The $flags parameter of the preg_match_all() function can take the following three values:
PREG_PATTERN_ORDER. The default. $matches[0] is an array of all full pattern matches, $matches[1] is an array of the strings that match the first parenthesized sub-pattern, and so on.
PREG_SET_ORDER. If this flag is set, $matches[0] is an array of the first set of matches, $matches[1] is an array of the second set of matches, and so on.
PREG_OFFSET_CAPTURE. This flag can be combined with either of the other two flags. If it is set, each match is returned together with its offset in the subject string.
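The difference between the flags is easiest to see on a small input. The following sketch uses a made-up subject string and pattern with two sub-patterns; the comments show the resulting array shapes.

```php
<?php
$subject = 'a1 b2 c3';
$pattern = '/([a-z])(\d)/';

// preg_match() stops at the first match and returns 1 (or 0 if there is none).
$count = preg_match($pattern, $subject, $first);
// $first is ['a1', 'a', '1']

// PREG_PATTERN_ORDER (the default): results are grouped by sub-pattern.
$total = preg_match_all($pattern, $subject, $byPattern, PREG_PATTERN_ORDER);
// $byPattern[0] is ['a1', 'b2', 'c3'], $byPattern[1] is ['a', 'b', 'c']

// PREG_SET_ORDER: results are grouped by occurrence.
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);
// $bySet[0] is ['a1', 'a', '1'], $bySet[1] is ['b2', 'b', '2']

// PREG_OFFSET_CAPTURE combined with PREG_SET_ORDER:
// every entry becomes a [text, offset] pair.
preg_match_all($pattern, $subject, $withOffsets, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
// $withOffsets[1][0] is ['b2', 3]

echo $count, ' ', $total, "\n"; // 1 3
```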
3. Substitution of strings
The preg_replace() function does the same job as the ereg_replace() function: it finds the matching substrings in a string and replaces them with the specified string.
The syntax format is as follows:
mixed preg_replace(mixed $pattern, mixed $replacement, mixed $subject [, int $limit])
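A short sketch of preg_replace() in use follows. It shows the optional $limit argument and the fact that "mixed" allows arrays for $pattern and $replacement; it also uses $1-style references to captured groups in the replacement, a standard preg_replace() feature that the text above does not cover. All sample strings are invented.

```php
<?php
// Reorder a date using captured groups ($1, $2, $3) in the replacement.
$iso = preg_replace('/(\d{2})\.(\d{2})\.(\d{4})/', '$3-$2-$1', '09.08.1988');
echo $iso, "\n"; // 1988-08-09

// $limit caps the number of replacements; only the first 'o' is changed here.
$once = preg_replace('/o/', '0', 'foo boo', 1);
echo $once, "\n"; // f0o boo

// Arrays of patterns and replacements are applied one after another:
// first collapse runs of whitespace, then mask the vowels.
$multi = preg_replace(['/\s+/', '/[aeiou]/'], [' ', '*'], 'a  big   cat');
echo $multi, "\n"; // * b*g c*t
```

Because array patterns are applied in sequence, a later pattern also sees text produced by an earlier replacement, which is worth keeping in mind when the patterns overlap.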
4. Splitting strings
The preg_split() function uses a regular expression as the boundary for splitting a string and returns the substrings in an array, similar to the split() function.
The syntax format is as follows:
array preg_split(string $pattern, string $subject [, int $limit [, int $flags]])
Description: Matching is case-sensitive. The function returns an array containing the substrings of $subject split along the boundaries matched by $pattern.
$limit is an optional parameter. If it is specified, at most $limit substrings are returned; if it is omitted or set to -1, there is no limit.
The $flags value can be any of the following three:
PREG_SPLIT_NO_EMPTY. If this flag is set, the function returns only non-empty strings.
PREG_SPLIT_DELIM_CAPTURE. If this flag is set, text matched by a parenthesized expression in the delimiter pattern is also captured and returned.
PREG_SPLIT_OFFSET_CAPTURE. If this flag is set, each returned substring comes with its offset in the subject string.
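Each flag changes the shape of the returned array slightly. The sketch below runs preg_split() once per flag on small made-up inputs.

```php
<?php
// PREG_SPLIT_NO_EMPTY: the empty piece between the two commas is dropped.
$words = preg_split('/\s*,\s*/', 'a, b,,c', -1, PREG_SPLIT_NO_EMPTY);
print_r($words); // a, b, c

// PREG_SPLIT_DELIM_CAPTURE: the parenthesized delimiter is returned as well.
$tokens = preg_split('/([+-])/', '1+2-3', -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($tokens); // 1, +, 2, -, 3

// PREG_SPLIT_OFFSET_CAPTURE: each piece becomes a [text, offset] pair.
$pieces = preg_split('/ /', 'ab cd', -1, PREG_SPLIT_OFFSET_CAPTURE);
print_r($pieces); // [ab, 0], [cd, 3]

// A positive $limit stops splitting early; the last piece keeps the remainder.
$limited = preg_split('/,/', 'a,b,c', 2);
print_r($limited); // a, "b,c"
```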
4.3 Example: validating form contents
Example 4.4 uses a regular expression to verify that the form content entered by the user satisfies the formatting requirements.
Create a new ex4_4_hpage.php file that contains the form. Judging from the processing script, the form submits four fields by POST: id, pwd, phone and Email.

Create a new ex4_4_ppage.php file and enter the following code:
The code is as follows:
<?php
include 'ex4_4_hpage.php'; // include the file ex4_4_hpage.php
$id = $_POST['id'];
$pwd = $_POST['pwd'];
$phone = $_POST['phone'];
$Email = $_POST['Email'];
$checkid = preg_match('/^\w{1,10}$/', $id); // check that the ID is 1 to 10 word characters
$checkpwd = preg_match('/^\d{4,14}$/', $pwd); // check that the password is 4 to 14 digits
$checkphone = preg_match('/^1\d{10}$/', $phone); // check that it is an 11-digit number starting with 1
// check that the email address is well formed
$checkEmail = preg_match('/^[a-zA-Z0-9_\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/', $Email);
if ($checkid && $checkpwd && $checkphone && $checkEmail) // if all four are 1, registration succeeds
    echo "Registration is successful!";
else
    echo "Failed to register, the format is not correct";
?>
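The four checks can also be gathered into one function so that they can be tried without a form. The sketch below is one possible arrangement, not the article's code: findInvalidFields() is a hypothetical helper, and the two arrays are made-up sample data standing in for $_POST.

```php
<?php
/**
 * Hypothetical helper: applies the four patterns from the script above
 * and returns the names of the fields that fail their check.
 */
function findInvalidFields(array $input): array
{
    $rules = [
        'id'    => '/^\w{1,10}$/',
        'pwd'   => '/^\d{4,14}$/',
        'phone' => '/^1\d{10}$/',
        'Email' => '/^[a-zA-Z0-9_\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-\.]+$/',
    ];
    $invalid = [];
    foreach ($rules as $field => $pattern) {
        $value = (string) ($input[$field] ?? ''); // a missing field counts as empty
        if (preg_match($pattern, $value) !== 1) {
            $invalid[] = $field;
        }
    }
    return $invalid;
}

// Made-up sample data standing in for $_POST.
$good = ['id' => 'zheng_01', 'pwd' => '123456', 'phone' => '13800000000', 'Email' => 'user@example.com'];
$bad  = ['id' => 'zheng_01', 'pwd' => 'abc', 'phone' => '2380000000', 'Email' => 'user@example'];

echo findInvalidFields($good) ? 'Failed to register' : 'Registration is successful!', "\n";
echo implode(', ', findInvalidFields($bad)), "\n"; // pwd, phone, Email
```

Returning the list of failing fields, rather than a single true or false, makes it possible to tell the user which input needs correcting.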
