Notes from studying regular expressions in PHP: they are really not that difficult

Source: Internet
Author: User
As the title says: regular expressions can solve a lot of problems, so let's learn them together.

I. Regular expressions
1. Matching characters
1) Start anchor "^": for example, ^0754 matches only strings that begin with 0754.
2) End anchor "$": for example, 0754$ matches only strings that end with 0754.
3) Whole-string match: combine ^ and $. For example, ^0754$ matches only the string 0754.
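The three anchor forms can be tried out with a minimal sketch like the one below. It assumes preg_match() with "/" delimiters, which these notes have not introduced at this point, and the helper name anchorReport and the sample strings are made up for illustration.

```php
<?php
// Report which of the three anchored patterns match a subject string.
function anchorReport(string $subject): array
{
    return [
        'starts' => preg_match('/^0754/', $subject) === 1,  // begins with 0754
        'ends'   => preg_match('/0754$/', $subject) === 1,  // ends with 0754
        'whole'  => preg_match('/^0754$/', $subject) === 1, // is exactly 0754
    ];
}

// Hypothetical sample subjects.
foreach (['0754123', '1230754', '0754'] as $subject) {
    echo $subject, ': ', json_encode(anchorReport($subject)), "\n";
}
```

Only the string 0754 itself satisfies all three patterns.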
2. Escape characters
1) Whitespace characters:
Line feed \n
Carriage return \r
Tab \t
2) Other characters:
"$" \$
"^" \^
"+" \+
"/" \/
3. Quantifiers (wildcard characters)
1) Asterisk "*": matches the preceding character zero or more times.
Example 1: 'abc*' matches every string that contains ab (the c may appear any number of times, including zero).
2) Plus sign "+": matches the preceding character one or more times.
Example 2: 'abc+' matches every string that contains abc.
3) Question mark "?": matches the preceding character zero times or one time.
Example 3: 'abc?' matches every string that contains ab, optionally followed by one c, such as abca, aabc and aaab. Without anchors it also matches abcc, because the match simply stops after abc; to reject abcc the pattern has to be anchored, as in '^abc?$', which matches only ab and abc.
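As a rough check of the three quantifiers, the sketch below runs each pattern against a few sample strings. preg_match() and the "/" delimiters are assumptions (the notes have not picked a function yet), the subjects are invented, and the anchored pattern '/^abc?$/' is added only to show one way of rejecting abcc.

```php
<?php
$patterns = ['/abc*/', '/abc+/', '/abc?/', '/^abc?$/'];
$subjects = ['ab', 'abc', 'abcc', 'xaby']; // hypothetical test strings

// $matched[pattern][subject] is true when the pattern matches the subject.
$matched = [];
foreach ($patterns as $pattern) {
    foreach ($subjects as $subject) {
        $matched[$pattern][$subject] = preg_match($pattern, $subject) === 1;
    }
}

foreach ($matched as $pattern => $row) {
    echo $pattern, ' => ', json_encode($row), "\n";
}
```

The unanchored patterns only need a matching substring, which is why '/abc?/' still accepts abcc.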
4. The escape sequence \$ and double vs. single quotes (PHP 4 environment)
1) A regular expression is itself a string.
2) When the quoted text contains $, double and single quotes behave differently:
(1) With single quotes, the interpreter stores every character between the quotes, including $, in the string unchanged.
(2) With double quotes, the interpreter treats "$" plus the legal variable-name characters that follow it (letters, digits, underscores) as a variable and substitutes its value. The variable name ends at the first illegal character; that character and those after it are stored as ordinary text until the next "$" is encountered.
(3) Note: a lone $ at the very end of a double-quoted string has no characters after it, so the interpreter does not treat it as a variable and no backslash is needed. Relying on this is not recommended, though.
(4) If the text to be matched contains a literal $, the pattern generally cannot be defined with double quotes, because \$ means something different in single and double quotes:
<1> In double quotes, PHP turns \$ into the single character "$" before the regular expression engine sees the string, so \$ and a lone $ (one that cannot start a variable name) are completely equivalent: "c\$$", "c\$\$" and "c$$" all produce the string c$$, and echo "c\$$" prints c$$. The engine then reads every one of those $ characters as an end anchor. With a single backslash there is therefore no way to write a literal, non-anchor "$" in double quotes, which is why a pattern that needs to match $ is in most cases defined with single quotes.
<2> In single quotes, \$ stays as two characters whether or not legal variable-name characters follow it: echo 'c\$$' prints c\$$. Outside a regular expression the backslash has no special meaning, but when the string is used as a pattern, \$ stands for the literal character "$" and a lone $ is the end anchor.
3) The end anchor "$" is the same character that introduces a variable, so take care when defining patterns:
Example 1: to define the regular expression ^ab$, write $pattern = "^ab\$"; in double quotes the escape \$ produces the character $, so the resulting string is ^ab$.
Example 2: writing $pattern = "^ab$" for the same expression is not recommended, but because the $ is the last character and nothing follows it, it still works.
Example 3: a regular expression that matches strings ending with the two characters c$: $pattern = 'c\$$';
Example 4: with double quotes, $pattern = "c\$$"; yields the string c$$, so the regular expression treats both $ characters as end anchors and matches only strings that end with c.
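The difference between Example 3 and Example 4 can be observed directly. The sketch below is only an illustration: it wraps each string in "/" delimiters and uses preg_match(), both of which are assumptions here (the section itself was written for the PHP 4 POSIX functions), and the subject strings are invented.

```php
<?php
$single = 'c\$$'; // stored as c\$$ : the regex sees c, a literal $, then the end anchor
$double = "c\$$"; // stored as c$$  : the regex sees c followed by two end anchors

// Lengths show that the backslash survives only in the single-quoted form.
$singleLength = strlen($single);
$doubleLength = strlen($double);

$singleMatchesDollar = preg_match('/' . $single . '/', 'abc$') === 1;
$doubleMatchesDollar = preg_match('/' . $double . '/', 'abc$') === 1;
$doubleMatchesC      = preg_match('/' . $double . '/', 'abc') === 1;

var_dump($single, $double, $singleMatchesDollar, $doubleMatchesDollar, $doubleMatchesC);
```

Only the single-quoted pattern matches a subject that really ends in c$; the double-quoted one behaves like c$ and matches a subject ending in c.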
5. "[]" square brackets (character classes)
1) [] matches a single character. A ^ at the start of the brackets negates the class, so none of the characters listed after it are matched.
Example 1: [a-zA-Z0-9] matches any uppercase letter, lowercase letter or digit.
Example 2: [\n\t\r\f] matches any of the listed whitespace characters.
Example 3: [^A-Z] matches any character that is not an uppercase letter.
Example 4: ^[^0-9] matches a string whose first character is not a digit.
2) The special character "." (period) matches any character except a newline. The pattern ^.abc$ matches any single character followed by abc, but it does not match abc by itself. The pattern "." matches any string except the empty string and a string that contains only a newline character.
Example 1: '^.abc$' matches four-character strings made of any character followed by abc. The leading character cannot be a newline, and abc alone does not match.
Example 2: '.' matches every non-empty string, but not the empty string.
Example 3: '.abc' matches every string in which abc is preceded by some character (a period, a digit and so on all qualify). A string in which nothing precedes abc, such as abc itself, does not match.
Example 4: '.abc$' matches every string that ends with abc preceded by some character (again, a period, a digit and so on); it does not match abc by itself.
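A few of the bracket and period examples, sketched with preg_match(). The function choice, the "/" delimiters and all subject strings are assumptions made for illustration.

```php
<?php
// Character classes.
$alnumOnly   = preg_match('/^[a-zA-Z0-9]+$/', 'Abc123') === 1; // letters and digits only
$startsOther = preg_match('/^[^0-9]/', 'x9') === 1;            // first character is not a digit
$startsDigit = preg_match('/^[^0-9]/', '9x') === 1;            // first character is a digit

// The period needs one real character in front of abc.
$dotBeforeAbc = preg_match('/^.abc$/', 'xabc') === 1;
$bareAbc      = preg_match('/^.abc$/', 'abc') === 1;

// A lone period rejects the empty string and a string holding only a newline.
$dotOnEmpty   = preg_match('/./', '') === 1;
$dotOnNewline = preg_match('/./', "\n") === 1;

var_dump($alnumOnly, $startsOther, $startsDigit, $dotBeforeAbc, $bareAbc, $dotOnEmpty, $dotOnNewline);
```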
3) PHP provides built-in general-purpose character classes:
[[:alpha:]] any letter
[[:digit:]] any digit
[[:alnum:]] any letter or digit
[[:space:]] any whitespace character
[[:upper:]] any uppercase letter
[[:lower:]] any lowercase letter
[[:punct:]] any punctuation character
[[:xdigit:]] any hexadecimal digit
[[:cntrl:]] any control character (ASCII value less than 32, or 127)
Note: used without anchors, these character classes match as soon as the string contains at least one character of the class, no matter what else the string is made of.
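The bracketed class names are also accepted by the Perl-compatible functions, so they can be tried with preg_match(). This is a small sketch under that assumption, with made-up subject strings; the first line illustrates the note about unanchored classes.

```php
<?php
// Unanchored: one digit anywhere in the string is enough.
$hasDigit = preg_match('/[[:digit:]]/', 'room 42') === 1;

// Anchored with + : every character must belong to the class.
$allUpper = preg_match('/^[[:upper:]]+$/', 'PHP') === 1;
$isHex    = preg_match('/^[[:xdigit:]]+$/', '1aF') === 1;
$notHex   = preg_match('/^[[:xdigit:]]+$/', '1aG') === 1;

// No punctuation character is present in this subject.
$hasPunct = preg_match('/[[:punct:]]/', 'hello') === 1;

var_dump($hasDigit, $allUpper, $isHex, $notHex, $hasPunct);
```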
6. "{}" curly braces usage
1) Square brackets match only one character; to match a repeated run, use {}, which sets how many times the preceding item occurs. {n} means exactly n times; {m,n} means between m and n times, inclusive; {n,} means n or more times.
Example 1: ^a{10}$ matches aaaaaaaaaa.
Example 2: [0-9]{1,}$ matches any string that ends with one or more digits.
2) The relationship between "{}" and the quantifiers:
? is equivalent to {0,1}: zero times or once
* is equivalent to {0,}: zero or more times
+ is equivalent to {1,}: one or more times
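The counts and the equivalences can be checked with a short sketch. preg_match(), the "/" delimiters and the sample strings are assumptions chosen for illustration.

```php
<?php
// {n}: exactly n repetitions.
$tenAs  = preg_match('/^a{10}$/', str_repeat('a', 10)) === 1;
$nineAs = preg_match('/^a{10}$/', str_repeat('a', 9)) === 1;

// {m,n}: between m and n repetitions, inclusive.
$threeInRange = preg_match('/^a{2,3}$/', 'aaa') === 1;
$fourInRange  = preg_match('/^a{2,3}$/', 'aaaa') === 1;

// {1,} and + give the same answer for each sample subject.
$sameAsPlus = true;
foreach (['2024', '', 'x1'] as $subject) {
    $braces = preg_match('/^[0-9]{1,}$/', $subject);
    $plus   = preg_match('/^[0-9]+$/', $subject);
    $sameAsPlus = $sameAsPlus && ($braces === $plus);
}

var_dump($tenAs, $nineAs, $threeInRange, $fourInRange, $sameAsPlus);
```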
7. "()" parentheses usage
A pattern enclosed in parentheses "()" is a subpattern, as in $pattern = '([1-9]{1}[0-9]{3})-([0-1]{1}[1-2]{1})-([0-3]{1}([0-9]|))'; Each pair of parentheses encloses one subpattern. The parentheses separate the subpatterns from one another so that each is matched without interfering with the others.
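To see what each subpattern captures, here is a sketch using a simplified date-like pattern. It is not the pattern from the text: the pattern, the sample date and the use of preg_match() with its third argument are all assumptions for illustration.

```php
<?php
// Three subpatterns: year, month, day (a simplified, hypothetical date pattern).
$pattern = '/^([1-9][0-9]{3})-([0-1][0-9])-([0-3][0-9])$/';

$found = preg_match($pattern, '2008-05-17', $parts);

// $parts[0] is the whole match; $parts[1..3] are the subpatterns, left to right.
print_r($parts);
```

Each pair of parentheses contributes one element to the result array, in the order in which the opening parentheses appear.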
II. POSIX-style regular expression functions
Note: the functions in this part (ereg, eregi, ereg_replace, eregi_replace, split, spliti) were deprecated in PHP 5.3.0 and removed in PHP 7.0.0; the descriptions apply to the PHP 4 environment used in these notes.
1. ereg and eregi
ereg(pattern, string, [array $regs]);
eregi(pattern, string, [array $regs]);
ereg() searches string for text that matches pattern. It returns a true value if a match is found and false if not. If the third argument $regs is given, the matched text is stored in $regs[0], and the text matched by each parenthesized subpattern is stored in the following elements: $regs[1] holds the match of the first subpattern, $regs[2] the second, and so on, counting from left to right. If no match is found, $regs is left unchanged.
Note: when a match is found, $regs always has exactly 10 elements, whether the pattern contains more or fewer than 9 subpatterns. This does not affect ereg()'s ability to match patterns with more subpatterns. ereg() always reports the first match. If there are subpatterns, their matches are stored in order until all 10 elements of $regs are filled or every subpattern has been stored; if there are fewer than 9 subpatterns, the remaining elements are set to empty values. In short, matching and filling $regs are separate matters, and $regs holds only 10 values.
eregi() is used in the same way as ereg(), except that it ignores case.
2. ereg_replace and eregi_replace
ereg_replace(pattern, string replacement, string)
eregi_replace(pattern, string replacement, string)
Text in string that matches pattern is replaced with replacement. If string contains matching text, the string after replacement is returned; if not, the original string is returned.
If pattern contains subpatterns, the text matched by a subpattern can be kept instead of being replaced.
Example 1: to keep the text matched by the second subpattern of pattern, write the replacement as replacement\\2. The text in string that matches pattern is then replaced with replacement followed by the text that the second subpattern matched. Using "\\0" keeps the entire matched text. This makes it possible to insert text after a specific string.
replacement should be a string. Passing an integer is a common pitfall: ereg_replace() interprets an integer as the ordinal value of a character rather than as its decimal text, so pass '4' rather than 4.
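Because the ereg family no longer exists in PHP 7 and later, the sketch below swaps in preg_replace(), which accepts the same \\n back-references, to show the idea of keeping a subpattern. The patterns and strings are invented for illustration.

```php
<?php
// Keep the second subpattern: "John Smith" becomes "Mr. Smith".
$kept = preg_replace('/(John) (Smith)/', 'Mr. \\2', 'Hello John Smith');

// \\0 keeps the whole match, so text can be inserted right after it.
$appended = preg_replace('/PHP/', '\\0 4', 'Notes on PHP');

echo $kept, "\n", $appended, "\n";
```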
3. split() and spliti()
split(pattern, string, [int limit]);
spliti(pattern, string, [int limit]);
split() divides string into parts, using the text matched by the regular expression pattern as the separator. On success it returns an array containing the parts; on failure it returns false. The optional limit sets the maximum number of parts. If limit is 5, then even if the separator occurs more than 5 times, string is divided into only 5 parts, and the last part is whatever remains of the string after the first four parts. The returned array then has exactly 5 elements.
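The effect of limit can be sketched with preg_split(), used here as a stand-in because split() was removed in PHP 7; its limit argument behaves the same way for this case. The comma-separated sample string is made up.

```php
<?php
// Six separators, but limit 5 caps the result at five parts.
$parts = preg_split('/,/', 'a,b,c,d,e,f,g', 5);

// The last element holds everything left after the first four parts.
print_r($parts);
```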
III. Perl-style regular expressions and related functions
1. Perl regular expression syntax
Perl-style patterns are enclosed in delimiters; "/", "!" and "{}" can all be used.
Example 1: /^[^0-9]/, !^[^0-9]! and {^[^0-9]} are all the same pattern.
Inside the pattern, the delimiter character itself is special and must be escaped. If the delimiter is "/" and the expression contains a "/" character, write it as "\/". A "/" inside "!" delimiters, or a "!" inside "/" delimiters, needs no escaping.
Example 2: /\/\/$/ and !//$! are the same.
Example 3: !^\!\![0-9]$! and /^!![0-9]$/ are the same.
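The equivalence of the delimiter styles can be confirmed with a sketch like this one; the subject strings are hypothetical.

```php
<?php
// Example 1: three delimiter styles around the same expression.
$slash = preg_match('/^[^0-9]/', 'x1');
$bang  = preg_match('!^[^0-9]!', 'x1');
$brace = preg_match('{^[^0-9]}', 'x1');

// Example 2: "/" must be escaped only when "/" is the delimiter.
$escapedSlash = preg_match('/\/\/$/', 'http://');
$plainSlash   = preg_match('!//$!', 'http://');

// Example 3: "!" must be escaped only when "!" is the delimiter.
$escapedBang = preg_match('!^\!\![0-9]$!', '!!7');
$plainBang   = preg_match('/^!![0-9]$/', '!!7');

var_dump($slash, $bang, $brace, $escapedSlash, $plainSlash, $escapedBang, $plainBang);
```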
2. Perl characters with special meaning
\a the alarm (bell) character, ASCII value 7
\b a word boundary
\A the start of the subject string, equivalent to the caret ("^")
\B a non-word boundary
\cN a control character
\d a single digit
\D a single non-digit
\s a single whitespace character
\S a single non-whitespace character
\w a single word character (letter, digit or underscore)
\W a single non-word character (not a letter, digit or underscore)
\z the end of the subject string
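A handful of these escapes in action, as a minimal sketch with invented subject strings.

```php
<?php
$wordCat   = preg_match('/\bcat\b/', 'a cat sat') === 1; // cat as a separate word
$insideCat = preg_match('/\bcat\b/', 'concat') === 1;    // cat inside another word
$allDigits = preg_match('/^\d+$/', '2024') === 1;
$noDigits  = preg_match('/^\D+$/', 'abc') === 1;
$hasSpace  = preg_match('/\s/', 'a b') === 1;
$wordChars = preg_match('/^\w+$/', 'user_1') === 1;      // letters, digits, underscore
$nonWord   = preg_match('/\W/', 'user_1') === 1;
$exact     = preg_match('/\Aphp\z/', 'php') === 1;       // anchored at both ends

var_dump($wordCat, $insideCat, $allDigits, $noDigits, $hasSpace, $wordChars, $nonWord, $exact);
```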
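3. Advanced features
1) The OR operator "|":
For example, !^ex|^em! matches strings that begin with ex or with em, and can also be written as !^e(x|m)!. Note that !^ex|em! is not the same: "|" has the lowest precedence, so the ^ applies only to ex, and em is matched anywhere in the string.
Note: the content inside () is a subpattern.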
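The precedence of "|" relative to "^" is easy to get wrong, so here is a small sketch that compares the grouped form with the ungrouped one. The subject words are made up.

```php
<?php
// Grouped: the anchor applies to both alternatives.
$example = preg_match('!^e(x|m)!', 'example') === 1;
$empty   = preg_match('!^e(x|m)!', 'empty') === 1;
$poem    = preg_match('!^e(x|m)!', 'poem') === 1;

// Ungrouped: this reads as (^ex) or (em), so em may appear anywhere.
$poemUngrouped = preg_match('!^ex|em!', 'poem') === 1;

var_dump($example, $empty, $poem, $poemUngrouped);
```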
2) Pattern modifiers following the closing delimiter
!regular expression!modifiers
A: the pattern matches only at the beginning of the subject string.
D: $ in the pattern matches only at the very end of the subject string. This modifier is ignored if the m modifier is set.
U: turns off greedy (longest) matching. Normally the search tries to find the longest matching text: the pattern /a+/ applied to "caaaaab" matches "aaaaa", but with this modifier, /a+/U matches only "a".
S: studies the pattern to speed up the search.
i: ignores case.
m: treats a subject that contains newline characters as multiple lines rather than one. "^" and "$" then also match at the start and end of each line.
s: makes the period "." match newline characters as well.
x: ignores unescaped whitespace in the pattern. This lets you add spaces to a regular expression to improve readability, but a space that is part of the expression must then be escaped.
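Several of the modifiers can be observed side by side. The sketch below reuses the "caaaaab" example from the text; the remaining subject strings are assumptions.

```php
<?php
// U: greedy versus ungreedy matching of the same subject.
preg_match('/a+/', 'caaaaab', $greedy);
preg_match('/a+/U', 'caaaaab', $lazy);

// i: case-insensitive.
$ignoreCase = preg_match('/php/i', 'PHP') === 1;

// m: ^ and $ match at every line of a multi-line subject.
$lineCount = preg_match_all('/^\w+$/m', "one\ntwo", $lines);

// s: the period also matches a newline.
$dotPlain  = preg_match('/a.b/', "a\nb") === 1;
$dotDotall = preg_match('/a.b/s', "a\nb") === 1;

// x: unescaped spaces in the pattern are ignored.
$extended = preg_match('/a b c/x', 'abc') === 1;

// D: $ no longer matches before a trailing newline.
$dollarPlain   = preg_match('/a$/', "a\n") === 1;
$dollarEndOnly = preg_match('/a$/D', "a\n") === 1;

var_dump($greedy[0], $lazy[0], $ignoreCase, $lineCount, $dotPlain, $dotDotall, $extended);
```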
3) Extended pattern syntax
(?#comment) adds a comment to improve the readability of the expression.
(?=pattern) requires that pattern follow at this position (positive lookahead).
(?!pattern) requires that pattern not follow at this position (negative lookahead).
(?n) sets the pattern modifier n inside the pattern instead of after the closing delimiter.
(?:) groups characters without capturing the matched text.
For example: echo ereg("?:^a$", "a"); // no output, presumably because the POSIX ereg() function does not understand this Perl-style syntax.
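With the Perl-compatible functions the extended constructs do work. The following is a minimal sketch using preg_match(); the patterns and subjects are invented for illustration.

```php
<?php
// (?=...) : "foo" must be followed by "bar", but "bar" is not part of the match.
preg_match('/foo(?=bar)/', 'foobar', $ahead);

// (?!...) : "foo" must not be followed by "bar".
$rejected = preg_match('/foo(?!bar)/', 'foobar') === 1;
$accepted = preg_match('/foo(?!bar)/', 'foobaz') === 1;

// (?:...) groups without capturing, so only (c) lands in $groups[1].
preg_match('/(?:ab)+(c)/', 'ababc', $groups);

// (?i) sets a modifier inside the pattern; (?#...) is a comment.
$inlineOption = preg_match('/(?i)php/', 'PHP') === 1;
$withComment  = preg_match('/a(?#letter b comes next)b/', 'ab') === 1;

var_dump($ahead, $rejected, $accepted, $groups, $inlineOption, $withComment);
```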
4. Perl-style regular expression functions
1) preg_grep function
preg_grep(pattern, array input);
The function searches the array input for strings that match pattern. The return value is an array of all the matching strings.
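A short sketch of preg_grep() on a made-up array. One detail worth noticing is that the returned array keeps the keys of the input array.

```php
<?php
$input = ['apple', '42', 'banana7', '2024']; // hypothetical data

// Keep only the elements that consist entirely of digits.
$numeric = preg_grep('/^\d+$/', $input);

print_r($numeric);
```

Wrapping the call in array_values() renumbers the result from 0 if the original keys are not needed.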
2) preg_match function
preg_match(pattern, string subject, [array matches])
The function searches subject for text that matches pattern. It returns 1 if a match is found and 0 otherwise. If the optional matches argument is given, the whole matched text is stored in the first element, $matches[0], and the text matched by each parenthesized subpattern is stored in order: the first in $matches[1], the second in $matches[2], and so on.
3) preg_match_all function
preg_match_all(pattern, subject, array matches, [int order])
The function searches subject for all non-overlapping text that matches pattern. It returns the number of matches found, or 0 if there are none. The matches are stored in the two-dimensional array matches: $matches[0] holds every full match, and the matches of the parenthesized subpatterns are stored in $matches[1] through $matches[n] in order. This layout is the default one, PREG_PATTERN_ORDER.
The order parameter is optional; its possible values are PREG_PATTERN_ORDER and PREG_SET_ORDER.
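The two order values arrange the same matches differently. This sketch uses an invented subject string to show both layouts.

```php
<?php
$subject = 'a=1, b=2'; // hypothetical input

// PREG_PATTERN_ORDER: one row per subpattern (row 0 holds the full matches).
$count = preg_match_all('/(\w)=(\d)/', $subject, $byPattern, PREG_PATTERN_ORDER);

// PREG_SET_ORDER: one row per match, each holding the full match and its subpatterns.
preg_match_all('/(\w)=(\d)/', $subject, $bySet, PREG_SET_ORDER);

print_r($byPattern);
print_r($bySet);
```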
4) preg_replace function
preg_replace(pattern, replacement, subject, [int limit])
The function replaces the parts of subject that match pattern with replacement. The return value has the same type as subject: it is the replaced value if a replacement was made, and the original value otherwise.
The parameters can be arrays or single values, which gives several cases:
<1> If subject is an array, the replacement is applied to each element of the array.
<2> If pattern is an array, the replacement is carried out for each pattern in it in turn.
<3> If both pattern and replacement are arrays, each pattern is replaced by the element at the same position in replacement.
<4> If replacement has fewer elements than pattern, the patterns without a counterpart are replaced by an empty string.
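The array cases listed above can be sketched as follows; all patterns and strings are made up for illustration.

```php
<?php
// <1> subject is an array: every element is processed.
$fromArray = preg_replace('/\d+/', '#', ['a1', 'b22']);

// <3> pattern and replacement are both arrays: paired by position.
$paired = preg_replace(['/cat/', '/dog/'], ['feline', 'canine'], 'cat and dog');

// <4> replacement is shorter: the unpaired pattern is replaced by an empty string.
$short = preg_replace(['/cat/', '/dog/'], ['feline'], 'cat and dog');

// limit: replace at most one occurrence.
$limited = preg_replace('/a/', 'X', 'aaa', 1);

var_dump($fromArray, $paired, $short, $limited);
```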
5) preg_split function
preg_split(pattern, subject, [int limit], [int flags])
The function splits subject into parts, using the text matched by pattern as the separator, and returns an array of the resulting strings. limit restricts the number of strings returned; -1 means no limit. flags is also optional, and the two values covered here are: PREG_SPLIT_NO_EMPTY makes the function omit empty strings from the result, and PREG_SPLIT_DELIM_CAPTURE makes it also return the text matched by parenthesized subpatterns in pattern.
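A minimal sketch of the two flags and of limit, with invented input strings.

```php
<?php
// Without flags, the empty string between the two commas is kept.
$plain = preg_split('/,/', 'a,,b', -1);

// PREG_SPLIT_NO_EMPTY drops empty pieces.
$noEmpty = preg_split('/,/', 'a,,b', -1, PREG_SPLIT_NO_EMPTY);

// PREG_SPLIT_DELIM_CAPTURE also returns what the parenthesized subpattern matched.
$withDelim = preg_split('/(-)/', 'x-y', -1, PREG_SPLIT_DELIM_CAPTURE);

// limit 2: the second element holds the rest of the string.
$limited = preg_split('/,/', 'a,b,c,d', 2);

var_dump($plain, $noEmpty, $withDelim, $limited);
```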
