Regular Expression 2-PHP source code

Source: Internet
Author: User

5. "[]" square brackets (character classes)
1) [] matches exactly one character from the set it encloses. A ^ at the start of the set negates it: the class then matches any one character that is not listed.
Example 1: [a-zA-Z0-9] matches any single uppercase letter, lowercase letter, or digit.
Example 2: [\n\t\r\f] matches any single whitespace control character (newline, tab, carriage return, or form feed).
Example 3: [^A-Z] matches any single character that is not an uppercase letter.
Example 4: ^[^0-9] matches a string whose first character is not a digit.
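The four examples above can be tried out with a short script. This is only a sketch: it assumes preg_match() with "/" delimiters as the test harness, and the sample strings and variable names are made up for illustration.

```php
<?php
// preg_match() returns 1 when the pattern matches and 0 when it does not.
$alnumHit   = preg_match('/[a-zA-Z0-9]/', 'x');   // a letter is in the set
$alnumMiss  = preg_match('/[a-zA-Z0-9]/', '_');   // underscore is not in the set
$blankHit   = preg_match('/[\n\t\r\f]/', "\t");   // tab is one of the listed characters
$notUpper   = preg_match('/[^A-Z]/', 'q');        // lowercase letter is "not uppercase"
$onlyUpper  = preg_match('/[^A-Z]/', 'Q');        // nothing outside A-Z to match

// ^[^0-9] looks only at the first character of the string.
$startsWithNonDigit = preg_match('/^[^0-9]/', 'abc123');
$startsWithDigit    = preg_match('/^[^0-9]/', '123abc');

var_dump($alnumHit, $alnumMiss, $blankHit, $notUpper, $onlyUpper,
         $startsWithNonDigit, $startsWithDigit);
```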
2) The special character "." (period) matches any single character except a newline. The pattern ^.abc$ matches one arbitrary character followed by abc, so it does not match abc on its own. The pattern "." alone matches any string except the empty string and a string that contains nothing but newline characters.
Example 1: '^.abc$' matches strings such as xabc: one character that is not a newline, then abc. It does not match abc itself.
Example 2: '.' matches any string that contains at least one character other than a newline. It does not match the empty string.
Example 3: '.abc' matches any string that contains abc preceded by a character other than a newline, so abc cannot be at the very start. It does not match abc itself.
Example 4: '.abc$' matches any string that ends with abc preceded by a character other than a newline. It does not match abc itself.
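A quick way to check the behaviour of "." is to run the patterns against a few sample strings. The sketch below assumes preg_match() with "/" delimiters; the subject strings and variable names are illustrative only.

```php
<?php
$oneThenAbc   = preg_match('/^.abc$/', 'xabc');   // one character, then abc
$abcAlone     = preg_match('/^.abc$/', 'abc');    // nothing left for "." to match
$emptyString  = preg_match('/./', '');            // "." needs one character
$newlineOnly  = preg_match('/./', "\n");          // "." skips newlines by default
$abcInside    = preg_match('/.abc/', 'xxabcxx');  // abc preceded by a character
$abcAtEnd     = preg_match('/.abc$/', 'xyzabc');  // abc at the end, preceded by a character

var_dump($oneThenAbc, $abcAlone, $emptyString, $newlineOnly, $abcInside, $abcAtEnd);
```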
3) PHP supports the POSIX named character classes:
[[:alpha:]] any letter
[[:digit:]] any digit
[[:alnum:]] any letter or digit
[[:space:]] any whitespace character
[[:upper:]] any uppercase letter
[[:lower:]] any lowercase letter
[[:punct:]] any punctuation character
[[:xdigit:]] any hexadecimal digit
[[:cntrl:]] any control character (ASCII value less than 32, or 127)
Note: without anchors, these classes match as soon as the string contains one such character anywhere, whatever the rest of the string looks like.
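The note about unanchored matching is easier to see side by side with an anchored pattern. The following sketch assumes preg_match(), which also accepts the POSIX named classes inside brackets; the subject strings and variable names are invented for the example.

```php
<?php
// Unanchored: succeeds because a digit occurs somewhere in the string.
$hasDigit = preg_match('/[[:digit:]]/', 'room 42');

// Anchored with ^ and $: every character must be a letter, so this fails.
$allAlpha = preg_match('/^[[:alpha:]]+$/', 'room 42');

// Anchored: the whole string consists of hexadecimal digits.
$hexOnly  = preg_match('/^[[:xdigit:]]+$/', 'DEADbeef');

// Unanchored: the comma counts as punctuation.
$hasPunct = preg_match('/[[:punct:]]/', 'Hello, world');

var_dump($hasDigit, $allAlpha, $hexOnly, $hasPunct);
```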
6. "{}" braces
1) Square brackets match only one character. To match repeated characters, use {}, which sets how many times the preceding item must occur. {n} means exactly n times; {m,n} means from m to n times, inclusive; {n,} means n times or more.
Example 1: ^a{10}$ matches exactly ten a characters: aaaaaaaaaa.
Example 2: [0-9]{1,}$ matches any string that ends with one or more digits.
2) Relationship between "{}" and the shorthand quantifiers
? is equivalent to {0,1}: zero times or once
* is equivalent to {0,}: zero or more times
+ is equivalent to {1,}: one or more times
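The counts and the three equivalences can be checked mechanically. This is a minimal sketch assuming preg_match(); the pattern pairs and subject strings are hypothetical test data, and the foreach destructuring needs PHP 7.1 or later.

```php
<?php
$exactTen   = preg_match('/^a{10}$/', str_repeat('a', 10));  // ten a's
$onlyNine   = preg_match('/^a{10}$/', str_repeat('a', 9));   // one too few
$endsDigits = preg_match('/[0-9]{1,}$/', 'order 2024');      // ends with digits

// Each shorthand quantifier paired with its {} form.
$pairs = [
    ['/^ab?c$/', '/^ab{0,1}c$/'],
    ['/^ab*c$/', '/^ab{0,}c$/'],
    ['/^ab+c$/', '/^ab{1,}c$/'],
];
$subjects = ['ac', 'abc', 'abbc'];

// Both forms of a pair should give the same result for every subject.
$allAgree = true;
foreach ($pairs as [$short, $long]) {
    foreach ($subjects as $s) {
        if (preg_match($short, $s) !== preg_match($long, $s)) {
            $allAgree = false;
        }
    }
}

var_dump($exactTen, $onlyNine, $endsDigits, $allAgree);
```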
7. "()" parentheses
A pattern enclosed in parentheses "()" is a subpattern, for example $pattern = '([1-9]{1}[0-9]{3})-([0-1]{1}[1-2]{1})-([0-3]{1}([0-9]| ))'; Each pair of parentheses marks off one subpattern. The subpatterns are matched as separate units, and the text each one matches is recorded separately without affecting the others.
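To see how each subpattern is recorded separately, the sketch below uses a simplified, hypothetical date pattern (not the one quoted above) together with preg_match(), which stores the text of each subpattern in its own array slot.

```php
<?php
// Three subpatterns: year, month, day. The pattern is a simplified stand-in.
$pattern = '/^([0-9]{4})-([0-9]{2})-([0-9]{2})$/';

$matched = preg_match($pattern, '2024-06-15', $parts);

// $parts[0] is the whole match; $parts[1..3] follow the opening
// parentheses from left to right.
[$whole, $year, $month, $day] = $parts;

var_dump($matched, $whole, $year, $month, $day);
```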
II. POSIX-style Regular Expression Functions
1. ereg and eregi
ereg(pattern, string, [array $regs]);
eregi(pattern, string, [array $regs]);
ereg() searches string for text that matches pattern. It returns a true value if a match is found and false otherwise. If the third parameter $regs is supplied, the full matched text is stored in $regs[0], and the texts matched by the parenthesized subpatterns are stored after it: $regs[1] holds the match of the first subpattern, $regs[2] the second, and so on, counting from left to right. If no match is found, $regs is left unchanged.
Note: when a match is found, ereg() changes only the first 10 elements of $regs, whether the pattern has more or fewer than 9 subpatterns. This does not affect whether the pattern as a whole matches. ereg() always completes the overall match first and returns false if it fails. If it succeeds, the subpattern matches are copied into $regs until 10 elements are filled or the subpatterns run out; when there are fewer than 10 values, the remaining elements are set to empty values. In short, matching and $regs are independent, and $regs holds only 10 values.
eregi() works the same way as ereg(), except that it is not case sensitive.
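The ereg family was removed in PHP 7, so the calls above cannot be run on current versions. The sketch below shows a roughly equivalent lookup with preg_match(), used here as a stand-in rather than as the original function; the pattern and subject strings are invented. One difference worth noticing is that preg_match() resets its match array to an empty array when nothing matches.

```php
<?php
// Two subpatterns: a name and a number. $regs plays the role of ereg()'s $regs.
$found = preg_match('/([a-z]+)=([0-9]+)/', 'width=640', $regs);

// No "=" in the subject, so nothing matches and $none becomes an empty array.
$missing = preg_match('/([a-z]+)=([0-9]+)/', 'no pairs here', $none);

// The i modifier gives the case-insensitive behaviour of eregi().
$caseInsensitive = preg_match('/^hello/i', 'HELLO world');

var_dump($found, $regs, $missing, $none, $caseInsensitive);
```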
2. ereg_replace and eregi_replace
ereg_replace(pattern, replacement, string)
eregi_replace(pattern, replacement, string)
Every piece of text in string that matches pattern is replaced with replacement. If string contains text that matches pattern, the modified string is returned; otherwise the original string is returned unchanged.
If pattern contains subpatterns, the text they matched can be carried over into the result instead of being replaced.
Example 1: to keep the text matched by the second subpattern, write replacement as replacement\2. The text that matches pattern is then replaced with replacement followed by the text matched by the second subpattern. "\0" stands for the entire matched text, which makes it possible to insert text after a specific string.
replacement should be passed as a string. A value of another type, such as an integer, may not be handled as expected, so convert it to a string first.
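Because ereg_replace() is no longer available from PHP 7 onward, the following sketch illustrates the same backreference idea with preg_replace(), which also accepts \0 and \2 in the replacement. The strings and variable names are made up for illustration.

```php
<?php
// \0 keeps the whole match, so text can be inserted right after it.
$inserted = preg_replace('/PHP/', '\0 (Hypertext Preprocessor)', 'I use PHP daily');

// \2 keeps only the text matched by the second subpattern.
$keptSecond = preg_replace('/([0-9]{4})-([0-9]{2})/', 'month \2', 'due 2024-06');

// No match: the original string comes back unchanged.
$unchanged = preg_replace('/xyz/', '-', 'abc');

var_dump($inserted, $keptSecond, $unchanged);
```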
3. split and spliti
split(pattern, string, [int limit]);
spliti(pattern, string, [int limit]);
split() divides string into parts, using text that matches the regular expression pattern as the separator. On success it returns an array of the separated parts; on failure it returns false. The optional limit sets the maximum number of parts. With a limit of 5, string is divided into at most five parts even if pattern matches more than four times: the first four parts are split off normally, and the fifth part is whatever remains of string. The returned array then has only five elements.
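The effect of limit is easiest to see on a concrete string. split() was removed in PHP 7, so this sketch assumes preg_split() as a substitute; its limit argument behaves the way the paragraph above describes. The input string is invented.

```php
<?php
$input = 'a,b;c,d;e,f';

// No limit: every comma or semicolon separates two parts.
$all = preg_split('/[,;]/', $input);

// Limit 5: four parts are split off, the fifth is the untouched remainder.
$limited = preg_split('/[,;]/', $input, 5);

var_dump($all, $limited);
```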
III. Perl-style Regular Expressions and related functions
1. Perl regular expression syntax
A Perl-style pattern is enclosed in delimiters, which can be "/", "!", or "{}".
Example 1: /^[^0-9]/, !^[^0-9]!, and {^[^0-9]} are all the same pattern.
Inside the pattern, the delimiter character is special and must be escaped. If the delimiter is "/" and the regular expression itself contains "/", write it as "\/". If the delimiter is "!", a "/" in the pattern needs no escaping.
Example 2: /\/\/$/ and !//$! are the same.
Example 3: !^\!\![0-9]$! and /^!![0-9]$/ are the same.
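The delimiter examples can be confirmed by running each spelling against the same subject. A minimal sketch with preg_match() follows; the subject strings are hypothetical.

```php
<?php
$url = 'http://';

// The same pattern (two slashes at the end) with three different delimiters.
$withSlash = preg_match('/\/\/$/', $url);  // "/" must be escaped inside /.../
$withBang  = preg_match('!//$!', $url);    // no escaping needed inside !...!
$withBrace = preg_match('{//$}', $url);    // no escaping needed inside {...}

// Two "!" followed by one digit, again with two delimiter choices.
$bangEscaped = preg_match('!^\!\![0-9]$!', '!!7');
$bangPlain   = preg_match('/^!![0-9]$/', '!!7');

var_dump($withSlash, $withBang, $withBrace, $bangEscaped, $bangPlain);
```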
2. Special characters in Perl-style patterns
\a alarm (bell) character, ASCII value 7
\b word boundary
\A start of the subject string (like the caret "^")
\B non-word boundary
\cn control character n
\d single digit
\D single non-digit
\s single whitespace character
\S single non-whitespace character
\w single word character (letter, digit, or underscore)
\W single non-word character (not a letter, digit, or underscore)
\Z end of the subject string (or just before a final newline)
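A few of these escapes are easier to understand from examples than from the table. The sketch below assumes preg_match() and preg_match_all(); all subject strings and variable names are illustrative.

```php
<?php
// \b requires a word boundary on both sides of "cat".
$wholeWord   = preg_match('/\bcat\b/', 'a cat sat');
$insideWord  = preg_match('/\bcat\b/', 'concatenate');
// \B requires the opposite: "cat" surrounded by word characters.
$nonBoundary = preg_match('/\Bcat\B/', 'concatenate');

// \d matches one digit at a time; preg_match_all() counts the matches.
$digitCount = preg_match_all('/\d/', 'a1b22', $digits);

// \w covers letters, digits, and the underscore.
$wordChars = preg_match('/^\w+$/', 'user_42');

// With the m modifier, ^ matches after a newline but \A still does not.
$caretMultiline = preg_match('/^ab/m', "x\nab");
$startOnly      = preg_match('/\Aab/m', "x\nab");

// \Z matches at the end, also just before a final newline.
$beforeFinalNewline = preg_match('/ab\Z/', "ab\n");

var_dump($wholeWord, $insideWord, $nonBoundary, $digitCount, $wordChars,
         $caretMultiline, $startOnly, $beforeFinalNewline);
```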
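3. Advanced features
1) The OR operator "|":
For example, !^ex|^em! matches a string that starts with ex or em. It can also be written !^e(x|m)!. Note that !^ex|em! is not the same: "|" has the lowest precedence, so the ^ applies only to ex, and em can then match anywhere in the string.
Note: the content inside () is a subpattern.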
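Alternation binds more loosely than "^", which matters when only one branch carries the anchor. The sketch below compares the ungrouped form !^ex|em! with the grouped form !^e(x|m)! using preg_match(); the subject words are made up for the test.

```php
<?php
$ungrouped = '!^ex|em!';     // "^ex" OR "em" anywhere in the string
$grouped   = '!^e(x|m)!';    // "e" at the start, followed by x or m

// "item" contains "em" but does not start with it.
$ungroupedItem = preg_match($ungrouped, 'item');
$groupedItem   = preg_match($grouped, 'item');

// Both forms agree on strings that really start with ex or em.
$groupedExit  = preg_match($grouped, 'exit');
$groupedEmpty = preg_match($grouped, 'empty');

var_dump($ungroupedItem, $groupedItem, $groupedExit, $groupedEmpty);
```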
2) Pattern modifiers after the closing delimiter
!regular expression!modifiers
A: the pattern is anchored, so it matches only at the start of the subject string.
D: a "$" in the pattern matches only at the very end of the subject string. This modifier is ignored if the m modifier is set.
U: turns off greedy matching. Normally a quantifier matches as much text as possible. For example, /a+/ matches "aaaaa" in the string "caaaaab", whereas /a+/U matches only "a".
S: studies the pattern to speed up repeated matching.
i: makes matching case-insensitive.
m: treats a subject that contains line breaks as multiple lines rather than one. "^" and "$" then also match at the start and end of each line.
s: lets the period "." match line breaks as well.
x: tells PHP to ignore unescaped whitespace in the pattern. This lets you space out a regular expression for readability, but a literal space in the pattern must then be escaped.
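The modifiers can be compared by running the same pattern with and without each one. This is a sketch using preg_match() and preg_match_all(); the subject strings are hypothetical.

```php
<?php
// U: greedy versus ungreedy matching on the string from the text.
preg_match('/a+/', 'caaaaab', $greedy);
preg_match('/a+/U', 'caaaaab', $ungreedy);

// i: case-insensitive matching.
$ignoreCase = preg_match('/php/i', 'PHP');

// m: ^ and $ match at every line, so each of the three lines counts.
$lineCount = preg_match_all('/^\d+$/m', "10\n20\n30");

// s: "." also matches a newline.
$dotPlain  = preg_match('/a.b/', "a\nb");
$dotAll    = preg_match('/a.b/s', "a\nb");

// x: unescaped spaces in the pattern are ignored.
$extended = preg_match('/ \d{4} - \d{2} /x', '2024-06');

// D: "$" no longer matches before a trailing newline.
$dollarDefault = preg_match('/ab$/', "ab\n");
$dollarEndOnly = preg_match('/ab$/D', "ab\n");

var_dump($greedy[0], $ungreedy[0], $ignoreCase, $lineCount,
         $dotPlain, $dotAll, $extended, $dollarDefault, $dollarEndOnly);
```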
3) Extended pattern syntax
(?#comment) adds a comment to make the regular expression more readable.
(?=pattern) lookahead: the text that follows must match pattern.
(?!pattern) negative lookahead: the text that follows must not match pattern.
(?n) sets the modifier n inside the pattern rather than after the closing delimiter.
(?:pattern) non-capturing group: characters are consumed, but the match is not captured.
Example: echo ereg("?:^a$", "a"); // No output
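These constructs belong to the Perl-style syntax, so they are shown here with preg_match() rather than ereg(). The sketch is illustrative only; the patterns and subject strings are invented.

```php
<?php
// (?=px): digits must be followed by "px", but "px" is not part of the match.
preg_match('/\d+(?=px)/', 'width: 120px', $lookahead);

// (?!bar): "foo" must not be followed by "bar".
$negAllowed = preg_match('/foo(?!bar)/', 'foobaz');
$negBlocked = preg_match('/foo(?!bar)/', 'foobar');

// (?:ab) groups without capturing, so only (c) gets a numbered slot.
preg_match('/(?:ab)+(c)/', 'ababc', $groups);

// (?i) switches on case-insensitive matching from inside the pattern.
$inlineOption = preg_match('/(?i)php/', 'PHP');

// (?#...) is skipped by the regex engine.
$withComment = preg_match('/a(?#this is ignored)b/', 'ab');

var_dump($lookahead[0], $negAllowed, $negBlocked, $groups, $inlineOption, $withComment);
```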
