Special characters in regular expressions in PHP

Source: Internet
Author: User

Character \
Meaning: For characters that are normally treated literally, the backslash indicates that the next character is special and is not to be interpreted literally.
For example, /b/ matches the character 'b'. Placing a backslash before the b (/\b/) makes it special: it now matches a word boundary.
Or:
For characters that are normally treated as special, the backslash indicates that the next character is not special and should be interpreted literally.
For example, * is a special character that means zero or more occurrences of the preceding character; /a*/ matches zero or more 'a' characters. To match a literal *, put a backslash before it: /a\*/ matches 'a*'.
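The two roles of the backslash can be checked with a short sketch. It uses Python's re module purely for illustration (an assumption; the article names no runtime for its examples), and the sample strings are made up for the demonstration.

```python
import re

# A bare b is a literal letter; \b is special and marks a word boundary.
literal_b = re.findall(r"b", "bob's tab")      # every letter b
boundary_b = re.findall(r"\bb", "bob's tab")   # only a b that starts a word

# A bare * is a quantifier; \* is a literal asterisk.
star_quantifier = re.search(r"a*", "aaa*").group()   # 'aaa'
star_literal = re.search(r"a\*", "aaa*").group()     # 'a*'

print(literal_b, boundary_b, star_quantifier, star_literal)
```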

Character ^
Meaning: Matches the beginning of the input.
For example, /^A/ does not match the 'A' in "an A", but matches the first 'A' in "An A".

Character $
Meaning: Like ^, but matches the end of the input.
For example, /t$/ does not match the 't' in "eater", but matches the 't' in "eat".
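A minimal sketch of the two anchors, written with Python's re module as an illustrative stand-in (an assumption, since the article shows no code of its own):

```python
import re

# ^ anchors the match at the start of the input.
starts_with_A = [bool(re.search(r"^A", s)) for s in ("an A", "An A")]

# $ anchors the match at the end of the input.
ends_with_t = [bool(re.search(r"t$", s)) for s in ("eater", "eat")]

print(starts_with_A)  # [False, True]
print(ends_with_t)    # [False, True]
```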

Character *
Meaning: Matches the preceding character zero or more times.
For example, /bo*/ matches 'boooo' in "A ghost booooed" and 'b' in "A bird warbled", but matches nothing in "A goat grunted".

Character +
Meaning: Matches the preceding character one or more times. Equivalent to {1,}.
For example, /a+/ matches the 'a' in "candy" and all the 'a's in "caaaaaaandy".

Character ?
Meaning: Matches the preceding character zero or one time.
For example, /e?le?/ matches 'el' in "angel" and 'le' in "angle".
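The three quantifiers above can be compared side by side. This is a sketch in Python's re module (chosen here for illustration only; the variable names are hypothetical):

```python
import re

# * : zero or more of the preceding character
star_many = re.search(r"bo*", "A ghost booooed").group()   # 'boooo'
star_zero = re.search(r"bo*", "A bird warbled").group()    # 'b'
star_none = re.search(r"bo*", "A goat grunted")            # None, there is no b

# + : one or more of the preceding character
long_word = "c" + "a" * 7 + "ndy"
plus_run = re.search(r"a+", long_word).group()             # all seven a's

# ? : zero or one of the preceding character
opt_angel = re.search(r"e?le?", "angel").group()           # 'el'
opt_angle = re.search(r"e?le?", "angle").group()           # 'le'
```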

Character .
Meaning: (The decimal point) matches any single character except a line break.
For example, /.n/ matches 'an' and 'on' in "nay, an apple is on the tree", but not 'nay'.
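As a rough check of the dot's behavior, the sketch below uses Python's re module (an illustrative assumption). The re.DOTALL flag shown at the end is Python's way of letting the dot match a line break too; it is not something the article describes.

```python
import re

# Each match is "any one character followed by n".
dot_pairs = re.findall(r".n", "nay, an apple is on the tree")

# By default the dot refuses to match a line break.
dot_default = re.search(r"a.c", "a\nc")
dot_all = re.search(r"a.c", "a\nc", re.DOTALL)

print(dot_pairs)  # ['an', 'on']
```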


Character (x)
Meaning: Matches 'x' and remembers the match.
For example, /(foo)/ matches and remembers 'foo' in "foo bar". The matched substrings can be retrieved from elements [1], ..., [n] of the result array, or from the properties of the RegExp object.
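A small sketch of remembered groups, using Python's re module for illustration (an assumption). In Python the remembered substrings are read with group(1) ... group(n), which plays the role of elements [1] ... [n] of the result array described above; the second pattern is a made-up example with two groups.

```python
import re

single = re.search(r"(foo)", "foo bar")
whole_match = single.group(0)   # the entire match
first_group = single.group(1)   # what the first parenthesis remembered

pair = re.search(r"(\w+) (\w+)", "foo bar")
both_groups = pair.groups()     # every remembered group, in order

print(whole_match, first_group, both_groups)
```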

Character x|y
Meaning: Matches either 'x' or 'y'.
For example, /green|red/ matches 'green' in "green apple" and 'red' in "red apple".
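Alternation can be sketched in a couple of lines. Python's re module is used here only as an illustrative assumption:

```python
import re

color = re.compile(r"green|red")

in_green = color.search("green apple").group()
in_red = color.search("red apple").group()
in_blue = color.search("blue apple")   # neither alternative occurs

print(in_green, in_red, in_blue)  # green red None
```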

Character {n}
Meaning: n is a positive integer. Matches exactly n occurrences of the preceding character.
For example, /a{2}/ does not match the 'a' in "candy", but matches both 'a's in "caandy" and the first two 'a's in "caaandy".

Character {n,}
Meaning: n is a positive integer. Matches at least n occurrences of the preceding character.
For example, /a{2,}/ does not match the 'a' in "candy", but matches all the 'a's in "caandy" and in "caaaaaaandy".

Character {n,m}
Meaning: n and m are positive integers. Matches at least n and at most m occurrences of the preceding character.
For example, /a{1,3}/ matches nothing in "cndy", the 'a' in "candy", the two 'a's in "caandy", and the first three 'a's in "caaaaaaandy". Note that even though "caaaaaaandy" contains more 'a's, the match is only the first three, that is, "aaa".
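The counted quantifiers can be exercised as follows. This is a sketch in Python's re module (an illustrative assumption), and the bounds {1,3} in the last pattern are chosen to fit the behavior described for "cndy", "candy" and "caaaaaaandy":

```python
import re

long_word = "c" + "a" * 7 + "ndy"

exact_miss = re.search(r"a{2}", "candy")                 # None: only one a
exact_hit = re.search(r"a{2}", "caaandy").group()        # first two a's

at_least = re.search(r"a{2,}", long_word).group()        # all seven a's

bounded = re.search(r"a{1,3}", long_word).group()        # capped at three
bounded_one = re.search(r"a{1,3}", "candy").group()      # a single a is enough
bounded_miss = re.search(r"a{1,3}", "cndy")              # None: no a at all
```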

Character [xyz]
Meaning: A character set. Matches any one of the enclosed characters. You can specify a range of characters with a hyphen.
For example, [abcd] is the same as [a-d]. They match the 'b' in "brisket" and the 'a' and 'c' in "ache".

Character [^xyz]
Meaning: A negated (complemented) character set. Matches any character that is not enclosed. You can specify a range of characters with a hyphen.
For example, [^abc] is the same as [^a-c]. They first match 'r' in "brisket" and 'h' in "chop".
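A short sketch of character sets and their negation, written with Python's re module as an illustrative assumption:

```python
import re

# An explicit list and the equivalent range find the same characters.
listed = re.findall(r"[abcd]", "ache")
ranged = re.findall(r"[a-d]", "ache")
in_brisket = re.search(r"[a-d]", "brisket").group()

# A negated set matches the first character that is NOT listed.
not_abc_brisket = re.search(r"[^a-c]", "brisket").group()
not_abc_chop = re.search(r"[^abc]", "chop").group()

print(listed, in_brisket, not_abc_brisket, not_abc_chop)
```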

Character [\b]
Meaning: Matches a backspace (not to be confused with \b).

Character \b
Meaning: Matches a word boundary, such as a space (not to be confused with [\b]).
For example, /\bn\w/ matches 'no' in "noonday"; /\wy\b/ matches 'ly' in "possibly yesterday".

Character \B
Meaning: Matches a non-word boundary.
For example, /\w\Bn/ matches 'on' in "noonday"; /y\B\w/ matches 'ye' in "possibly yesterday".
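The boundary examples can be verified with a sketch like the one below. It uses Python's re module (an illustrative assumption), and the string containing a backspace is made up for the demonstration.

```python
import re

# \b requires a word boundary at that position.
word_start = re.search(r"\bn\w", "noonday").group()             # 'no'
word_end = re.search(r"\wy\b", "possibly yesterday").group()    # 'ly'

# \B requires that the position is NOT a word boundary.
inside_n = re.search(r"\w\Bn", "noonday").group()               # 'on'
inside_y = re.search(r"y\B\w", "possibly yesterday").group()    # 'ye'

# Inside a character set, \b means a backspace character instead.
backspace = re.search(r"[\b]", "a\x08b").group()
```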

Character \cX
Meaning: X is a control character. Matches that control character in a string.
For example, /\cM/ matches control-M in a string.

Character \d
Meaning: Matches a digit. Equivalent to [0-9].
For example, /\d/ or /[0-9]/ matches '2' in "B2 is the suite number".

Character \D
Meaning: Matches any non-digit. Equivalent to [^0-9].
For example, /\D/ or /[^0-9]/ matches 'B' in "B2 is the suite number".
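A minimal sketch of the digit classes in Python's re module (an illustrative assumption). The re.ASCII flag is passed so that \d is limited to 0-9, matching the equivalence to [0-9] stated above; without it Python also accepts other Unicode digits.

```python
import re

text = "B2 is the suite number"

first_digit = re.search(r"\d", text, re.ASCII).group()
first_digit_set = re.search(r"[0-9]", text).group()

first_non_digit = re.search(r"\D", text, re.ASCII).group()
first_non_digit_set = re.search(r"[^0-9]", text).group()

print(first_digit, first_non_digit)  # 2 B
```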

Character \f
Meaning: Matches a form feed.

Character \n
Meaning: Matches a line feed.

Character \r
Meaning: Matches a carriage return.

Character \s
Meaning: Matches a single whitespace character, including space, tab, form feed, and line feed. Equivalent to [ \f\n\r\t\v].
For example, /\s\w*/ matches ' bar' in "foo bar".

Character \S
Meaning: Matches a single character other than whitespace. Equivalent to [^ \f\n\r\t\v].
For example, /\S\w*/ matches 'foo' in "foo bar".
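The whitespace classes can be tried out as below. This is a sketch in Python's re module (an illustrative assumption); the split example and its input string are additions for demonstration.

```python
import re

# \s\w* starts at the first whitespace character, so the space is included.
with_space = re.search(r"\s\w*", "foo bar").group()    # ' bar'

# \S\w* starts at the first non-whitespace character.
first_word = re.search(r"\S\w*", "foo bar").group()    # 'foo'

# Space, tab and line feed all count as whitespace.
pieces = re.split(r"\s+", "a \t b\nc")                 # ['a', 'b', 'c']
```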

Character \t
Meaning: Matches a tab.

Character \v
Meaning: Matches a vertical tab.

Character \w
Meaning: Matches any letter, digit, or underscore. Equivalent to [A-Za-z0-9_].
For example, /\w/ matches 'a' in "apple", '5' in "$5.28", and '3' in "3D".

Character \W
Meaning: Matches any character other than a letter, digit, or underscore. Equivalent to [^A-Za-z0-9_].
For example, /\W/ or /[^A-Za-z0-9_]/ matches '%' in "50%".
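A quick sketch of the word-character classes, using Python's re module for illustration (an assumption). The re.ASCII flag keeps \w restricted to [A-Za-z0-9_], in line with the equivalence given above.

```python
import re

samples = ("apple", "$5.28", "3D")

# The first word character found in each sample.
first_word_chars = [re.search(r"\w", s, re.ASCII).group() for s in samples]

# The first non-word character in "50%".
first_non_word = re.search(r"\W", "50%", re.ASCII).group()
first_non_word_set = re.search(r"[^A-Za-z0-9_]", "50%").group()

print(first_word_chars, first_non_word)  # ['a', '5', '3'] %
```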

Character \n
Meaning: n is a positive integer. A back reference to the last substring matched by the n-th parenthesized group in the regular expression (counting left parentheses).
For example, /apple(,)\sorange\1/ matches 'apple, orange,' in "apple, orange, cherry, peach".
Note: If the number of left parentheses is less than the number n, \n is interpreted as an octal escape, as described in the next entry.
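Back references can be sketched as follows, using Python's re module as an illustrative assumption. The second pattern and the word "bookkeeper" are made up to show a group referring back to itself.

```python
import re

# \1 must repeat exactly what group 1 matched (here, the comma).
fruit = re.search(r"apple(,)\sorange\1", "apple, orange, cherry, peach").group()

# (.)\1 finds any character that is immediately repeated.
# findall returns the content of the single group for each match.
doubled = re.findall(r"(.)\1", "bookkeeper")

print(fruit)    # apple, orange,
print(doubled)  # ['o', 'k', 'e']
```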

Character \ooctal and \xhex
Meaning: \ooctal is an octal escape value and \xhex is a hexadecimal escape value. They allow ASCII codes to be embedded in a regular expression.
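A tiny sketch of numeric escapes in Python's re module (an illustrative assumption). Both patterns denote ASCII code 65, the letter A; Python treats a three-digit octal number as a character code rather than a back reference.

```python
import re

hex_match = re.search(r"\x41", "ABC").group()     # 0x41 is 65
octal_match = re.search(r"\101", "ABC").group()   # octal 101 is 65

print(hex_match, octal_match)  # A A
```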

Appendix: The following table provides a complete list of metacharacters and their behavior in the context of a regular expression:

Character Description
\
Marks the next character as a special character, a literal character, a back reference, or an octal escape. For example, 'n' matches the character "n". '\n' matches a line break. The sequence '\\' matches "\" and '\(' matches "(".
^
Matches the start position of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$
Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
*
Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+
Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?
Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.
{n}
n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,}
n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}
m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the numbers.
?
When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
.
Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
(pattern)
Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0…$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.
(?:pattern)
Matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
(?=pattern)
Positive lookahead: matches the search string at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000", but not "Windows" in "Windows 3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)
Negative lookahead: matches the search string at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1", but not "Windows" in "Windows 2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y
Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]
Character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]
Negative character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z]
Character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter in the range 'a' through 'z'.
[^a-z]
Negative character range. Matches any character not in the specified range. For example, '[^a-z]' matches any character not in the range 'a' through 'z'.
\b
Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never", but not the 'er' in "verb".
\B
Matches a non-word boundary. 'er\B' matches the 'er' in "verb", but not the 'er' in "never".
\cx
Matches the control character indicated by x. For example, \cM matches a Control-M or carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\d
Matches a digit character. Equivalent to [0-9].
\D
Matches a non-digit character. Equivalent to [^0-9].
\f
Matches a form feed. Equivalent to \x0c and \cL.
\n
Matches a line feed. Equivalent to \x0a and \cJ.
\r
Matches a carriage return. Equivalent to \x0d and \cM.
\s
Matches any whitespace character, including space, tab, and form feed. Equivalent to [ \f\n\r\t\v].
\S
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t
Matches a tab. Equivalent to \x09 and \cI.
\v
Matches a vertical tab. Equivalent to \x0b and \cK.
\w
Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W
Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn
Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num
Matches num, where num is a positive integer. A reference back to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n
Identifies either an octal escape value or a back reference. If \n is preceded by at least n captured subexpressions, n is a back reference. Otherwise, n is an octal escape value if n is an octal digit (0-7).
\nm
Identifies either an octal escape value or a back reference. If \nm is preceded by at least nm captured subexpressions, nm is a back reference. If \nm is preceded by at least n captures, n is a back reference followed by the literal m. If neither of the preceding conditions holds, \nm matches the octal escape value nm when n and m are octal digits (0-7).
\nml
Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
\un
Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
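
Three appendix entries that the first list does not cover are the non-greedy modifier, the non-capturing group, and lookahead. The sketch below tries them in Python's re module (an illustrative assumption; the appendix itself is phrased in terms of VBScript and JScript). Note that the lookahead patterns contain a space after "Windows", so the matched text includes that space.

```python
import re

# Greedy versus non-greedy repetition on the same input.
greedy = re.search(r"o+", "oooo").group()    # 'oooo'
lazy = re.search(r"o+?", "oooo").group()     # 'o'

# (?:...) groups alternatives without remembering anything.
industry = re.search(r"industr(?:y|ies)", "industries")
matched_word = industry.group()              # 'industries'
remembered = industry.groups()               # () because nothing was captured

# (?=...) and (?!...) test what follows without consuming it.
positive = re.search(r"Windows (?=95|98|NT|2000)", "Windows 2000").group()
positive_miss = re.search(r"Windows (?=95|98|NT|2000)", "Windows 3.1")
negative = re.search(r"Windows (?!95|98|NT|2000)", "Windows 3.1").group()
negative_miss = re.search(r"Windows (?!95|98|NT|2000)", "Windows 2000")
```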
