Basic knowledge of regular expressions

Source: Internet
Author: User

Metacharacters

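The power of regular expressions comes from the ability to include alternatives and repetitions in the pattern. These are encoded in the pattern by the use of metacharacters, which do not stand for themselves but are instead interpreted in a special way.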

There are two different sets of metacharacters: those that are recognized anywhere in the pattern except within square brackets, and those that are recognized within square brackets. Outside square brackets, the metacharacters are:

\    general escape character with several uses
^    asserts the start of the subject (or the start of a line, in multiline mode)
$    asserts the end of the subject or the position before a terminating newline (or the end of a line, in multiline mode)
.    matches any character except newline (by default)
[    starts a character class definition
]    ends a character class definition
|    starts an alternative branch
(    starts a subpattern
)    ends a subpattern
?    extends the meaning of "(", and is also the 0-or-1 quantifier and the modifier that makes a quantifier match as little as possible
*    0-or-more quantifier
+    1-or-more quantifier
{    starts a min/max quantifier
}    ends a min/max quantifier
The part of a pattern enclosed in square brackets is called a "character class". Within a character class, the only metacharacters are:

\    general escape character
^    negates the class, but only if it is the first character
-    indicates a character range
]    ends the character class
For more detail on how each metacharacter is used, refer to the "Pattern Syntax" section of the PHP Manual.
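
The difference between the two sets of metacharacters can be sketched in a few lines. This is only an illustration: it uses Python's `re` module as a stand-in for PHP's regex functions (an assumption, since the article shows no runnable code), and the sample strings are invented.

```python
import re

# Outside brackets "." matches any character; inside brackets it is a literal period.
print(re.findall(r".", "a.b"))        # ['a', '.', 'b']
print(re.findall(r"[.]", "a.b"))      # ['.']

# "^" negates a class only when it is the first character inside the brackets.
print(re.findall(r"[^ab]", "abc^"))   # ['c', '^']
print(re.findall(r"[ab^]", "abc^"))   # ['a', 'b', '^']

# "-" between two characters inside brackets denotes a range.
print(re.findall(r"[a-c]", "a-d"))    # ['a']

# "|" starts an alternative branch outside brackets.
print(re.findall(r"cat|dog", "cat, dog, cow"))  # ['cat', 'dog']
```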

Here are some worked examples (source: http://php.mydict.com/ziliao/4/15/2006_06/PHPZhongDeZhengZeBiaoDaShi3539_1.html)

The special character "^" matches strings that begin with the specified text. For example:

"^hello": this pattern matches the string "hello, php world!" but does not match "say hello to".

The special character "$" matches strings that end with the specified text. For example:

"you$": this pattern matches "How are you" but does not match "your".

When "^" and "$" are used together, they require an exact match. For example:

"^hello$": this pattern matches only the string "hello".

If a pattern includes neither "^" nor "$", it matches any string that contains the pattern. For example, "you" matches the string "What is your name?".
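
The anchoring behaviour described above can be checked with a short sketch. It assumes Python's `re` module in place of PHP's regex functions, and `matches` is a hypothetical helper introduced only for this illustration.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern is found anywhere in text."""
    return re.search(pattern, text) is not None

print(matches(r"^hello", "hello, php world!"))  # True: the string starts with "hello"
print(matches(r"^hello", "say hello to"))       # False: "hello" is not at the start
print(matches(r"you$", "How are you"))          # True: the string ends with "you"
print(matches(r"you$", "your"))                 # False: the string ends with "r"
print(matches(r"^hello$", "hello"))             # True: exact match
print(matches(r"^hello$", "hello world"))       # False: extra text after "hello"
print(matches(r"you", "What is your name?"))    # True: unanchored, found inside "your"
```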

Letters in a pattern are plain literal characters, and so are digits.

To use other, slightly more complex characters, such as punctuation that doubles as a metacharacter and whitespace characters (spaces, tabs, and so on), you need an escape sequence. All escape sequences begin with a backslash ("\"); for example, the escape sequence for a tab is "\t". So to test whether a string starts with a tab, you can use this pattern:

"^\t"

Similarly, "\n" means a newline and "\r" means a carriage return. The backslash itself is written "\\", a literal period is written "\.", and so forth.
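
A minimal sketch of these escape sequences, assuming Python's `re` module rather than PHP. The variable names and sample strings are illustrative only. Python raw strings (r"...") are used so that the backslash reaches the regex engine unchanged.

```python
import re

# Raw strings pass "\t", "\." and "\\" to the regex engine as written.
starts_with_tab = re.compile(r"^\t")
literal_dot = re.compile(r"\.")
literal_backslash = re.compile(r"\\")

print(bool(starts_with_tab.search("\tindented")))    # True
print(bool(starts_with_tab.search("not indented")))  # False
print(bool(literal_dot.search("file.txt")))          # True
print(bool(literal_dot.search("filetxt")))           # False: no period present
print(bool(literal_backslash.search("C:\\temp")))    # True
```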

How do you use character clusters?

To check whether a user's phone number, address, email address, credit card number, and so on is valid, a plain literal string comparison is not enough. We need a more flexible way to describe the pattern we want, and that is the character cluster (the character class described above).

For example, to create a character cluster that represents all vowels, you can write:


"[AaEeIiOoUu]": this pattern matches any vowel, but it stands for only one character.

The special symbol "-" denotes a range of characters, for example:

"[a-z]" // matches the letters a-z, that is, all lowercase letters
"[A-Z]" // matches the letters A-Z, that is, all uppercase letters
"[a-zA-Z]" // matches all letters
"[0-9]" // matches all digits
"[0-9\.\-]" // matches all digits, periods, and minus signs
"[ \f\r\t\n]" // matches whitespace characters (space, form feed, carriage return, tab, newline)


Likewise, each of these matches only one character.
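
The point that a character cluster stands for exactly one character can be illustrated as follows. This is a sketch using Python's `re` module as a stand-in for PHP, with invented sample strings.

```python
import re

vowel = re.compile(r"[AaEeIiOoUu]")

# findall returns each single-character match separately.
print(vowel.findall("Regular Expression"))        # ['e', 'u', 'a', 'E', 'e', 'i', 'o']

# A class on its own consumes exactly one character.
print(re.fullmatch(r"[a-z]", "q") is not None)    # True
print(re.fullmatch(r"[a-z]", "qq") is not None)   # False

print(re.findall(r"[0-9\.\-]", "-3.5 kg"))        # ['-', '3', '.', '5']
print(re.findall(r"[ \f\r\t\n]", "a b\tc"))       # [' ', '\t']
```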

To match a string made up of one lowercase letter followed by one digit, such as "a4", "b5" or "f1", but not "aa4", "b5a4" or "f12", use this pattern:

 "^[a-z][0-9]$"

Although [a-z] covers a range of 26 letters, here it matches only one character: the first character of the string, which must be a lowercase letter.
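
As a quick check of the letter-plus-digit pattern, here is a sketch assuming Python's `re` module; the candidate list is made up for illustration.

```python
import re

letter_digit = re.compile(r"^[a-z][0-9]$")

candidates = ["a4", "b5", "f1", "aa4", "b5a4", "f12"]
accepted = [s for s in candidates if letter_digit.search(s)]
print(accepted)  # ['a4', 'b5', 'f1']
```

In Python, "$" also matches just before a single trailing newline, so `re.fullmatch(r"[a-z][0-9]", s)` is a stricter alternative when the input may end with "\n".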

We already know that "^" marks the beginning of a string, but when "^" is the first character inside a pair of square brackets it means "not" or "exclude", and is often used to rule out certain characters. Returning to the previous example, suppose we require that the first character not be a digit:

"^[^0-9][0-9]$"

This pattern matches "a4", "b5" and "+2", but does not match "12" or "66". Here are a few more examples of excluding specific characters:

"[^a-z]" // all characters except lowercase letters
"[^\\\/\^]" // all characters except (\), (/), and (^)
"[^\"\']" // all characters except double quotes (") and single quotes (')

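Negated character clusters can be tried out with a short sketch. It assumes Python's `re` module instead of PHP, and the sample strings are invented for the demonstration.

```python
import re

not_digit_then_digit = re.compile(r"^[^0-9][0-9]$")
for s in ["a4", "b5", "+2", "12", "66"]:
    # Prints True for the first three strings and False for "12" and "66".
    print(s, bool(not_digit_then_digit.search(s)))

# Everything except lowercase letters.
print(re.findall(r"[^a-z]", "abC1d"))               # ['C', '1']

# Everything except backslash, slash and caret.
print(re.findall(r"[^\\\/\^]", r"a\b/c^d"))         # ['a', 'b', 'c', 'd']

# Everything except double and single quotes.
print(re.findall(r"""[^\"\']""", "\"it's\""))       # ['i', 't', 's']
```
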
The special character "." (dot, period) matches any character except a newline. So the pattern "^.5$" matches any two-character string that ends with the digit 5 and begins with any character other than a newline. The pattern "." on its own matches any string except the empty string and a string that consists only of newlines.
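
The behaviour of "." can be sketched as follows, assuming Python's `re` module with its default flags (where, as in the description above, the dot does not match a newline).

```python
import re

print(bool(re.search(r"^.5$", "a5")))    # True: any character, then 5
print(bool(re.search(r"^.5$", "\n5")))   # False: the dot does not match a newline
print(bool(re.search(r"^.5$", "ab5")))   # False: three characters

print(bool(re.search(r".", "")))         # False: empty string
print(bool(re.search(r".", "\n")))       # False: only a newline
print(bool(re.search(r".", "x\n")))      # True: "x" satisfies the dot
```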

PHP's regular expressions have some built-in character clusters for common cases, listed below:

Character cluster    Meaning
"[[:alpha:]]"    any letter
"[[:digit:]]"    any digit
"[[:alnum:]]"    any letter or digit
"[[:space:]]"    any whitespace character
"[[:upper:]]"    any uppercase letter
"[[:lower:]]"    any lowercase letter
"[[:punct:]]"    any punctuation character
"[[:xdigit:]]"    any hexadecimal digit, equivalent to [0-9a-fA-F]

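Python's `re` module does not support these named clusters, so the sketch below rewrites them into explicit ASCII bracket expressions before matching. Everything here is an assumption made for illustration: `POSIX_EQUIVALENTS` and `expand_posix` are hypothetical names, the equivalents cover ASCII only, and only the standalone form "[[:name:]]" from the table is handled.

```python
import re

# Hypothetical lookup table: named cluster -> explicit ASCII bracket expression.
POSIX_EQUIVALENTS = {
    "alpha": "[A-Za-z]",
    "digit": "[0-9]",
    "alnum": "[A-Za-z0-9]",
    "space": r"[ \t\n\r\f\v]",
    "upper": "[A-Z]",
    "lower": "[a-z]",
    "punct": r"[!-/:-@\[-`{-~]",
    "xdigit": "[0-9A-Fa-f]",
}

def expand_posix(pattern: str) -> str:
    """Replace each standalone [[:name:]] with its explicit ASCII equivalent."""
    return re.sub(
        r"\[\[:([a-z]+):\]\]",
        lambda m: POSIX_EQUIVALENTS[m.group(1)],
        pattern,
    )

three_letters = expand_posix(r"^[[:alpha:]]{3}$")
print(three_letters)                                          # ^[A-Za-z]{3}$
print(bool(re.search(three_letters, "cat")))                  # True
print(bool(re.search(three_letters, "cats")))                 # False
print(re.findall(expand_posix("[[:xdigit:]]"), "0xBEEF!"))    # ['0', 'B', 'E', 'E', 'F']
```
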

How do you match repeated occurrences?


In many cases we want to match a word or a number. A word consists of several letters, and a number consists of several digits. Braces ("{}") following a character or character cluster specify how many times the preceding item repeats. If x and y are numbers, then {x} means "the preceding character or character cluster appears exactly x times"; a number followed by a comma, {x,}, means "the preceding item appears x or more times"; and two comma-separated numbers, {x,y}, mean "the preceding item appears at least x times but no more than y times".

Pattern    Meaning
"^[a-zA-Z_]$"    a single letter or underscore
"^[[:alpha:]]{3}$"    all 3-letter words
"^a$"    the letter a
"^a{4}$"    aaaa
"^a{2,4}$"    aa, aaa, or aaaa
"^a{1,3}$"    a, aa, or aaa
"^a{2,}$"    a string of two or more a's, such as aa, aaa, or aaaaa
"^a{2,}"    strings starting with two or more a's, such as aardvark and aaab, but not apple
"a{2,}"    strings containing two or more consecutive a's, such as baad and aaa, but not Nantucket
"\t{2}"    two tabs
".{2}"    any two characters

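The brace quantifiers in the table can be exercised with a small sketch, again assuming Python's `re` module in place of PHP. Each tuple in the hypothetical `cases` list pairs a pattern and a sample string with the result the table predicts.

```python
import re

# (pattern, sample text, expected result according to the table above)
cases = [
    (r"^a{4}$", "aaaa", True),
    (r"^a{4}$", "aaa", False),
    (r"^a{2,4}$", "aaa", True),
    (r"^a{2,4}$", "aaaaa", False),
    (r"^a{2,}$", "aaaaa", True),
    (r"^a{2,}", "aardvark", True),
    (r"^a{2,}", "apple", False),
    (r"a{2,}", "baad", True),
    (r"a{2,}", "Nantucket", False),
    (r"\t{2}", "x\t\ty", True),
    (r".{2}", "ab", True),
]

for pattern, text, expected in cases:
    found = re.search(pattern, text) is not None
    assert found == expected
    print(pattern, repr(text), found)
```
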

We can extend these patterns to cover longer words or numbers:

"^[a-zA-Z0-9_]{1,}$"    all strings made up of one or more letters, digits, or underscores
"^[0-9]{1,}$"    all unsigned integers (one or more digits)
"^\-{0,1}[0-9]{1,}$"    all integers
"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$"    all decimals


The last example reads as follows: starting at the beginning of the string (^), an optional minus sign (\-{0,1}), followed by 0 or more digits ([0-9]{0,}), an optional decimal point (\.{0,1}), 0 or more further digits ([0-9]{0,}), and nothing else ($).
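
Because every part of that pattern is optional, it accepts a few strings that are not really numbers. The sketch below, which assumes Python's `re` module, shows this; the `stricter` pattern is not from the article, just one possible tightening that requires digits on both sides of the point.

```python
import re

decimal = re.compile(r"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$")

for s in ["42", "-42", "3.14", "-.5", "7.", "", "-", ".", "1.2.3", "abc"]:
    # True for every sample except "1.2.3" and "abc",
    # including the empty string, "-" and ".".
    print(repr(s), bool(decimal.search(s)))

# Hypothetical stricter variant: at least one digit, optional fractional part.
stricter = re.compile(r"^\-{0,1}[0-9]{1,}(\.[0-9]{1,}){0,1}$")
print([s for s in ["42", "-42", "3.14", "", "-", ".", "7."] if stricter.search(s)])
# ['42', '-42', '3.14']
```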

The special character "?" is equivalent to "{0,1}"; both mean "0 or 1 of the preceding item", that is, "the preceding item is optional". So:

"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$"

can be simplified to:

"^\-?[0-9]{0,}\.?[0-9]{0,}$"

The special character "*" is equivalent to "{0,}"; both mean "0 or more of the preceding item". The character "+" is equivalent to "{1,}", meaning "1 or more of the preceding item". So the 4 examples above can be written as follows:

"^[a-zA-Z0-9_]+$"    all strings made up of one or more letters, digits, or underscores
"^[0-9]+$"    all unsigned integers (one or more digits)
"^\-?[0-9]+$"    all integers
"^\-?[0-9]*\.?[0-9]*$"    all decimals

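The equivalence between the brace forms and the shorthand quantifiers can be spot-checked with a sketch like the one below. It assumes Python's `re` module; `agree` is a hypothetical helper and the sample strings are arbitrary, so this is a sanity check rather than a proof.

```python
import re

# (brace form, shorthand form) for the four patterns discussed above
pairs = [
    (r"^[a-zA-Z0-9_]{1,}$", r"^[a-zA-Z0-9_]+$"),
    (r"^[0-9]{1,}$", r"^[0-9]+$"),
    (r"^\-{0,1}[0-9]{1,}$", r"^\-?[0-9]+$"),
    (r"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$", r"^\-?[0-9]*\.?[0-9]*$"),
]
samples = ["abc_123", "42", "-42", "3.14", "-.5", "", "4 2", "1.2.3"]

def agree(brace_form: str, short_form: str, texts: list) -> bool:
    """Hypothetical helper: True if both patterns give the same verdict on every text."""
    return all(
        bool(re.search(brace_form, t)) == bool(re.search(short_form, t))
        for t in texts
    )

for brace_form, short_form in pairs:
    print(short_form, agree(brace_form, short_form, samples))  # True for each pair
```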