Regular Expressions in PHP (i)
Hunte April 14, 2000
PHP carries on the *nix tradition and fully supports regular expressions. Regular expressions provide an advanced, though not intuitive, way to match and process strings. Anyone who has used Perl's regular expressions knows that they are powerful but not easy to learn.
For example:
^.+@.+\\..+$
This effective but incomprehensible code is enough to give some programmers a headache (myself included) or to make them give up on regular expressions altogether. I believe that by the time you finish this tutorial, you will understand what this piece of code means.
Basic Pattern Matching
Let's start with the basics. The pattern is the most basic element of a regular expression: a set of characters that describes the characteristics of a string. A pattern can be simple, consisting of an ordinary string, or very complex, often using special characters to represent a range of characters, repetition, or context. For example:
^once
This pattern contains the special character ^, which means the pattern matches only strings that begin with "once". For example, it matches the string "once upon a time" but not "There once was a man from NewYork". Just as ^ marks the beginning, the $ symbol matches strings that end with the given pattern.
bucket$
This pattern matches "Who kept all of the cash in a bucket" but not "buckets". Used together, ^ and $ denote an exact match (the string is identical to the pattern). For example:
^bucket$
This matches only the string "bucket". If a pattern includes neither ^ nor $, it matches any string that contains the pattern. For example, the pattern
once
and the string
There once was a man from NewYork
Who kept all the cash in a bucket.
are a match.
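As a rough illustration of the anchors described above, the sketch below runs the same patterns through PHP's preg_match(). Using preg_match() (the PCRE function, which needs the pattern wrapped in delimiters such as slashes) is an assumption made for this example, as is the helper name anchor_match(); the article does not name a matching function at this point.

```php
<?php
// Hypothetical helper (not from the article): true when $pattern matches $subject.
// The bare pattern is wrapped in slashes, so it must not itself contain "/".
function anchor_match(string $pattern, string $subject): bool
{
    return preg_match('/' . $pattern . '/', $subject) === 1;
}

var_dump(anchor_match('^once', 'once upon a time'));                      // bool(true)
var_dump(anchor_match('^once', 'There once was a man from NewYork'));     // bool(false)
var_dump(anchor_match('bucket$', 'Who kept all of the cash in a bucket')); // bool(true)
var_dump(anchor_match('bucket$', 'buckets'));                             // bool(false)
var_dump(anchor_match('^bucket$', 'bucket'));                             // bool(true)
var_dump(anchor_match('once', 'There once was a man from NewYork'));      // bool(true)
```

Single-quoted PHP strings are used so that $ is not treated as the start of a variable name.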
The letters in the pattern (o-n-c-e) are literal characters: each one stands for itself, and the same is true of digits. Some other characters, such as certain punctuation and whitespace characters (spaces, tabs, and so on), are written as escape sequences. All escape sequences begin with a backslash (\). The escape sequence for a tab is \t. So to check whether a string starts with a tab, use this pattern:
^\t
Similarly, \n denotes a newline and \r a carriage return. Other special symbols are escaped by putting a backslash in front of them: the backslash itself is written \\, a period is written \., and so on.
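The escape sequences can be tried out with a small sketch like the one below. It assumes preg_match() as the matching function, and the three function names are made up for the example. Note that PHP string literals add their own layer of backslash handling, which is why the pattern for a single backslash ends up with four of them.

```php
<?php
// In a single-quoted PHP string, \t and \. reach the regex engine unchanged,
// and the engine itself interprets them as "tab" and "literal period".
function starts_with_tab(string $s): bool
{
    return preg_match('/^\t/', $s) === 1;
}

function contains_period(string $s): bool
{
    return preg_match('/\./', $s) === 1;
}

function contains_backslash(string $s): bool
{
    // The regex \\ matches one backslash; each of its two backslashes is
    // doubled again inside the PHP string literal.
    return preg_match('/\\\\/', $s) === 1;
}

var_dump(starts_with_tab("\tindented line")); // bool(true)
var_dump(starts_with_tab('plain line'));      // bool(false)
var_dump(contains_period('index.php'));       // bool(true)
var_dump(contains_backslash('C:\\temp'));     // bool(true)
```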
Character Clusters
In web programs, regular expressions are often used to validate user input. When a user submits a form, literal characters alone are not enough to determine whether the phone number, address, email address, credit card number, and so on are valid.
So we need a freer way to describe the pattern we want: the character cluster. To create a character cluster that represents all vowels, place the vowels inside square brackets:
[AaEeIiOoUu]
This pattern matches any vowel character, but can only represent one character. A hyphen can be used to represent a range of characters, such as:
[a-z] // matches any lowercase letter
[A-Z] // matches any uppercase letter
[a-zA-Z] // matches any letter
[0-9] // matches any digit
[0-9\.\-] // matches any digit, period, or minus sign
[\f\r\t\n] // matches whitespace characters (form feed, carriage return, tab, newline)
Again, each of these matches only a single character, which is an important point. To match a string consisting of one lowercase letter followed by one digit, such as "z2", "t6", or "g7", but not "ab2", "r2d3", or "b52", use this pattern:
^[a-z][0-9]$
Although [a-z] covers a range of 26 letters, here it matches exactly one character: the first character of the string, which must be a lowercase letter.
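The point that each cluster stands for exactly one character can be checked with a short sketch. It assumes preg_match() as the matching function, and is_letter_digit() is a made-up name for this example.

```php
<?php
// Hypothetical helper: true when $s is one lowercase letter followed by one digit.
function is_letter_digit(string $s): bool
{
    return preg_match('/^[a-z][0-9]$/', $s) === 1;
}

foreach (['z2', 't6', 'g7', 'ab2', 'r2d3', 'b52'] as $s) {
    // Prints "match" for the first three strings and "no match" for the rest.
    printf("%-5s %s\n", $s, is_letter_digit($s) ? 'match' : 'no match');
}
```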
We said earlier that ^ marks the beginning of a string, but it has a second meaning. Inside square brackets, a leading ^ means "not" or "exclude", and is often used to rule out certain characters. Continuing the previous example, to require that the first character not be a digit:
^[^0-9][0-9]$
This pattern matches "&5", "g7", and "-2", but does not match "12" or "66". Here are a few more examples of excluding specific characters:
[^a-z] // any character except lowercase letters
[^\\\/\^] // any character except (\), (/), and (^)
[^\"\'] // any character except double quotation marks (") and single quotation marks (')
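A minimal sketch of negated clusters, assuming preg_match() and preg_replace() as the PHP functions; both helper names are invented for this example.

```php
<?php
// Hypothetical helper: first character must NOT be a digit, second must be one.
function non_digit_then_digit(string $s): bool
{
    return preg_match('/^[^0-9][0-9]$/', $s) === 1;
}

// Hypothetical helper: removes every character that is not a lowercase letter.
function keep_lowercase_only(string $s): string
{
    return preg_replace('/[^a-z]/', '', $s);
}

var_dump(non_digit_then_digit('&5')); // bool(true)
var_dump(non_digit_then_digit('g7')); // bool(true)
var_dump(non_digit_then_digit('-2')); // bool(true)
var_dump(non_digit_then_digit('12')); // bool(false)
var_dump(non_digit_then_digit('66')); // bool(false)

echo keep_lowercase_only('r2-d2 & c3po'), "\n"; // rdcpo
```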
The special character "." (dot, period) denotes any character except a newline. So the pattern "^.5$" matches any two-character string that ends with the digit 5 and begins with any character other than a newline. The pattern "." matches any string except the empty string and strings that consist only of newlines.
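The behavior of the dot can be sketched as follows, again assuming preg_match() with its default options (under which the dot does not match a newline). The function names are made up for this example.

```php
<?php
// Hypothetical helper: any single non-newline character followed by "5".
function any_char_then_five(string $s): bool
{
    return preg_match('/^.5$/', $s) === 1;
}

// Hypothetical helper: true when the string contains at least one
// character that is not a newline.
function has_non_newline_char(string $s): bool
{
    return preg_match('/./', $s) === 1;
}

var_dump(any_char_then_five('a5'));     // bool(true)
var_dump(any_char_then_five('55'));     // bool(true)
var_dump(any_char_then_five('5'));      // bool(false)
var_dump(any_char_then_five("\n5"));    // bool(false)

var_dump(has_non_newline_char(''));     // bool(false)
var_dump(has_non_newline_char("\n\n")); // bool(false)
var_dump(has_non_newline_char('x'));    // bool(true)
```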
PHP's regular expressions have some built-in general-purpose character clusters, listed below:
Character cluster   Meaning
[[:alpha:]]         any letter
[[:digit:]]         any digit
[[:alnum:]]         any letter or digit
[[:space:]]         any whitespace character
[[:upper:]]         any uppercase letter
[[:lower:]]         any lowercase letter
[[:punct:]]         any punctuation character
[[:xdigit:]]        any hexadecimal digit, equivalent to [0-9a-fA-F]
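The built-in clusters in the table can be exercised with a sketch like this one. It assumes the PCRE functions preg_match() and preg_match_all(), which accept these POSIX-style class names inside a bracket expression; is_hex_string() and count_in_class() are hypothetical names introduced for the example.

```php
<?php
// Hypothetical helper: true when $s consists only of hexadecimal digits.
function is_hex_string(string $s): bool
{
    return preg_match('/^[[:xdigit:]]+$/', $s) === 1;
}

// Hypothetical helper: counts the characters of $s that belong to the
// named cluster, e.g. 'upper', 'digit', 'punct' or 'space'.
function count_in_class(string $class, string $s): int
{
    // preg_match_all() returns the number of matches (false on error, cast to 0).
    return (int) preg_match_all('/[[:' . $class . ':]]/', $s);
}

var_dump(is_hex_string('ff00AA'));               // bool(true)
var_dump(is_hex_string('xyz'));                  // bool(false)
var_dump(count_in_class('upper', 'Hello World')); // int(2)
var_dump(count_in_class('digit', 'a1b22'));       // int(3)
var_dump(count_in_class('punct', 'Hi, there!'));  // int(2)
var_dump(count_in_class('space', "a b\tc"));      // int(2)
```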