PHP Regular Expressions - PHP Tutorial

Source: Internet
Author: User
Tags php regular expression

Summary

Regular expressions are a powerful tool for pattern matching and replacement. They let you build a pattern from ordinary and special characters, compare that pattern against target text such as data files, program input, or form input from a web page, and then run the appropriate code depending on whether the text contains a match.

How to use basic pattern matching?

Patterns are the most basic element of a regular expression: a set of characters that describes a set of strings. A pattern can be very simple, made up of ordinary characters only, or very complex, using special characters to express ranges, repetition, or context. Let's first look at some special characters in regular expressions.

The special character "^" anchors a pattern to the start of the string. For example:

"^hello": this pattern matches the string "hello, PHP world!" but does not match "Say hello to you".

The special character "$" anchors a pattern to the end of the string. For example:

"you$": this pattern matches "How are you" but does not match "your".

When "^" and "$" are used together, the pattern must match the entire string. For example:

"^hello$": this pattern matches only the string "hello".

If a pattern includes neither "^" nor "$", it matches any string that contains the pattern. For example, "you" matches the string "What is your name?".

In these patterns, letters and digits are ordinary characters that simply match themselves.
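The anchor examples above can be tried out with a short sketch. It assumes PHP's PCRE function preg_match, which expects the pattern wrapped in delimiters (slashes here); the helper name anchors_match is made up for this illustration.

```php
<?php
// Hypothetical helper: true when $pattern finds a match in $subject.
function anchors_match(string $pattern, string $subject): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $subject) === 1;
}

var_dump(anchors_match('/^hello/', 'hello, PHP world!')); // bool(true)
var_dump(anchors_match('/^hello/', 'Say hello to you'));  // bool(false)
var_dump(anchors_match('/you$/', 'How are you'));         // bool(true)
var_dump(anchors_match('/you$/', 'your'));                // bool(false)
var_dump(anchors_match('/^hello$/', 'hello'));            // bool(true)
var_dump(anchors_match('/you/', 'What is your name?'));   // bool(true)
```

The patterns are written in single-quoted PHP strings so that "$" is not treated as the start of a PHP variable.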

To match characters that have a special meaning in a pattern, such as some punctuation marks, or whitespace characters such as tabs and line breaks, use escape sequences. All escape sequences start with a backslash ("\"). For example, the escape sequence for a tab character is "\t". So to check whether a string starts with a tab character, use this pattern:

"^\t"

Similarly, "\n" stands for a line feed and "\r" for a carriage return. A literal backslash is written "\\", a literal period is written "\.", and so on.
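As a rough sketch of these escape sequences in PHP (again assuming preg_match with slash delimiters; the variable names are only illustrative), note that the subject strings use double quotes so that PHP itself turns "\t" into a real tab, while the patterns use single quotes so the backslashes reach the regex engine:

```php
<?php
// "\t" in the pattern is interpreted by the regex engine as a tab.
$startsWithTab = preg_match('/^\t/', "\tindented line") === 1; // true
$noLeadingTab  = preg_match('/^\t/', "plain line") === 1;      // false

// "\." matches only a literal period, not "any character".
$literalDot   = preg_match('/^a\.b$/', 'a.b') === 1; // true
$notAnyLetter = preg_match('/^a\.b$/', 'axb') === 1; // false

// A literal backslash: the single-quoted PHP string '/\\\\/' holds the
// regex /\\/, which matches one backslash character in the subject.
$hasBackslash = preg_match('/\\\\/', 'C:\temp') === 1; // true

var_dump($startsWithTab, $noLeadingTab, $literalDot, $notAnyLetter, $hasBackslash);
```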

How to use character clusters?

To check whether a phone number, postal address, email address, or credit card number entered by a user is valid, matching against literal strings is not enough. We need a more flexible way to describe the pattern we want, and that is the character cluster (more commonly called a character class).

For example, to create a character cluster that represents all vowels, write:

"[AaEeIiOoUu]": this pattern matches any vowel, but only a single character.

The special symbol "-" inside a character cluster indicates a range of characters, for example:

"[a-z]" // Matches any lowercase letter from a to z
"[A-Z]" // Matches any uppercase letter from A to Z
"[a-zA-Z]" // Matches any letter
"[0-9]" // Matches any digit
"[0-9\.\-]" // Matches any digit, a period, or a minus sign
"[ \f\r\t\n]" // Matches any whitespace character

As with the vowel example, each of these matches only a single character.
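Here is a minimal sketch of a few of these clusters in PHP, assuming preg_match with slash delimiters. The anchors "^" and "$" are added to some patterns to make the "single character" behaviour visible; the variable names are made up for the example.

```php
<?php
// A vowel anywhere in the string is enough for an unanchored cluster.
$hasVowel = preg_match('/[AaEeIiOoUu]/', 'PHP world') === 1; // true
$noVowel  = preg_match('/[AaEeIiOoUu]/', 'rhythm') === 1;    // false

// Anchored, a cluster accepts exactly one character.
$oneLower  = preg_match('/^[a-z]$/', 'q') === 1;  // true
$upperCase = preg_match('/^[a-z]$/', 'Q') === 1;  // false
$twoLower  = preg_match('/^[a-z]$/', 'qq') === 1; // false

// Digits, period, or minus sign.
$minusSign = preg_match('/^[0-9\.\-]$/', '-') === 1; // true

// Whitespace cluster: the tab comes from a double-quoted PHP string.
$isBlank = preg_match('/^[ \f\r\t\n]$/', "\t") === 1; // true

var_dump($hasVowel, $noVowel, $oneLower, $upperCase, $twoLower, $minusSign, $isBlank);
```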

To match a string that consists of exactly one lowercase letter followed by one digit, such as "a4", "b5", or "f1", but not "aa4", "b5a4", or "f12", use this pattern:

"^[a-z][0-9]$"

Although [a-z] covers a range of 26 letters, here it matches only one character: the first character of the string, which must be a lowercase letter.
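One way to check this pattern against the sample strings is sketched below. It assumes PHP's preg_grep, which returns the array entries that match a pattern; the variable names are only illustrative.

```php
<?php
$candidates = ['a4', 'b5', 'f1', 'aa4', 'b5a4', 'f12'];

// Keep only the entries made of one lowercase letter followed by one digit.
// array_values re-indexes the result from zero.
$accepted = array_values(preg_grep('/^[a-z][0-9]$/', $candidates));

print_r($accepted); // a4, b5, f1
```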

We already know that "^" marks the start of a string, but when "^" is the first character inside a pair of square brackets it means "not" or "excluding", and is used to rule characters out. Reusing the preceding example, we can require that the first character is not a digit:

"^[^0-9][0-9]$"

This pattern matches "a4", "b5", and "-2", but does not match "12" or "66". The following are more examples of excluding specific characters:

"[^a-z]" // Any character except a lowercase letter
"[^\\\/\^]" // Any character except (\), (/), and (^)
"[^\"\']" // Any character except a double quotation mark (") and a single quotation mark (')

The special character "." (dot, or period) matches any single character except a line break. The pattern "^.5$" therefore matches any two-character string that ends with the digit 5 and starts with any character other than a line break. The pattern "." on its own matches any string that contains at least one character other than a line break; it does not match the empty string or a string made up only of line breaks.
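The behaviour of the dot can be checked with a small sketch, assuming preg_match with its default options (no modifiers, so the dot does not match a line break). The variable names are made up for the example.

```php
<?php
$a5       = preg_match('/^.5$/', 'a5') === 1;   // true
$fiftyFive = preg_match('/^.5$/', '55') === 1;  // true
$tooLong  = preg_match('/^.5$/', 'ab5') === 1;  // false: three characters
$newline5 = preg_match('/^.5$/', "\n5") === 1;  // false: "." skips line breaks

$emptyString = preg_match('/./', '') === 1;     // false: nothing to match
$onlyNewline = preg_match('/./', "\n") === 1;   // false
$anyChar     = preg_match('/./', 'x') === 1;    // true

var_dump($a5, $fiftyFive, $tooLong, $newline5, $emptyString, $onlyNewline, $anyChar);
```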

PHP regular expressions also have some built-in general character clusters. The list is as follows:

Character cluster    Description
"[[:alpha:]]"        Any letter
"[[:digit:]]"        Any digit
"[[:alnum:]]"        Any letter or digit
"[[:space:]]"        Any whitespace character
"[[:upper:]]"        Any uppercase letter
"[[:lower:]]"        Any lowercase letter
"[[:punct:]]"        Any punctuation mark
"[[:xdigit:]]"       Any hexadecimal digit, equivalent to [0-9a-fA-F]

How do I match repeated occurrences?

In many cases we need to match a whole word or a group of digits. A word consists of several letters, and a number consists of several digits. Braces ("{}") placed after a character or character cluster specify how many times the preceding content must occur. If x is a number, {x} means "the preceding character or character cluster occurs exactly x times"; a number followed by a comma, {x,}, means "the preceding content occurs x or more times"; and two numbers separated by a comma, {x,y}, means "the preceding content occurs at least x times but no more than y times".

Pattern                Description
"^[a-zA-Z_]$"          Any single letter or underscore
"^[[:alpha:]]{3}$"     Any three-letter word
"^a$"                  The letter a
"^a{4}$"               aaaa
"^a{2,4}$"             aa, aaa, or aaaa
"^a{1,3}$"             a, aa, or aaa
"^a{2,}$"              A string of two or more a's, such as aa, aaa, and aaaa
"^a{2,}"               A string that starts with two or more a's, such as aardvark and aaab, but not apple
"a{2,}"                A string that contains two or more consecutive a's, such as baad and aaa, but not Nantucket
"\t{2}"                Two tabs
".{2}"                 Any two characters

We can extend these patterns to cover longer words or numbers:

"^[a-zA-Z0-9_]{1,}$"                     Any string of one or more letters, digits, or underscores
"^[0-9]{1,}$"                            All non-negative integers (one or more digits)
"^\-{0,1}[0-9]{1,}$"                     All integers
"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$"     All decimals

The last example can be read as: at the start of the string (^), an optional minus sign (\-{0,1}), followed by zero or more digits ([0-9]{0,}), an optional decimal point (\.{0,1}), zero or more further digits ([0-9]{0,}), and then nothing else ($).
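These number patterns could be wrapped in small validation helpers, as in the sketch below. It assumes preg_match with slash delimiters; the function names is_integer_text and is_decimal_text are hypothetical. One thing the sketch makes visible: because every part of the decimal pattern is optional, it also accepts the empty string and a lone "-", so real validation code would probably need an extra check.

```php
<?php
// Hypothetical helper: optional minus sign, then one or more digits.
function is_integer_text(string $s): bool
{
    return preg_match('/^\-{0,1}[0-9]{1,}$/', $s) === 1;
}

// Hypothetical helper: the decimal pattern from the text, unchanged.
function is_decimal_text(string $s): bool
{
    return preg_match('/^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$/', $s) === 1;
}

var_dump(is_integer_text('-42'));   // bool(true)
var_dump(is_integer_text('4.2'));   // bool(false)
var_dump(is_decimal_text('-12.5')); // bool(true)
var_dump(is_decimal_text('.5'));    // bool(true)
var_dump(is_decimal_text('1.2.3')); // bool(false)
var_dump(is_decimal_text(''));      // bool(true): every part is optional
var_dump(is_decimal_text('-'));     // bool(true): same reason
```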

The special character "?" is equivalent to "{0,1}"; both mean "zero or one of the preceding content", that is, "the preceding content is optional". Therefore:

"^\-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$"

can be simplified to:

"^\-?[0-9]{0,}\.?[0-9]{0,}$"

The special character "*" is equivalent to "{0,}"; both mean "zero or more of the preceding content". The character "+" is equivalent to "{1,}", meaning "one or more of the preceding content". The four examples above can therefore be written as:

"^[a-zA-Z0-9_]+$"           Any string of one or more letters, digits, or underscores
"^[0-9]+$"                  All non-negative integers (one or more digits)
"^\-?[0-9]+$"               All integers
"^\-?[0-9]*\.?[0-9]*$"      All decimals
