An Introduction to PHP Regular Expressions

Introduction

Regular expressions come up often in development, and most languages now support them, including JavaScript, Java, .NET, and PHP. Today I would like to share my understanding of regular expressions with you. If anything here is wrong, please point it out!

Terms to know: how many of the following do you recognize?

Δ Delimiter
Δ Character class
Δ Modifier
Δ Quantifier
Δ Caret
Δ Lookaround (lookahead, lookbehind)
Δ Backreference
Δ Lazy match
Δ Comment
Δ Zero width

When to use them

When should we use regular expressions? They are not the best tool for every string operation: in PHP, using a regular expression where a plain string function would do can actually hurt performance. When we need to parse complex text data, however, regular expressions are the better choice.
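
As a rough illustration of that trade-off, the sketch below uses a plain string function for a simple check and a regular expression only for the structured extraction. The sample line and the variable names are made up for this example.

```php
<?php
// Hypothetical sample data, not taken from the article.
$line = 'user=alice; id=42; role=admin';

// Simple check: a plain string function is enough, no regex needed.
$hasRole = strpos($line, 'role=') !== false;

// Structured extraction: here a regular expression saves code.
$pairs = [];
if (preg_match_all('/(\w+)=(\w+)/', $line, $m, PREG_SET_ORDER)) {
    foreach ($m as $set) {
        $pairs[$set[1]] = $set[2]; // key => value
    }
}

var_dump($hasRole);
print_r($pairs);
```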

Advantages

When dealing with complex string operations, regular expressions can improve productivity and save a fair amount of code.

Disadvantages

Complex regular expressions make the code more complicated and harder to understand, so we sometimes need to add comments inside the regular expression itself.

Common patterns

¤ Delimiters: a pattern usually starts and ends with "/", but other characters such as "#" can be used as well.
When should you use "#"? Typically when the pattern itself contains many "/" characters (a URI, for example), each of which would otherwise have to be escaped.
The code that uses the "/" delimiter is as follows.

$regex = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

After a successful preg_match, $matches[0] contains the text that matched the entire pattern; $matches[1], $matches[2], and so on contain the text matched by each parenthesized group.

The code that uses the "#" delimiter is as follows. This time the "/" characters do not need to be escaped!

$regex = '#^http://([\w.]+)/([\w]+)/([\w]+)\.html$#i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

¤ Modifiers: used to change the behavior of a regular expression.

In '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i', the trailing "i" is a modifier that makes the match case-insensitive. Another commonly used modifier is "x", which makes the engine ignore whitespace in the pattern.

Example code:

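$regex = '/hello/';
$str = 'Hello world';
$matches = array();

// Without the i modifier, "hello" does not match "Hello", so nothing is printed.
if (preg_match($regex, $str, $matches)) {
    echo 'No i: valid successful!', "\n";
}

// With the i modifier appended, the match succeeds.
if (preg_match($regex . 'i', $str, $matches)) {
    echo 'YES i: valid successful!', "\n";
}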
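
The "x" modifier mentioned above can be sketched in the same way. In this minimal example (the date string and variable names are assumptions made for illustration), the spaced-out pattern behaves like the compact one, and a literal space has to be escaped once "x" is on.

```php
<?php
// With the x modifier, unescaped whitespace in the pattern is ignored,
// so the pattern can be spread out for readability.
$compact  = '/^(\d{4})-(\d{2})-(\d{2})$/';
$extended = '/^ (\d{4}) - (\d{2}) - (\d{2}) $/x';

$date = '2024-01-31'; // hypothetical input

$okCompact  = preg_match($compact, $date, $a);
$okExtended = preg_match($extended, $date, $b);

// A literal space must be escaped under x, otherwise it is dropped.
$spaceDropped = preg_match('/hello world/x', 'hello world');
$spaceEscaped = preg_match('/hello\ world/x', 'hello world');

var_dump($okCompact, $okExtended, $a === $b, $spaceDropped, $spaceEscaped);
```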

¤ Character class: in [\w], the part enclosed in square brackets is a character class. It matches any single character from the set it describes.

¤ Quantifiers: the symbol that follows an element, such as the {3,5} in [\w]{3,5}, the * in [\w]*, or the + in [\w]+, is a quantifier. Their meanings are as follows.

{3,5} means 3 to 5 repetitions, {3,} means 3 or more, {0,5} means at most 5, and {3} means exactly 3. (Write {0,5} rather than {,5}: older PCRE versions treat {,5} as literal text.)

* means 0 or more repetitions.

+ means 1 or more repetitions.
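
A small sketch of how these quantifiers behave. The helper check() is hypothetical, written only for this example; the ^ and $ anchors are added so that each pattern has to cover the whole string.

```php
<?php
// Hypothetical helper: true when $pattern matches $subject.
function check(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$results = [
    'range_too_short' => check('/^\w{3,5}$/', 'ab'),       // false: only 2 characters
    'range_ok'        => check('/^\w{3,5}$/', 'abcd'),     // true: 4 is within 3..5
    'range_too_long'  => check('/^\w{3,5}$/', 'abcdef'),   // false: 6 characters
    'at_least_three'  => check('/^\w{3,}$/', 'abcdefgh'),  // true: 3 or more
    'at_most_five'    => check('/^\w{0,5}$/', ''),         // true: 0 is allowed
    'exactly_three'   => check('/^\w{3}$/', 'abcd'),       // false: 4 characters
    'star_on_empty'   => check('/^\w*$/', ''),             // true: * allows 0
    'plus_on_empty'   => check('/^\w+$/', ''),             // false: + needs at least 1
];

var_dump($results);
```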

¤ Caret (^)

> Inside a character class (for example [^\w]) it negates the class, so the class matches anything that is not listed: "inverse selection".

> At the start of an expression it anchors the match to the beginning of the subject (/^n/i means "begins with n", in either case).

Note that "\" is commonly called the escape character. It is used to escape special symbols such as "." and "/".
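
The two roles of the caret and the effect of the escape character can be sketched as follows. The subject strings are arbitrary examples chosen for this illustration.

```php
<?php
// ^ inside a character class negates it: [^\w] matches one non-word character.
$firstNonWord = preg_match('/[^\w]/', 'abc-def', $m) ? $m[0] : null;

// ^ at the start of the pattern anchors the match to the start of the subject.
$startsWithN = preg_match('/^n/i', 'Nginx');   // subject begins with N
$hasInnerN   = preg_match('/^n/i', 'anchor');  // n is present, but not first

// Backslash escapes: an unescaped . matches any character, \. only a dot.
$looseDot   = preg_match('/^a.c$/', 'abc');
$strictDot  = preg_match('/^a\.c$/', 'abc');
$literalDot = preg_match('/^a\.c$/', 'a.c');

var_dump($firstNonWord, $startsWithN, $hasInnerN, $looseDot, $strictDot, $literalDot);
```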

¤ Lookarounds: assert that certain characters are, or are not, present immediately before or after the current position!

Lookarounds come in two kinds: lookaheads, written (?=...), and lookbehinds, written (?<=...).
> Format:
Lookahead: (?=...), with (?!...) as its negative form
Lookbehind: (?<=...), with (?<!...) as its negative form
A lookahead tests the characters that immediately follow; a lookbehind tests the characters that immediately precede.

The code is as follows:

$regex = '/(?<=c)d(?=e)/'; /* d immediately preceded by c and immediately followed by e */
$str = 'abcdefgk';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

Negative form:

The code is as follows:

$regex = '/(?<!c)d(?!e)/'; /* d not immediately preceded by c and not immediately followed by e */
$str = 'abcdefgk';
$matches = array();

// The only d in $str is preceded by c, so there is no match and nothing is dumped.
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

> Zero width
Code that verifies the zero-width behavior:

$regex = '/he(?=l)lo/i';
$str = 'HELLO';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

This prints no result!

The code is as follows:

$regex = '/he(?=l)llo/i';
$str = 'HELLO';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

This one prints the result!

Explanation: (?=l) asserts that "he" is followed by an l character, but (?=l) itself does not consume any characters. This distinguishes it from (l), which does consume one character. In the first pattern, the l checked by the lookahead therefore still has to be matched by the "lo" that follows, and that fails at the second L.
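
To make the difference between (?=l) and (l) visible, this short sketch runs both against the same subject and compares what ends up in the match arrays. The variable names are chosen for this example only.

```php
<?php
$str = 'HELLO';

// Lookahead: the L is checked but not consumed, and nothing is captured.
preg_match('/he(?=l)/i', $str, $lookahead);

// Ordinary group: the L is consumed and captured as group 1.
preg_match('/he(l)/i', $str, $group);

var_dump($lookahead); // one element: "HE"
var_dump($group);     // two elements: "HEL" and "L"
```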

Capturing data

Groups that do not have a type specified capture the text they match for later use.
> "Type" here refers to constructs such as lookarounds. In other words, only parentheses that do not begin with a question mark capture (named groups, described below, are the exception).

> A reference to a captured group from within the same expression is called a backreference.
> Format: a backslash followed by the group number (for example, \1).

The code is as follows:

$regex = '/^(Chuanshanjia)[\w\s!]+\1$/'; /* \1 must repeat the text captured by group 1 */
$str = 'Chuanshanjia thank Chuanshanjia';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

> Avoiding capture
Format: (?:pattern)
Advantage: the number of backreferences in use is kept to a minimum, which makes the code clearer.
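
A minimal sketch of the effect on the match array: the same URL is matched once with a capturing group for the scheme and once with a non-capturing one. The variable names are assumptions for illustration; the URL is the one used earlier in the article.

```php
<?php
$url = 'http://www.youku.com/show_page/id_ABCDEFG.html';

// Capturing group for the scheme: it occupies $capturing[1].
preg_match('#^(https?)://([\w.]+)#', $url, $capturing);

// Non-capturing group: the scheme is still matched but not stored,
// so the host becomes group 1.
preg_match('#^(?:https?)://([\w.]+)#', $url, $nonCapturing);

var_dump($capturing, $nonCapturing);
```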

> Named capture groups
Format: (?P<name>...), referenced later in the pattern with (?P=name)

The code is as follows:

$regex = '/(?P<author>chuanshanjia)[\s]is[\s](?P=author)/i';
$str = 'author:chuanshanjia is Chuanshanjia';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";
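
A named group can then be read from the match array by name as well as by number. The sketch below reuses the pattern from the example above; the variable $author is introduced here only for illustration.

```php
<?php
$regex = '/(?P<author>chuanshanjia)\sis\s(?P=author)/i';
$str = 'author:chuanshanjia is Chuanshanjia';

$author = null;
if (preg_match($regex, $str, $matches)) {
    // The captured text is available under both the name and the index.
    $author = $matches['author'];
}

var_dump($author, $matches[1], $matches[0]);
```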

Lazy match (remember: two steps are performed; see the principle below)

Format: a quantifier followed by ?

Principle: first match the part before the "?" with as few repetitions as possible, then try the expression to its right; as soon as the right-hand expression matches, the whole match ends.

First look at the following two pieces of code:

Code 1.

The code is as follows:

$regex = '/(")[^\1]+\1/i';
$str = '"A" "B" "C" "D"';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

Result 1: $matches[0] is the whole string, "A" "B" "C" "D".

Code 2.

The code is as follows:

$regex = '/(")[^\1]+?\1/i';
$str = '"A" "B" "C" "D"';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

Result 2: $matches[0] is only "A".

Analysis:
      Compare the two regular expressions: the second one adds "?" and the first one does not.
      Results: look at the first element of $matches. The first expression returns the entire string, while the second returns only "A".
      Conclusion:
           >> Substrings that satisfy (")[^\1]+\1 include
                     "A", "A" "B", "A" "B" "C", "A" "B" "C" "D", "B", "B" "C", "B" "C" "D", "C", "C" "D", "D"
                   The first regular expression selects the longest one, "A" "B" "C" "D", which shows that a non-lazy (greedy) match takes the longest possible result.

            >> Second regular expression: first match (")[^\1]+? with as few characters as possible, then try the \1 to the right of the "?"; if that succeeds, the whole match ends.

      Note: inside a character class, \1 is not a backreference; PCRE reads it as the character with octal code 1. [^\1] therefore matches any ordinary character here, which does not change the greedy-versus-lazy comparison.
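
The same contrast shows up when collecting every quoted item with preg_match_all. This is a minimal sketch using a simpler pattern than the one above; the patterns and variable names are assumptions made for this example.

```php
<?php
$str = '"A" "B" "C" "D"';

// Greedy: .+ runs to the last quote, so there is a single long match.
preg_match_all('/"(.+)"/', $str, $greedy);

// Lazy: .+? stops at the first closing quote, giving one match per item.
preg_match_all('/"(.+?)"/', $str, $lazy);

var_dump($greedy[1], $lazy[1]);
```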

Another case:

The code is as follows:
"Oh, \"my\" God" =====> /(")([^\1]|\\\1)*?(?<!\\)\1/i

Comments in regular expressions

Format: (?#comment text)
Purpose: mainly used to annotate complex patterns

Example code: a regular expression that parses the parameters used to connect to a MySQL database

The code is as follows:

$regex = '/
    ^host=(?<!\.)([\d.]+)(?!\.)   (?#host address)
    \|
    ([\w!@#$%^&*()_+-]+)          (?#user name)
    \|
    ([\w!@#$%^&*()_+-]+)          (?#password)
    (?!\|)$/ix';

$str = 'host=192.168.10.221|root|123456';
$matches = array();

if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}

echo "\n";

Special characters	Explanation
*	0 or more times
+	1 or more times; can also be written as {1,}
?	0 or 1 time
.	Matches any single character except a line break
\w	[A-Za-z0-9_]
\s	Whitespace characters (space, tab, line break, carriage return), e.g. [ \t\n\r]
\d	[0-9]
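
A quick sketch of the shorthand classes from the table in use. The subject strings are made-up examples, not data from the article.

```php
<?php
$str = "id_42\tok\n"; // hypothetical input containing a tab and a line break

preg_match_all('/\w+/', $str, $words);   // runs of word characters
preg_match_all('/\d+/', $str, $digits);  // runs of digits
preg_match_all('/\s/', $str, $spaces);   // single whitespace characters

// ? makes the preceding element optional.
$optional = preg_match('/^colou?r$/', 'color');

// By default . does not match a line break.
$dotNewline = preg_match('/^a.b$/', "a\nb");

var_dump($words[0], $digits[0], $spaces[0], $optional, $dotNewline);
```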

This article is an English version of an article originally published in Chinese on aliyun.com.
