Introduction
Regular expressions come up often in development, and most languages support them today, including JavaScript, Java, .NET, and PHP. In this post I share my own understanding of regular expressions; if anything here is wrong, please point it out!
Terms to know: how many of the following are you familiar with?
Delimiters, character classes, modifiers, quantifiers, the caret, lookarounds (lookahead and lookbehind), backreferences, lazy matching, comments, zero-width assertions
Positioning
When should we use regular expressions? They are not the best tool for every string operation, and in PHP a regular expression can actually be slower than a plain string function for simple tasks. When we need to parse complex text data, a regular expression is the better choice.
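As a rough illustration of this trade-off, the sketch below checks for a fixed substring with strpos() and keeps preg_match() for input that really has structure. The sample string and variable names are made up for the example.

```php
<?php
// Hypothetical sample input, not taken from the original post.
$line = 'GET /index.html HTTP/1.1';

// A fixed substring needs no regular expression: strpos() is enough.
$hasHtml = strpos($line, '.html') !== false;

// Structured text (method, path, version) is where a pattern pays off.
$parsed = preg_match('#^(\w+) (\S+) HTTP/(\d\.\d)$#', $line, $m) === 1;

var_dump($hasHtml, $parsed, $m[2]);
```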
Advantages
When dealing with complex string operations, regular expressions can improve productivity and save a fair amount of code.
Disadvantages
A complex regular expression makes the code harder to read and understand, so we sometimes need to add comments inside the regular expression itself.
Common patterns
Delimiters: a pattern usually starts and ends with "/" as its delimiter; "#" can be used as well.
When should you use "#"? Typically when the string to match contains many "/" characters, as a URI does, because every "/" inside a "/"-delimited pattern has to be escaped.
The code that uses the "/" delimiter is as follows.
$regex = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
After preg_match, $matches[0] holds the text that matched the entire pattern.
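To make the layout of $matches concrete, here is a small sketch that reuses the pattern and URL from the example above and prints each entry separately. Index 0 is the whole match; indexes 1 to 3 are the parenthesized groups in order.

```php
<?php
$regex = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';

if (preg_match($regex, $str, $matches)) {
    echo $matches[0], "\n"; // the whole URL
    echo $matches[1], "\n"; // first group: the host
    echo $matches[2], "\n"; // second group: first path segment
    echo $matches[3], "\n"; // third group: file name without ".html"
}
```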
The code that uses the "#" delimiter is as follows. This time the "/" characters are not escaped!
$regex = '#^http://([\w.]+)/([\w]+)/([\w]+)\.html$#i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Modifiers: used to change the behavior of a regular expression.
The trailing "i" in '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i' is a modifier that makes the match case-insensitive. Another common modifier is "x", which ignores whitespace in the pattern.
Example code:
$regex = '/hello/';
$str = 'Hello word';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    echo 'No i: valid, successful!', "\n";
}
if (preg_match($regex . 'i', $str, $matches)) {
    echo 'YES i: valid, successful!', "\n";
}
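The "x" modifier can be sketched the same way. In the example below (the date string and variable names are made up), the same spaced-out pattern only matches when "x" tells the engine to ignore the whitespace inside it.

```php
<?php
$str = '2012-03';

// Without "x" the spaces in the pattern are literal, so this fails.
$withoutX = preg_match('/^ (\d{4}) - (\d{2}) $/', $str);

// With "x" the whitespace in the pattern is ignored, so this succeeds.
$withX = preg_match('/^ (\d{4}) - (\d{2}) $/x', $str, $matches);

var_dump($withoutX, $withX, $matches);
```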
Character class: the part enclosed in square brackets, such as [\w], is a character class.
Quantifiers: the symbols that follow the class in [\w]{3,5}, [\w]* or [\w]+ are quantifiers. Their meanings are as follows.
{3,5} means 3 to 5 characters, {3,} means 3 or more, {0,5} means at most 5, and {3} means exactly 3. * means zero or more, and + means one or more. Write {0,5} rather than {,5}: older PCRE versions treat {,5} as literal text.
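The quantifiers above can be checked with a few one-liners. This is only a sketch: matchesPattern() is a helper invented for this example, and the patterns are anchored with ^ and $ so that the length limits apply to the whole string.

```php
<?php
// Hypothetical helper for this example: true when $subject matches $pattern.
function matchesPattern(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matchesPattern('/^\w{3,5}$/', 'ab'));       // false: fewer than 3
var_dump(matchesPattern('/^\w{3,5}$/', 'abcd'));     // true: 4 is within 3..5
var_dump(matchesPattern('/^\w{3,}$/', 'abcdefgh'));  // true: at least 3
var_dump(matchesPattern('/^\w{0,5}$/', 'abcdef'));   // false: more than 5
var_dump(matchesPattern('/^\w{3}$/', 'abc'));        // true: exactly 3
var_dump(matchesPattern('/^\w*$/', ''));             // true: * allows zero
var_dump(matchesPattern('/^\w+$/', ''));             // false: + needs one
```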
Caret
Inside a character class (e.g. [^\w]) it means negation (anything not in the class), a "reverse selection".
At the start of an expression it means the subject must begin with what follows (/^n/i matches text that begins with n).
Note that "\" is the escape character. It is used to escape special symbols such as "." and "/".
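The two roles of the caret and the effect of escaping can be seen side by side in this short sketch (the sample strings are made up):

```php
<?php
// Inside a class, ^ negates: remove everything that is not a word character.
$clean = preg_replace('/[^\w]/', '', 'a-b c!d');

// At the start of the pattern, ^ anchors the match to the beginning.
$startsWithN = preg_match('/^n/i', 'Nginx');
$nInMiddle   = preg_match('/^n/i', 'an');

// An unescaped "." matches any character; "\." matches only a literal dot.
$anyChar    = preg_match('/^a.b$/', 'axb');
$literalDot = preg_match('/^a\.b$/', 'axb');

var_dump($clean, $startsWithN, $nInMiddle, $anyChar, $literalDot);
```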
Lookarounds: assert that certain characters are, or are not, present at a position in the string!
Lookarounds come in two kinds: lookaheads (checking forward, (?=...)) and lookbehinds (checking backward, (?<=...)).
Format:
Lookahead: (?=...), with (?!...) as its negative form; lookbehind: (?<=...), with (?<!...) as its negative form.
Characters immediately before and after:
$regex = '/(?<=c)d(?=e)/'; /* d is preceded by c and followed by e */
$str = 'abcdefgk';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Negative form:
$regex = '/(?<!c)d(?!e)/'; /* d is not preceded by c and not followed by e; no match here */
$str = 'abcdefgk';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
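A common practical use of lookarounds is inserting thousands separators. The following is a sketch of that idea, not something from the original examples; the sample number is made up.

```php
<?php
// Match every position that is followed by complete groups of three digits
// up to the end of the number. \B rules out the very start of the string.
// The lookaheads consume nothing, so no digit is replaced.
$formatted = preg_replace('/\B(?=(\d{3})+(?!\d))/', ',', '1234567');

echo $formatted, "\n"; // 1,234,567
```

Because only positions are matched, preg_replace inserts the commas between digits instead of overwriting them.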
Zero width
Code that verifies a lookaround is zero-width:
$regex = '/he(?=l)lo/i';
$str = 'HELLO';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
This prints no result!
$regex = '/he(?=l)llo/i';
$str = 'HELLO';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
This prints the result!
Explanation: (?=l) asserts that "he" is followed by an l. But (?=l) itself consumes no characters, unlike (l), which consumes one character.
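The difference between asserting and consuming shows up directly in the matched text. A minimal sketch, using the same 'HELLO' string:

```php
<?php
preg_match('/he(?=l)/i', 'HELLO', $lookahead); // only asserts that l follows
preg_match('/he(l)/i', 'HELLO', $group);       // actually consumes the l

echo $lookahead[0], "\n"; // HE
echo $group[0], "\n";     // HEL
```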
Capturing data
Groups with no type specified are captured for later use.
"Type" here refers to the (?...) syntax used by lookarounds, so only parentheses that do not start with a question mark capture. (Named groups are the exception.)
A reference to a captured group from within the same expression is called a backreference.
Format: \number (for example \1).
$regex = '/^(Chuanshanjia)[\w\s!]+\1$/';
$str = 'Chuanshanjia thank Chuanshanjia';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Avoiding capture
Format: (?:pattern)
Advantage: the number of backreference slots is kept to a minimum, and the code is clearer.
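The effect on $matches can be sketched by running the same pattern twice, once with a capturing group for the scheme and once with (?:...). The URL and variable names are only for illustration.

```php
<?php
$str = 'http://www.youku.com';

// Capturing: the scheme takes slot 1 and the host moves to slot 2.
preg_match('/^(http|https):\/\/([\w.]+)$/', $str, $capturing);

// Non-capturing: the scheme is grouped but not stored, so the host is slot 1.
preg_match('/^(?:http|https):\/\/([\w.]+)$/', $str, $nonCapturing);

var_dump($capturing, $nonCapturing);
```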
Named capture groups
Format: (?P<name>pattern), referenced with (?P=name)
$regex = '/(?P<author>Chuanshanjia)[\s]is[\s](?P=author)/i';
$str = 'Author:chuanshanjia is Chuanshanjia';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Run results: $matches holds the whole match at index 0, and the captured text under both the key 'author' and the index 1.
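Reading a named group back is the main reason to name it. The sketch below is an all-lowercase variant of the example (so it does not need the "i" modifier) and accesses the capture by name and by number.

```php
<?php
$regex = '/(?P<author>chuanshanjia)\sis\s(?P=author)/';
$str = 'author:chuanshanjia is chuanshanjia';

if (preg_match($regex, $str, $matches)) {
    echo $matches['author'], "\n"; // access by name
    echo $matches[1], "\n";        // the same capture, by number
}
```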
Lazy matching (remember that two steps take place; see the principle below)
Format: quantifier followed by ?
Principle: a "?" placed after a quantifier makes it match as little as possible. "*" starts from 0 repetitions, "+" from 1, and {3,5} from 3; more are taken only if the rest of the pattern requires them.
First look at the following code samples:
Code 1
$regex = '/hel*/i';
$str = 'hellllllllllllllll';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 1: the greedy "l*" takes every l, so the match is the whole string.
Code 2
$regex = '/hel*?/i';
$str = 'hellllllllllllllll';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 2: the lazy "l*?" takes no l at all, so the match is "he".
Code 3, using "+"
$regex = '/hel+?/i';
$str = 'hellllllllllllllll';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 3: the lazy "l+?" takes a single l, so the match is "hel".
Code 4, using {3,10}
$regex = '/hel{3,10}?/i';
$str = 'hellllllllllllllll';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 4: the lazy "l{3,10}?" takes the minimum of three, so the match is "helll".
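Lazy matching matters most when something follows the quantifier. As a sketch (the HTML snippet is made up), compare how far ".*" and ".*?" run before the closing tag:

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* runs to the last </b> in the string.
preg_match('/<b>.*<\/b>/', $html, $greedy);

// Lazy: .*? stops at the first </b> it can reach.
preg_match('/<b>.*?<\/b>/', $html, $lazy);

echo $greedy[0], "\n"; // <b>one</b> and <b>two</b>
echo $lazy[0], "\n";   // <b>one</b>
```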
Comments in regular expressions
Format: (?#comment text)
Purpose: mainly used to annotate complex patterns
Example code: a regular expression that parses a MySQL connection string
$regex = '/
    ^host=(?<!\.)([\d.]+)(?!\.)    (?#host address)
    \|
    ([\w!@#$%^&*()_+\-]+)          (?#user name)
    \|
    ([\w!@#$%^&*()_+\-]+)          (?#password)
    (?!\|)$/ix';
// The field separator did not survive in this copy of the post; "|" is assumed.
$str = 'host=192.168.10.221|root|123456';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
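When the "x" modifier is on, PCRE also accepts "#" comments that run to the end of the line, which some people find easier to read than (?#...). A minimal sketch with a made-up date pattern; note that a literal "#" would then need to be escaped or placed inside a character class.

```php
<?php
$regex = '/
    ^(\d{4})   # year
    -(\d{2})   # month
    -(\d{2})$  # day
/x';

if (preg_match($regex, '2012-03-12', $m)) {
    var_dump($m);
}
```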
Original link: http://www.cnblogs.com/baochuan/archive/2012/03/12/2391135.html