Comparison between POSIX and Perl-Compatible Regular Expressions (preg_match, preg_replace, ereg, ereg_replace)

Source: Internet
Author: User

First, let's take a look at the two main functions of POSIX-style Regular Expressions:

ereg function: (regular expression matching)

Format: int ereg(string pattern, string string [, array &regs])
Note: The preg_match() function, which uses Perl-compatible regular expression syntax, is usually a faster alternative to ereg(), so preg_match() is generally the better choice. The ereg family was deprecated in PHP 5.3.0 and removed in PHP 7.0.0.

Searches string, case-sensitively, for a substring that matches the regular expression given in pattern. If pattern contains parenthesized subpatterns that match and the call supplies the third parameter regs, the matched substrings are stored in the regs array. $regs[1] contains the substring that starts at the first left parenthesis, $regs[2] contains the substring that starts at the second, and so on. $regs[0] contains the entire matched string.

Return value: if a match for pattern is found in string, the length of the matched string is returned. If no match is found or an error occurs, false is returned. If the optional parameter regs is not passed, or the length of the matched string is 0, this function returns 1.

Let's take a look at an example of the ereg() function.

The following code snippet accepts a date in ISO format (YYYY-MM-DD) and displays it in DD.MM.YYYY format:
<?php
if (ereg("([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})", $date, $regs)) {
    echo "$regs[3].$regs[2].$regs[1]";
} else {
    echo "Invalid date format: $date";
}
?>
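Because ereg() is not available in PHP 7 and later, the same check can be written with preg_match(). The following is only a minimal sketch: the helper name reformatIsoDate() is hypothetical, and the pattern is the one from the example above wrapped in "/" delimiters.

```php
<?php
// Hypothetical helper (not part of PHP): reformat YYYY-MM-DD as DD.MM.YYYY.
function reformatIsoDate(string $date): ?string
{
    // Same pattern as the ereg() example, with PCRE delimiters added.
    if (preg_match('/([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})/', $date, $regs)) {
        // $regs[1] = year, $regs[2] = month, $regs[3] = day
        return "$regs[3].$regs[2].$regs[1]";
    }
    return null; // no match
}

echo reformatIsoDate('2003-04-15'), "\n"; // 15.04.2003
```

Returning null for a non-matching input is a design choice of this sketch; the original example prints an error message instead.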

-----------------------------------------------------------------------------------
ereg_replace function: (regular expression replacement)

Format: string ereg_replace(string pattern, string replacement, string string)
Function description:
This function scans string for matches of pattern and replaces the matched text with replacement.
It returns the modified string. (If there is nothing to replace, the original string is returned unchanged.)
If pattern contains parenthesized substrings, replacement may contain references of the form \\digit, which are replaced by the text matching the corresponding parenthesized substring; \\0 produces the entire matched text. Up to nine substrings can be used. Parentheses may be nested, in which case they are counted by their opening parenthesis.
Let's take a look at some examples of this function:
1. The following code snippet outputs "This was a test" three times:
<?php
$string = "This is a test";
echo str_replace(" is", " was", $string);
echo ereg_replace("( )is", "\\1was", $string);
echo ereg_replace("(( )is)", "\\2was", $string);
?>
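For comparison, the two ereg_replace() calls above can be sketched with preg_replace(). This is an illustrative rewrite rather than part of the original example; the variable names $a and $b are assumptions, and the ${1} form is used for the backreference.

```php
<?php
$string = "This is a test";

// "( )is" captures the space before the standalone word "is".
$a = preg_replace('/( )is/', '${1}was', $string);

// With nested parentheses, group 2 is the inner one (the space).
$b = preg_replace('/(( )is)/', '${2}was', $string);

echo $a, "\n"; // This was a test
echo $b, "\n"; // This was a test
```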

Note that if an integer is passed as the replacement parameter, you may not get the expected result. This is because ereg_replace() interprets the number as the ordinal value of a character and applies it as such. For example:
2. Replacement with an integer:
<?php
/* This will not work as expected. */
$num = 4;
$string = "This string has four words.";
$string = ereg_replace('four', $num, $string);
echo $string;   /* Output: 'This string has   words.' */

/* This will work. */
$num = '4';
$string = "This string has four words.";
$string = ereg_replace('four', $num, $string);
echo $string;   /* Output: 'This string has 4 words.' */
?>
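When porting this kind of code to preg_replace(), one way to sidestep any doubt about how a numeric replacement is treated is to cast it to a string explicitly. The snippet below is a small sketch under that assumption; the variable $result is introduced only for illustration.

```php
<?php
$num = 4;
$string = "This string has four words.";

// Cast the integer explicitly so the replacement is the text "4".
$result = preg_replace('/four/', (string) $num, $string);

echo $result, "\n"; // This string has 4 words.
```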

3. Replace URLs with hyperlinks:
$text = ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]",
                     "<a href=\"\\0\">\\0</a>", $text);

Tip: The preg_replace() function, which uses Perl-compatible regular expression syntax, is usually a faster alternative to ereg_replace().
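As a rough illustration of that tip, the URL-to-hyperlink replacement can be sketched with preg_replace(). The function name linkify() is hypothetical; the pattern is the POSIX one from the ereg_replace() example, wrapped in "~" delimiters so that the slashes need no escaping, and $0 stands for the whole match.

```php
<?php
// Hypothetical helper: wrap anything that looks like scheme://... in a link.
function linkify(string $text): string
{
    return preg_replace(
        '~[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]~',
        '<a href="$0">$0</a>',
        $text
    );
}

echo linkify('See http://www.php.net/ for details.'), "\n";
// See <a href="http://www.php.net/">http://www.php.net/</a> for details.
```

The trailing `[[:alnum:]/]` keeps sentence punctuation such as a final period out of the link.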
Next, let's take a look at the two main functions for Perl-compatible regular expressions:
preg_match function: (regular expression matching)
Format: int preg_match(string pattern, string subject [, array matches [, int flags]])
Function description:
Searches subject for a match of the regular expression given in pattern.
If matches is provided, it is filled with the results of the search. $matches[0] will contain the text that matched the whole pattern, $matches[1] will contain the text that matched the first captured parenthesized subpattern, and so on.
flags can be the following flag:
PREG_OFFSET_CAPTURE
If this flag is set, the string offset of each match is also returned. Note that this changes the returned array so that every element is itself an array, whose first item is the matched string and whose second item is its offset in subject.
The flags parameter is available since PHP 4.3.0.
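The effect of PREG_OFFSET_CAPTURE on the shape of the matches array can be seen in a small sketch. The pattern and subject string here are made up purely for illustration.

```php
<?php
// With PREG_OFFSET_CAPTURE every entry becomes [matched text, byte offset].
preg_match('/(foo)(bar)/', 'xxfoobar', $matches, PREG_OFFSET_CAPTURE);

print_r($matches[0]); // whole match:  'foobar' at offset 2
print_r($matches[1]); // first group:  'foo'    at offset 2
print_r($matches[2]); // second group: 'bar'    at offset 5
```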
preg_match() returns the number of times pattern matches. That is either 0 (no match) or 1, because preg_match() stops searching after the first match. preg_match_all(), by contrast, continues searching until it reaches the end of subject. If an error occurs, preg_match() returns false.
Tip: If you only want to check whether one string is contained in another, do not use preg_match(). Use strpos() or strstr() instead, as they are much faster.
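A plain substring test along those lines might look like the following sketch. The variable names are assumptions; the important detail is the strict comparison, because strpos() returns offset 0 for a match at the very start of the string, and 0 is falsy.

```php
<?php
$haystack = "PHP is the web scripting language of choice.";

// strpos() returns the offset of the first occurrence, or false if absent.
// Always compare with !== false so that a match at offset 0 still counts.
$hasPhp  = strpos($haystack, "PHP") !== false;
$hasPerl = strpos($haystack, "Perl") !== false;

var_dump($hasPhp);  // bool(true)
var_dump($hasPerl); // bool(false)
```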
Let's take a look at its example:
Example 1. Search for "php" in the text:
<?php
// The "i" after the pattern delimiter indicates a case-insensitive search.
if (preg_match("/php/i", "PHP is the web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
?>

Example 2. Search for the word "web":
<?php
/* The \b in the pattern indicates a word boundary, so only the distinct
 * word "web" is matched, and not part of a word such as "webbing" or "cobweb" */
if (preg_match("/\bweb\b/i", "PHP is the web scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
if (preg_match("/\bweb\b/i", "PHP is the website scripting language of choice.")) {
    print "A match was found.";
} else {
    print "A match was not found.";
}
?>

Example 3. Retrieve the domain name from a URL:
<?php
// Get the host name from the URL
preg_match("/^(http:\/\/)?([^\/]+)/i",
    "http://www.php.net/index.html", $matches);
$host = $matches[2];
// Get the last two segments of the host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "domain name is: {$matches[0]}\n";
?>

This example will output:
domain name is: php.net
-----------------------------------------------------------------------------------
preg_replace function: (regular expression search and replace)
Format: mixed preg_replace(mixed pattern, mixed replacement, mixed subject [, int limit])
Function description:
Searches subject for matches of pattern and replaces them with replacement. If limit is specified, at most limit matches are replaced. If limit is omitted or is -1, all matches are replaced.
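The limit parameter can be illustrated with a short sketch; the subject string and variable names below are made up for the purpose.

```php
<?php
$subject = "a-b-c-d";

$all      = preg_replace('/-/', '+', $subject);     // limit omitted
$firstTwo = preg_replace('/-/', '+', $subject, 2);  // at most two replacements
$explicit = preg_replace('/-/', '+', $subject, -1); // -1 means no limit

echo $all, "\n";      // a+b+c+d
echo $firstTwo, "\n"; // a+b-c-d
echo $explicit, "\n"; // a+b+c+d
```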
replacement may contain references of the form \\n or (since PHP 4.0.4) $n, with the latter being preferred. Each such reference is replaced by the text captured by the n-th parenthesized subpattern. n can be from 0 to 99, and \\0 or $0 refers to the text matched by the whole pattern. Opening parentheses are counted from left to right (starting from 1) to obtain the number of the capturing subpattern.
When a backreference in replacement is immediately followed by another digit (that is, a literal number placed right after a matched subpattern), the familiar \\1 notation cannot be used. For example, \\11 leaves preg_replace() unable to tell whether you want the \\1 backreference followed by a literal 1, or the \\11 backreference. The solution is to use \${1}1, which creates an isolated $1 backreference and leaves the 1 as a literal.
Let's take a look at some examples:
Example 1. A backreference followed by a digit:
<?php
$string = "April 15, 2003";
$pattern = "/(\w+) (\d+), (\d+)/i";
$replacement = "\${1}1,\$3";
print preg_replace($pattern, $replacement, $string);
/* Output
   ======
April1,2003
*/
?>

If matches are found, the new subject is returned; otherwise subject is returned unchanged.
Every parameter of preg_replace() (except limit) can be an array. If both pattern and replacement are arrays, they are processed in the order in which their keys appear in the array. This is not necessarily the same as the numerical order of the indexes. If you use indexes to identify which pattern should be replaced by which replacement, call ksort() on each array before calling preg_replace().
Example 2. Using indexed arrays with preg_replace():
<?php
$string = "The quick brown fox jumped over the lazy dog.";
$patterns[0] = "/quick/";
$patterns[1] = "/brown/";
$patterns[2] = "/fox/";
$replacements[2] = "bear";
$replacements[1] = "black";
$replacements[0] = "slow";
print preg_replace($patterns, $replacements, $string);
/* Output
   ======
The bear black slow jumped over the lazy dog.
*/
/* By ksorting patterns and replacements,
   we should get what we wanted. */
ksort($patterns);
ksort($replacements);
print preg_replace($patterns, $replacements, $string);
/* Output
   ======
The slow black bear jumped over the lazy dog.
*/
?>

If subject is an array, the search and replace is performed on every entry of subject, and an array is returned.
If both pattern and replacement are arrays, preg_replace() takes a value from each array and uses them to search and replace on subject. If replacement has fewer values than pattern, an empty string is used for the remaining replacement values. If pattern is an array and replacement is a string, this string is used as the replacement for every value of pattern. The converse (a string pattern with an array replacement) would not make sense.
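The array rules described above can be checked with a few one-line calls. This is only a sketch; the sample strings and variable names are invented for illustration.

```php
<?php
$subject = "The quick brown fox";

// Fewer replacements than patterns: "/brown/" is replaced by an empty string.
$fewer = preg_replace(['/quick/', '/brown/'], ['slow'], $subject);

// A string replacement is applied to every pattern in the array.
$same = preg_replace(['/quick/', '/brown/'], 'red', $subject);

// An array subject yields an array with each entry processed.
$each = preg_replace('/o/', '0', ['fox', 'dog']);

echo $fewer, "\n";            // The slow  fox   (note the two spaces)
echo $same, "\n";             // The red red fox
echo implode(',', $each), "\n"; // f0x,d0g
```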
The /e modifier makes preg_replace() treat the replacement parameter as PHP code (after the appropriate backreferences have been substituted). Tip: make sure that replacement constitutes a valid PHP code string; otherwise PHP will report a parse error on the line containing preg_replace(). Note that the /e modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0.
Example 3. Replacing several values:
<?php
$patterns = array("/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/",
                  "/^\s*{(\w+)}\s*=/");
$replace = array("\\3/\\4/\\1\\2", "$\\1 =");
print preg_replace($patterns, $replace, "{startdate} = 1999-5-27");
?>

This example will output:
$startdate = 5/27/1999
Example 4. Using the /e modifier:
<?php
preg_replace("/(<\/?)(\w+)([^>]*>)/e",
             "'\\1'.strtoupper('\\2').'\\3'",
             $html_body);
?>

This converts all HTML tags in the input string to uppercase.
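On PHP versions without the /e modifier, the usual substitute is preg_replace_callback(), which passes each match to a function instead of evaluating a code string. The sketch below applies the same pattern as Example 4; the sample value of $html_body and the variable $result are assumptions for illustration.

```php
<?php
$html_body = '<p class="x">Hello <b>world</b></p>';

// $m[1] = "<" or "</", $m[2] = tag name, $m[3] = rest of the tag up to ">".
$result = preg_replace_callback(
    '/(<\/?)(\w+)([^>]*>)/',
    function (array $m): string {
        return $m[1] . strtoupper($m[2]) . $m[3];
    },
    $html_body
);

echo $result, "\n"; // <P class="x">Hello <B>world</B></P>
```

Only the tag name is uppercased; attributes and text content are left as they are.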
Example 5. Convert HTML to text:
<?php
// $document should contain an HTML document.
// This example removes HTML tags, JavaScript sections
// and white space. It also converts some common
// HTML entities to their text equivalents.
$search = array ("'<script[^>]*?>.*?</script>'si",  // strip out JavaScript
                 "'<[\/\!]*?[^<>]*?>'si",           // strip out HTML tags
                 "'([\r\n])[\s]+'",                 // strip out white space
                 "'&(quot|#34);'i",                 // replace HTML entities
                 "'&(amp|#38);'i",
                 "'&(lt|#60);'i",
                 "'&(gt|#62);'i",
                 "'&(nbsp|#160);'i",
                 "'&(iexcl|#161);'i",
                 "'&(cent|#162);'i",
                 "'&(pound|#163);'i",
                 "'&(copy|#169);'i",
                 "'&#(\d+);'e");                    // evaluate as PHP code
$replace = array ("",
                  "",
                  "\\1",
                  "\"",
                  "&",
                  "<",
                  ">",
                  " ",
                  chr(161),
                  chr(162),
                  chr(163),
                  chr(169),
                  "chr(\\1)");
$text = preg_replace($search, $replace, $document);
?>

The end...
