Six commonly used PHP functions for processing regular expressions

Source: Internet
Author: User

PHP provides six commonly used functions for working with POSIX regular expressions. This family was deprecated in PHP 5.3 and removed in PHP 7.0, so the examples in this article assume a PHP version that still provides it. All six take a regular expression as their first argument:
ereg: The most commonly used regular expression function. ereg searches a string for a match to a regular expression.
ereg_replace: Searches a string for matches to a regular expression and replaces every match with a new string.
eregi: Works the same way as ereg, but ignores case.
eregi_replace: Has the same search-and-replace functionality as ereg_replace, but ignores case.
split: Splits a string into pieces at every match of a regular expression and returns the pieces as an array of strings.
spliti: The case-insensitive version of split.
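For readers on PHP 7 or later, where the ereg family is no longer available, the six functions listed above correspond roughly to the PCRE functions preg_match, preg_replace and preg_split. The following is a minimal sketch of that correspondence, not code from the original article; the sample strings are invented for illustration.

```php
<?php
// ereg / eregi -> preg_match, with the "i" flag for the case-insensitive form
$found  = preg_match('/world/', 'Hello world');   // 1 when the pattern is found
$foundI = preg_match('/WORLD/i', 'Hello world');  // case is ignored

// ereg_replace / eregi_replace -> preg_replace
$replaced  = preg_replace('/o/', '0', 'Hello World'); // replaces every "o"
$replacedI = preg_replace('/h/i', 'J', 'Hello');      // "h" also matches "H"

// split / spliti -> preg_split (returns the pieces between the matches)
$parts  = preg_split('/[,;]/', 'a,b;c');
$partsI = preg_split('/x/i', '1X2x3');

echo $found, ' ', $foundI, "\n";
echo $replaced, ' ', $replacedI, "\n";
echo implode('|', $parts), ' ', implode('|', $partsI), "\n";
```

Note that PCRE patterns are wrapped in delimiters (here `/`), which the POSIX functions did not require.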
Why use regular expressions?
If you keep writing new functions to check or manipulate parts of a string, you may want to discard them all and use regular expressions instead. If you answer "yes" to either of the following questions, you should consider using regular expressions:
Are you writing custom functions to check form data (for example, that an email address contains one @ and one dot)?
Do you write custom functions that loop through each character in a string and replace the character if it matches a particular criterion (for example, it is uppercase, or it is a space)?
Besides being an awkward way to check and manipulate strings, both approaches will slow down your program if the code is not written efficiently. Would you rather check an email address with the following code:

<?php
function validateEmail($email)
{
    // strpos() returns false when the character is missing;
    // position 0 is a valid hit, so compare strictly against false
    $hasAtSymbol = strpos($email, "@") !== false;
    $hasDot = strpos($email, ".") !== false;
    return $hasAtSymbol && $hasDot;
}
echo validateEmail("mitchell@devarticles.com");
?>

... or with the following code:

<?php
function validateEmail($email)
{
    // letters, then @, then letters, a literal dot, and letters again
    return ereg("^[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+$", $email);
}
echo validateEmail("mitchell@devarticles.com");
?>

To be sure, the first function is simple and looks well structured. But isn't the second version of the email check easier still?
The second function uses nothing but a regular expression, in a single call to ereg. ereg returns a true value if its string argument matches the regular expression and FALSE if it does not.
Many programmers avoid regular expressions because they are (in some cases) slower than other text-processing methods. Regular expressions can be slow because the engine copies strings around in memory as each new part of the expression is matched against the string. In my experience, however, unless you run a complex regular expression over hundreds of lines of text, the performance cost is negligible, and that is rarely the case when a regular expression is used to validate input data.
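To make the comparison between the two validators concrete, here is a small sketch that runs both styles of check on a few inputs. It assumes PHP 7 or later and therefore uses preg_match in place of ereg; the function names validateEmailLoose and validateEmailPattern and the test inputs are hypothetical, chosen only for this illustration.

```php
<?php
// Hand-written check: only looks for the presence of "@" and "."
function validateEmailLoose(string $email): bool
{
    return strpos($email, '@') !== false && strpos($email, '.') !== false;
}

// Pattern-based check with the same shape as the article's expression
function validateEmailPattern(string $email): bool
{
    return preg_match('/^[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+$/', $email) === 1;
}

foreach (['mitchell@devarticles.com', '.@', 'no-at-sign.com'] as $candidate) {
    printf(
        "%-26s loose=%s pattern=%s\n",
        $candidate,
        var_export(validateEmailLoose($candidate), true),
        var_export(validateEmailPattern($candidate), true)
    );
}
```

The input ".@" passes the hand-written check but not the pattern, which is the kind of gap a regular expression closes with no extra code.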
Regular expression syntax
Before you can match a string against a regular expression, you must first build the expression. The syntax looks a little odd at first; each element of the expression represents one kind of search feature. The following are some of the most common elements, each with an example of how to use it:
String head
To match at the head of a string, use ^. For example,

<?php echo ereg("^hello", "hello world!"); ?>

will return true, but

<?php echo ereg("^hello", "i say hello world"); ?>

will return false, because "hello" is not at the head of the string "i say hello world".
String tail
To match at the tail of a string, use $. For example,

<?php echo ereg("bye$", "goodbye"); ?>

will return true, but

<?php echo ereg("bye$", "goodbye my friend"); ?>

will return false, because "bye" is not at the tail of the string "goodbye my friend".
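The two anchors can be tried side by side. The sketch below assumes PHP 7 or later, so it uses preg_match (with `/` delimiters) rather than ereg; the helper name matches is hypothetical.

```php
<?php
// Hypothetical helper: true when the pattern is found in the subject
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/^hello/', 'hello world!'));      // "hello" at the head
var_dump(matches('/^hello/', 'i say hello world')); // "hello" present, but not at the head
var_dump(matches('/bye$/', 'goodbye'));             // "bye" at the tail
var_dump(matches('/bye$/', 'goodbye my friend'));   // "bye" present, but not at the tail
```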
Any single character
To match any single character, use a dot (.). For example,

<?php echo ereg(".", "cat"); ?>

will return true, but

<?php echo ereg(".", ""); ?>

will return false, because the search string contains no characters. You can tell the regular expression engine how many characters to match by using curly braces. To match 5 characters, I can use ereg like this:

<?php echo ereg(".{5}$", "12345"); ?>

The code above tells the regular expression engine to return true if and only if the string ends with 5 consecutive characters, that is, if it is at least 5 characters long. We can also give a range for the number of consecutive characters:

<?php echo ereg("a{1,3}$", "aaa"); ?>

In the example above, we tell the regular expression engine that the search string matches only if it has between 1 and 3 "a" characters at its tail.

<?php echo ereg("a{1,3}$", "aaab"); ?>

The example above will not return true: although there are three "a" characters in the search string, they are not at the tail of the string. If we removed the end-of-string anchor $ from the regular expression, the string would match.
We can also tell the regular expression engine to match at least a certain number of consecutive characters, and more of them if they exist. We can do it like this:

<?php echo ereg("a{3,}$", "aaaa"); ?>
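The curly-brace quantifiers described in this section can be checked with a short script. This is a sketch assuming PHP 7 or later, so preg_match stands in for ereg; the helper name matches is hypothetical.

```php
<?php
// Hypothetical helper: true when the pattern is found in the subject
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/.{5}$/', '12345'));   // five characters before the end
var_dump(matches('/.{5}$/', '1234'));    // only four characters available
var_dump(matches('/a{1,3}$/', 'aaa'));   // one to three "a" at the tail
var_dump(matches('/a{1,3}$/', 'aaab'));  // the tail is "b", so no match
var_dump(matches('/a{1,3}/', 'aaab'));   // without $, the run of "a" is found
var_dump(matches('/a{3,}$/', 'aaaa'));   // three or more "a" at the tail
```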

Zero or more repetitions
To tell the regular expression engine that a character may be absent or may be repeated any number of times, we use the * character. Both examples here will return true.

<?php echo ereg("t*", "tom"); ?>
<?php echo ereg("t*", "fom"); ?>

Even though the second string does not contain the character "t", the call still returns true, because * means the character may appear but does not have to. In fact, any ordinary string would make the ereg call above return true, because the "t" character is optional.
One or more repetitions
To tell the regular expression engine that a character must be present and may be repeated more than once, we use the + character, like this:

<?php echo ereg("z+", "i like the zoo"); ?>

The following example also returns true:

<?php echo ereg("z+", "i like the zzzzzzoo!"); ?>

Zero or one repetition
We can also tell the regular expression engine that a character must either appear exactly once or not appear at all. We use the ? character for this, like so:

<?php echo ereg("c?", "cats are fuzzy"); ?>

If we wanted to, we could remove the "c" from the search string above entirely and the expression would still return true. The ? means that a "c" may appear anywhere in the search string, but does not have to.
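The three repetition operators *, + and ? are easier to tell apart when anchored to something. The sketch below assumes PHP 7 or later and uses preg_match in place of ereg; the helper name matches and the sample strings are hypothetical.

```php
<?php
// Hypothetical helper: true when the pattern is found in the subject
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/t*/', 'fom'));                 // zero "t" is acceptable
var_dump(matches('/z+/', 'i like the zoo'));      // at least one "z" is present
var_dump(matches('/z+/', 'i like the park'));     // no "z" at all
var_dump(matches('/^c?ats/', 'cats are fuzzy'));  // optional "c" is present
var_dump(matches('/^c?ats/', 'ats are fuzzy'));   // optional "c" is absent
var_dump(matches('/^c?ats/', 'ccats are fuzzy')); // two "c" is one too many
```

Anchoring with ^ makes the difference visible: an unanchored `c?` on its own succeeds on every string, just like `t*`.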
Regular expression syntax (cont.)
Space characters
To match a space character in a search string, we use the predefined POSIX class [[:space:]]. The outer square brackets define a set of characters, and ":space:" is the actual class to match (in this case, any whitespace character). Whitespace includes tab characters, newline characters, and space characters. Alternatively, if the search string must contain exactly a space, not a tab or a newline, you can use a literal space character (" "). In most cases I prefer ":space:", because it makes my intent clearer than a single space character, which is easily overlooked.
There are other POSIX-standard predefined classes that we can use as part of regular expressions, including [:alnum:], [:digit:], [:lower:] and so on.
We can match a single whitespace character like this:

<?php echo ereg("Mitchell[[:space:]]Harper", "Mitchell Harper"); ?>

We can also tell the regular expression engine to accept either no whitespace or a single whitespace character by putting the ? character after the expression.

<?php echo ereg("Mitchell[[:space:]]?Harper", "MitchellHarper"); ?>
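The difference between the [[:space:]] class and a literal space shows up as soon as the input contains a tab. The following sketch assumes PHP 7 or later and uses preg_match, whose PCRE engine also accepts POSIX classes inside brackets; the helper name matches is hypothetical.

```php
<?php
// Hypothetical helper: true when the pattern is found in the subject
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/Mitchell[[:space:]]Harper/', 'Mitchell Harper'));   // plain space
var_dump(matches('/Mitchell[[:space:]]Harper/', "Mitchell\tHarper"));  // tab counts as whitespace
var_dump(matches('/Mitchell Harper/', "Mitchell\tHarper"));            // literal space rejects the tab
var_dump(matches('/Mitchell[[:space:]]?Harper/', 'MitchellHarper'));   // whitespace is optional
```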

Pattern grouping
Related characters can be grouped inside square brackets. With [a-z] and [A-Z] it is easy to require that part of a string consist only of lowercase letters or only of uppercase letters.

<?php
// Requires lowercase letters from first to last
echo ereg("^[a-z]+$", "johndoe"); // returns true
?>

or like

<?php
// Requires uppercase letters from first to last
echo ereg("^[A-Z]+$", "JOHNDOE"); // returns true
?>

We can also tell the regular expression engine that we accept either lowercase or uppercase letters, simply by combining the [a-z] and [A-Z] patterns.

<?php echo ereg("^[a-zA-Z]+$", "JohnDoe"); ?>

In the example above, it would be more meaningful if we could match "John Doe" rather than "JohnDoe". We can do that with the following regular expression:

^[a-zA-Z]+[[:space:]]{1}[a-zA-Z]+$

It is just as easy to search for a numeric string:

<?php echo ereg("^[0-9]+$", "12345"); ?>
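The bracketed ranges from this section can be exercised together. This sketch assumes PHP 7 or later and uses preg_match instead of ereg; the helper name matches is hypothetical.

```php
<?php
// Hypothetical helper: true when the pattern is found in the subject
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/^[a-z]+$/', 'johndoe'));     // all lowercase
var_dump(matches('/^[a-z]+$/', 'JohnDoe'));     // uppercase letters break the match
var_dump(matches('/^[A-Z]+$/', 'JOHNDOE'));     // all uppercase
var_dump(matches('/^[a-zA-Z]+$/', 'JohnDoe'));  // either case
var_dump(matches('/^[a-zA-Z]+[[:space:]]{1}[a-zA-Z]+$/', 'John Doe')); // two words, one space
var_dump(matches('/^[0-9]+$/', '12345'));       // digits only
```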

Word grouping
Not only can search patterns be grouped; we can also use parentheses to group related search words.

<?php echo ereg("^(John|Jane).+$", "John Doe"); ?>

In the example above, we have the start-of-string anchor, followed by "John" or "Jane", at least one other character, and then the end-of-string anchor. So...

<?php echo ereg("^(John|Jane).+$", "Jane Doe"); ?>

... will also match our search pattern.
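Parenthesised groups also capture what they matched, which makes it possible to see which alternative was taken. The sketch below assumes PHP 7 or later and uses preg_match with its third argument to collect the groups; the name "Jack Doe" is an invented non-matching input.

```php
<?php
$pattern = '/^(John|Jane).+$/';

foreach (['John Doe', 'Jane Doe', 'Jack Doe'] as $name) {
    // $groups[0] is the whole match, $groups[1] the first parenthesised group
    if (preg_match($pattern, $name, $groups) === 1) {
        echo $name, ' matched, first name: ', $groups[1], "\n";
    } else {
        echo $name, " did not match\n";
    }
}
```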
Escaping special characters
Because some characters have a special meaning in the syntax of a search pattern, such as the parentheses and | in (John|Jane), we need a way to tell the regular expression engine to treat these characters as part of the text being searched for rather than as part of the expression syntax. The method is called "character escaping", and it consists of putting a backslash in front of the special character. So, for example, if I want to include "|" in my search, I can do this:

<?php echo ereg("^[a-zA-Z]+\|[a-zA-Z]+$", "John|Jane"); ?>

Only a handful of characters need to be escaped: ^, $, (, ), ., [, |, *, ?, +, \ and {.
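Escaping by hand is easy to get wrong, so it can help to let the library do it. As a sketch, assuming PHP 7 or later and the PCRE functions rather than ereg, preg_quote adds the backslashes for a literal string; the variable names here are hypothetical.

```php
<?php
$literal = 'John|Jane';

// preg_quote() escapes regex metacharacters; the second argument names the delimiter in use
$escaped = preg_quote($literal, '/');
echo $escaped, "\n";

// Escaped: "|" is a literal character, so only the exact text matches
$exact = preg_match('/^' . $escaped . '$/', 'John|Jane');
$other = preg_match('/^' . $escaped . '$/', 'Johnny');

// Unescaped: "|" means "or", so "^John" alone is enough to match "Johnny"
$unescaped = preg_match('/^John|Jane$/', 'Johnny');

var_dump($exact, $other, $unescaped);
```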
I hope you now have a better sense of how powerful regular expressions really are. Now let's look at two examples that use regular expressions to validate string data.
Regular expression examples
Example one
Let's keep the first example simple and validate a standard URL. A standard URL (without a port number) has the following form:
[protocol]://[domain name]
Let's start by matching the protocol part of the URL, and accept only http or ftp. We can do this with the following regular expression:
^(http|ftp)
The ^ character anchors the match to the head of the string. By enclosing http and ftp in parentheses and separating them with the "or" symbol (|), we tell the regular expression engine that either http or ftp must appear at the beginning of the string.
A domain name usually looks like www.111cn.net, but we can make the www part optional. For this example, we accept only .com, .net, and .org domain names. The domain name portion of the regular expression looks like this:
(www\.)?.+\.(com|net|org)$
Putting everything together, our regular expression can be used to check a domain name, like this:

<?php
function isValidDomain($domainName)
{
    return ereg("^(http|ftp)://(www\.)?.+\.(com|net|org)$", $domainName);
}
// true
echo isValidDomain("http://www.111cn.net");
// true
echo isValidDomain("ftp://www.111cn.net");
// false
echo isValidDomain("ftp://www.hzhuti.fr");
// false
echo isValidDomain("www.111cn.net");
?>
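For a PHP 7 or later runtime, the same URL check might be written with preg_match. This is a sketch under that assumption rather than the article's own code; it uses `#` as the pattern delimiter so the slashes in `://` need no escaping.

```php
<?php
// Same expression as the article's URL check, wrapped in "#" delimiters for PCRE
function isValidDomain(string $domainName): bool
{
    return preg_match('#^(http|ftp)://(www\.)?.+\.(com|net|org)$#', $domainName) === 1;
}

var_dump(isValidDomain('http://www.111cn.net')); // accepted protocol and suffix
var_dump(isValidDomain('ftp://www.111cn.net'));  // ftp is allowed too
var_dump(isValidDomain('ftp://www.hzhuti.fr'));  // .fr is not in the suffix list
var_dump(isValidDomain('www.111cn.net'));        // protocol is missing
```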

Example two
Because I live in Sydney, Australia, let's check a typical Australian international phone number. An Australian international phone number is formatted as follows:
+61x xxxx-xxxx
The first x is the area code, and the remaining digits are the phone number. To check for a phone number that starts with "+61" immediately followed by an area code between 2 and 9, we use the following regular expression:
^\+61[2-9][[:space:]]
Note that the search pattern above escapes the "+" character so that it is included in the search literally rather than interpreted as regular expression syntax. [2-9] tells the regular expression engine that we require one digit between 2 and 9. The [[:space:]] class tells the engine to expect a whitespace character here.
Here is the search pattern for the rest of the phone number:
[0-9]{4}-[0-9]{4}$
There is nothing unusual here: we tell the regular expression engine that the rest of the phone number must be a group of 4 digits, followed by a hyphen, followed by another group of 4 digits, and then the end of the string.
Putting the complete regular expression together in a function, we can check some Australian international phone numbers in code:

<?php
function isValidPhone($phoneNum)
{
    return ereg("^\+61[2-9][[:space:]][0-9]{4}-[0-9]{4}$", $phoneNum);
}
// true
echo isValidPhone("+619 0000-0000");
// false
echo isValidPhone("+61 00000000");
// false
echo isValidPhone("+611 00000000");
?>
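The phone number check can be sketched the same way for PHP 7 or later, with preg_match standing in for ereg. The number "+612 1234-5678" is an invented sample added to the article's three test inputs.

```php
<?php
// "\+" is a literal plus sign; the rest mirrors the article's expression
function isValidPhone(string $phoneNum): bool
{
    return preg_match('/^\+61[2-9][[:space:]][0-9]{4}-[0-9]{4}$/', $phoneNum) === 1;
}

var_dump(isValidPhone('+619 0000-0000')); // area code 9, correct layout
var_dump(isValidPhone('+612 1234-5678')); // area code 2, correct layout
var_dump(isValidPhone('+61 00000000'));   // area code and hyphen are missing
var_dump(isValidPhone('+611 00000000'));  // area code 1 is outside 2-9
```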

Matching Chinese characters: [\u4e00-\u9fa5]
Commentary: Matching Chinese text is a real headache; this expression makes it easy
Matching double-byte characters (including Chinese characters): [^\x00-\xff]
Commentary: Can be used to compute the length of a string (count a double-byte character as 2 and an ASCII character as 1)
Matching blank lines: \n\s*\r
Commentary: Can be used to delete blank lines
Matching HTML tags: <(\S*?)[^>]*>.*?</\1>|<.*? />
Commentary: The versions circulating online are poor; this one matches only some cases and is still powerless against complex nested tags
Matching leading and trailing whitespace: ^\s*|\s*$
Commentary: A very useful expression that can be used to delete whitespace (including spaces, tabs, form feeds, and so on) at the beginning and end of a line
Matching an email address: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
Commentary: Useful for form validation
Matching a URL: [a-zA-Z]+://[^\s]*
Commentary: The versions circulating online are very limited; this one meets basic requirements
Checking that an account name is legal (starts with a letter, 5-16 bytes long, letters, digits and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$
Commentary: Useful for form validation
Matching a domestic phone number: \d{3}-\d{8}|\d{4}-\d{7}
Commentary: Matches forms such as 0511-4405222 or 021-87888822
Matching a Tencent QQ number: [1-9][0-9]{4,}
Commentary: Tencent QQ numbers start from 10000
Matching a China postal code: [1-9]\d{5}(?!\d)
Commentary: China postal codes have 6 digits
Matching an ID card number: \d{15}|\d{18}
Commentary: China's ID card numbers have 15 or 18 digits
Matching an IP address: \d+\.\d+\.\d+\.\d+
Commentary: Useful when extracting IP addresses
Matching specific numbers:
^[1-9]\d*$ // matches positive integers
^-[1-9]\d*$ // matches negative integers
^-?[1-9]\d*$ // matches integers
^[1-9]\d*|0$ // matches non-negative integers (positive integers and 0)
^-[1-9]\d*|0$ // matches non-positive integers (negative integers and 0)
^[1-9]\d*\.\d*|0\.\d*[1-9]\d*$ // matches positive floating-point numbers
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // matches negative floating-point numbers
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // matches floating-point numbers
^[1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0$ // matches non-negative floating-point numbers (positive floating-point numbers and 0)
^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*))|0?\.0+|0$ // matches non-positive floating-point numbers (negative floating-point numbers and 0)
Commentary: Useful when dealing with large amounts of data; adjust the expressions as needed when applying them
Matching specific strings:
^[A-Za-z]+$ // matches a string made up of the 26 English letters
^[A-Z]+$ // matches a string made up of the 26 uppercase English letters
^[a-z]+$ // matches a string made up of the 26 lowercase English letters
^[A-Za-z0-9]+$ // matches a string made up of digits and the 26 English letters
^\w+$ // matches a string made up of digits, the 26 English letters, or underscores
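Several of the expressions in this list use Perl-style shorthand such as \d, \s and \w, so they belong with the PCRE functions rather than ereg. The sketch below, assuming PHP 7 or later, tries four of them with preg_match, preg_match_all and preg_replace. The sample inputs are invented, the postal code pattern has ^ and $ added to test a whole string, and the Chinese-character range is rewritten in PCRE's \x{...} notation with the "u" flag, which is an adaptation rather than the listed form.

```php
<?php
// Chinese characters: PCRE writes code points as \x{...} and needs the "u" (UTF-8) flag
$hasChinese = preg_match('/[\x{4e00}-\x{9fa5}]/u', "regex \u{6b63}\u{5219}") === 1;

// Postal code: six digits, the first one non-zero
$isPostcode = preg_match('/^[1-9]\d{5}(?!\d)$/', '100000') === 1;

// Extract every dotted group of numbers from a line of text
preg_match_all('/\d+\.\d+\.\d+\.\d+/', 'from 10.0.0.1 to 192.168.1.20', $found);
$addresses = $found[0];

// Strip leading and trailing whitespace
$trimmed = preg_replace('/^\s*|\s*$/', '', "  padded \t");

var_dump($hasChinese, $isPostcode, $addresses, $trimmed);
```

The IP expression only checks the shape of the text; it does not verify that each number is at most 255.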
