PHP and Regular Expressions
A regular expression is a specially formatted pattern that can be used to find occurrences of one string within another. Several programming languages, including Visual Basic, Perl, JavaScript, and PHP, support regular expressions. I (Mitchell) hope that by the end of this tutorial you will be able to apply some basic regular expressions in your PHP programs. Regular expressions are one of those odd features that show up in a variety of programming languages, but because they seem like a difficult concept, many developers push them into a corner and forget they exist.
Let's take a look at what regular expressions are and why you should use them in PHP programs.
What is a regular expression?
What do you think separates a program like BBEdit or Notepad from a good old console-based text editor? Both support text input and let you save text to a file, but modern text editors also support other features, including search-and-replace tools, which make editing a text file quite easy.
Regular expressions are similar, but better. A regular expression can be thought of as an extremely advanced search-and-replace tool that spares us some pain: you no longer have to write custom data validation routines to check an email address or to confirm that a phone number is correctly formatted.
One of the most common tasks in any program is data validation. PHP bundles some text-checking functions that let us use regular expressions to match a string, for example to confirm that it contains a space, a question mark, and so on.
What you may not know is how simple regular expressions are to pick up. Once you have mastered a few of them (a regular expression tells the regular expression engine what we want to match in a string), you may ask yourself why you left regular expressions in the corner for so long. ^_^
PHP has two sets of functions for processing two types of regular expressions: Perl 5 compatible and POSIX standard compatible. In this article we will look at the ereg functions and use search expressions that follow the POSIX standard. Although they are not as powerful as the Perl 5 kind, they are a good way to learn regular expressions. If you are interested in the Perl-compatible regular expressions supported by PHP, go to the PHP.net website for details about the preg functions.
PHP has six functions for working with POSIX regular expressions. Each takes a regular expression as its first parameter. They are listed below:
ereg: the most commonly used regular expression function. ereg lets us search a string for a match to a regular expression.
ereg_replace: lets us search a string for matches to a regular expression and replace every match with a new string.
eregi: works almost exactly like ereg, but ignores case.
eregi_replace: has the same search-and-replace behavior as ereg_replace, but ignores case.
split: lets us split a string wherever a regular expression matches and returns the pieces as an array of strings.
spliti: the case-insensitive version of split.
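The ereg family was deprecated in PHP 5.3 and removed in PHP 7.0, so on a current PHP installation the six functions above have to be replaced by their preg counterparts. The following is only a rough sketch of that mapping; the sample string and variable names are made up for illustration.

```php
<?php
// Hypothetical sample data, for illustration only.
$text = "Hello World, hello PHP";

// ereg -> preg_match (returns 1 on a match, 0 otherwise)
$found = preg_match('/hello/', $text);

// eregi -> preg_match with the "i" (case-insensitive) modifier
$foundAnyCase = preg_match('/HELLO/i', $text);

// ereg_replace / eregi_replace -> preg_replace (add "i" to ignore case)
$replaced = preg_replace('/hello/i', 'Goodbye', $text);

// split / spliti -> preg_split
$parts = preg_split('/, /', $text);

var_dump($found, $foundAnyCase, $replaced, $parts);
```

Note that preg patterns need delimiters (here `/`), which the POSIX functions did not use.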
Why use a regular expression?
If you are constantly writing different functions to check or manipulate parts of a string, you may be able to discard all of them and replace them with regular expressions. If you answer "yes" to either of the following questions, you should consider using regular expressions:
Are you writing custom functions to check form data (for example, that an email address contains an @ and a dot)?
Are you writing custom functions that loop through every character in a string and replace the character if it matches some criterion (for example, it is uppercase, or it is a space)?
Besides being awkward ways to check and manipulate strings, both of the above will also slow your program down if the code is not written efficiently. Would you rather use the following code to check an email address:
<?php
function validateEmail($email)
{
    // strpos returns the position of the character, or false if it is absent
    $hasAtSymbol = strpos($email, "@");
    $hasDot = strpos($email, ".");
    if ($hasAtSymbol && $hasDot)
        return true;
    else
        return false;
}
echo validateEmail("mitchell@devarticles.com");
?>
...
Or use the following code:
<?php
function validateEmail($email)
{
    return ereg("^[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+$", $email);
}
echo validateEmail("mitchell@devarticles.com");
?>
Certainly, the first function looks fairly easy and well structured. But isn't the second version of the email check function easier to use?
The second function shown above uses only a regular expression and consists of a single call to the ereg function. ereg returns a true value or false to indicate whether its string argument matches the regular expression.
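For readers on a PHP version without ereg, the same check might be sketched with preg_match as below. The pattern is the article's own deliberately simple one (letters only), so it rejects many valid addresses; the second test address is invented for illustration.

```php
<?php
// Sketch: the article's validateEmail() rewritten with preg_match().
function validateEmail(string $email): bool
{
    // Letters, one "@", letters, a literal dot, then letters to the end.
    return preg_match('/^[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+$/', $email) === 1;
}

var_dump(validateEmail("mitchell@devarticles.com")); // bool(true)
var_dump(validateEmail("mitchell.devarticles.com")); // bool(false): no "@"
```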
Many programmers avoid regular expressions because they are (in some cases) slower than other text processing methods. Regular expressions can be slow because matching involves copying strings around in memory as each new part of the expression is matched against the string. However, in my experience, unless you run a complex regular expression over hundreds of lines of text, the performance cost is negligible, and that situation is rare when you use regular expressions to validate input data.
Regular expression syntax
Before you can match a string against a regular expression, you first have to create the regular expression. The syntax looks a little odd at first. Each element of the expression represents some type of search feature. Here are some of the most common elements, each with an example of how to use it:
Start of string
To match at the start of a string, use ^. For example:
<?php echo ereg("^hello", "hello world!"); ?>
returns true, whereas
<?php echo ereg("^hello", "I say hello world"); ?>
returns false, because hello is not at the start of the string "I say hello world".
End of string
To match at the end of a string, use $. For example:
<?php echo ereg("bye$", "goodbye"); ?>
returns true, whereas
<?php echo ereg("bye$", "goodbye my friend"); ?>
returns false, because bye is not at the end of the string "goodbye my friend".
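The two anchors behave the same way in the preg functions. A minimal sketch, assuming preg_match is used in place of ereg (the variable names are only illustrative):

```php
<?php
// ^ anchors the match to the start of the string.
$startsWithHello = preg_match('/^hello/', "hello world!");   // 1
$helloNotAtStart = preg_match('/^hello/', "I say hello world"); // 0

// $ anchors the match to the end of the string.
$endsWithBye = preg_match('/bye$/', "goodbye");              // 1
$byeNotAtEnd = preg_match('/bye$/', "goodbye my friend");    // 0

var_dump($startsWithHello, $helloNotAtStart, $endsWithBye, $byeNotAtEnd);
```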
Any single character
To match any single character, use a dot (.). For example:
<?php echo ereg(".", "cat"); ?>
returns true, whereas
<?php echo ereg(".", ""); ?>
returns false, because the string being searched contains no characters. You can use curly braces to tell the regular expression engine how many characters to match. If I want to match five characters, I can use ereg like this:
<?php echo ereg(".{5}$", "12345"); ?>
The code above tells the regular expression engine to return true if the string ends with five characters of any kind, which holds for any string that is at least five characters long. We can also put limits on the number of repeated characters:
<?php echo ereg("a{1,3}$", "aaa"); ?>
In the example above, we tell the regular expression engine that, in order to match, the search string must have between one and three "a" characters at the end.
<?php echo ereg("a{1,3}$", "aaab"); ?>
The example above does not return true. Although there are three "a" characters in the search string, they are not at the end of the string. If we removed the end-of-string anchor $ from the regular expression, the string would match.
We can also tell the regular expression engine to match at least a certain number of characters in a row, and more if they are present. We can do that like this:
<?php echo ereg("a{3,}$", "aaaa"); ?>
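The curly-brace counts above can be tried out with preg_match as well. This is just a sketch under the assumption that preg_match stands in for ereg; the extra test strings ("1234", "aa") are invented to show the failing cases.

```php
<?php
// Exactly five characters before the end of the string.
$fiveChars = preg_match('/.{5}$/', "12345");   // 1
$tooShort  = preg_match('/.{5}$/', "1234");    // 0: only four characters

// Between one and three "a" characters at the end.
$endsWithAs   = preg_match('/a{1,3}$/', "aaa");  // 1
$asNotAtEnd   = preg_match('/a{1,3}$/', "aaab"); // 0: the string ends in "b"
$withoutDollar = preg_match('/a{1,3}/', "aaab"); // 1: no end anchor

// At least three "a" characters at the end.
$threeOrMore = preg_match('/a{3,}$/', "aaaa");   // 1
$onlyTwo     = preg_match('/a{3,}$/', "aa");     // 0

var_dump($fiveChars, $tooShort, $endsWithAs, $asNotAtEnd, $withoutDollar, $threeOrMore, $onlyTwo);
```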
Zero or more repetitions
To tell the regular expression engine that a character may be absent or may be repeated any number of times, we use the * character. Both of these examples return true:
<?php echo ereg("t*", "tom"); ?>
<?php echo ereg("t*", "fom"); ?>
Even though the second example does not contain the character "t", it still returns true, because * means the character may appear but is not required. In fact, any ordinary string makes the ereg calls above return true, because the 't' character is optional.
One or more repetitions
To tell the regular expression engine that a character must appear at least once and may be repeated, we use the + character. For example:
<?php echo ereg("z+", "I like the zoo"); ?>
returns true. The following example also returns true:
<?php echo ereg("z+", "I like the zzzzzzzzoo!"); ?>
Zero or one repetition
We can also tell the regular expression engine that a character must appear either exactly once or not at all. We use the ? character to do this, like so:
<?php echo ereg("c?", "cats are fuzzy"); ?>
If we wanted to, we could remove the 'c' from the search string above entirely, and the expression would still return true. The '?' indicates that a 'c' may appear in the search string, but is not required.
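The difference between *, + and ? is easiest to see side by side. A small sketch using preg_match (assumed here as the replacement for ereg; the strings without the target letter are made up for contrast):

```php
<?php
// * : zero or more, so the pattern matches even when "t" is absent.
$starWithT    = preg_match('/t*/', "tom"); // 1
$starWithoutT = preg_match('/t*/', "fom"); // 1

// + : one or more, so at least one "z" is required.
$plusWithZ    = preg_match('/z+/', "I like the zoo");  // 1
$plusWithoutZ = preg_match('/z+/', "I like the park"); // 0

// ? : zero or one, so the "c" is optional.
$optWithC    = preg_match('/c?/', "cats are fuzzy"); // 1
$optWithoutC = preg_match('/c?/', "ats are fuzzy");  // 1

var_dump($starWithT, $starWithoutT, $plusWithZ, $plusWithoutZ, $optWithC, $optWithoutC);
```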
Regular expression syntax (continued)
Space Character
To match whitespace in a search string, we use the predefined POSIX class [[:space:]]. The square brackets indicate a set of related characters, and ":space:" is the class to match (in this case, any whitespace character). Whitespace includes tabs, newlines, and spaces. Alternatively, if the search string must contain exactly a space, rather than a tab or a newline, you can use a literal space character (" "). In most cases I prefer ":space:", because it makes my intent clear, whereas a single space character is easily overlooked.
There are several predefined POSIX classes that can be used in regular expressions, including [:alnum:], [:digit:], [:lower:], and so on.
We can match a single whitespace character like this:
<?php echo ereg("Mitchell[[:space:]]Harper", "Mitchell Harper"); ?>
We can also use ? after the class to tell the regular expression engine that the whitespace is optional:
<?php echo ereg("Mitchell[[:space:]]?Harper", "MitchellHarper"); ?>
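POSIX classes such as [[:space:]] are also accepted inside bracket expressions by the PCRE engine behind preg_match, so the examples above carry over almost unchanged. A hedged sketch (the tab-separated string and the [[:digit:]] check are additions for illustration):

```php
<?php
// [[:space:]] matches a space, a tab, or a newline.
$withSpace = preg_match('/Mitchell[[:space:]]Harper/', "Mitchell Harper");  // 1
$withTab   = preg_match('/Mitchell[[:space:]]Harper/', "Mitchell\tHarper"); // 1

// Adding ? makes the whitespace optional.
$noSpace = preg_match('/Mitchell[[:space:]]?Harper/', "MitchellHarper");    // 1

// Another predefined class: [[:digit:]] matches a single digit.
$allDigits = preg_match('/^[[:digit:]]+$/', "12345");                       // 1

var_dump($withSpace, $withTab, $noSpace, $allDigits);
```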
Pattern grouping
Related patterns can be grouped together inside square brackets. It is easy to use [a-z] and [A-Z] to require that part of the search string consist only of lowercase letters or only of uppercase letters.
<?php
// Lowercase letters are required from start to finish
echo ereg("^[a-z]+$", "johndoe"); // returns true
?>
Or like this:
<?php
// Uppercase letters are required from start to finish
echo ereg("^[A-Z]+$", "JOHNDOE"); // returns true
?>
We can also tell the regular expression engine that we will accept either lowercase or uppercase letters. We just need to combine the [a-z] and [A-Z] patterns:
<?php echo ereg("^[a-zA-Z]+$", "JohnDoe"); ?>
In the example above, it would make more sense if we could match "John Doe" rather than "JohnDoe". We can use the following regular expression to do that:
^[a-zA-Z]+[[:space:]]{1}[a-zA-Z]+$
It is just as easy to search for a string of digits:
<?php echo ereg("^[0-9]+$", "12345"); ?>
Word grouping
Not only can search patterns be grouped, but we can also use parentheses to group related search words.
<?php echo ereg("^(John|Jane).+$", "John Doe"); ?>
In the example above, we have a start-of-string anchor, followed by "John" or "Jane", at least one other character, and then an end-of-string anchor. So...
<?php echo ereg("^(John|Jane).+$", "Jane Doe"); ?>
...will also match our search pattern.
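With preg_match, the parenthesized group can also be read back through the optional third argument. The sketch below assumes preg_match replaces ereg; "Bob Doe" is an invented counterexample.

```php
<?php
// The group (John|Jane) accepts either word at the start of the string.
$johnMatches = preg_match('/^(John|Jane).+$/', "John Doe"); // 1
$bobMatches  = preg_match('/^(John|Jane).+$/', "Bob Doe");  // 0

// $groups[0] holds the whole match, $groups[1] the first parenthesized group.
preg_match('/^(John|Jane).+$/', "Jane Doe", $groups);
$firstName = $groups[1]; // "Jane"

var_dump($johnMatches, $bobMatches, $firstName);
```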
Special characters
Because some characters have a special grouping or syntactic meaning in a search pattern, such as the parentheses in (John|Jane), we need a way to tell the regular expression engine to ignore that meaning and treat them as part of the string being searched for rather than as part of the expression syntax. The method we use is called "character escaping", and it involves putting a backslash in front of the special character. So, for example, if I want to include '|' in my search, I can do this:
<?php echo ereg("^[a-zA-Z]+\|[a-zA-Z]+$", "John|Jane"); ?>
There are only a handful of characters you need to escape: ^, $, (, ), ., [, |, *, ?, +, \ and {.
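When the text to be matched literally comes from a variable, escaping by hand is error-prone. PHP's preg_quote function adds the backslashes for you; the sketch below is one way to use it, with the sample string taken from the example above and the variable names chosen for illustration.

```php
<?php
// Escaping by hand: \| matches a literal pipe character.
$manual = preg_match('/^[a-zA-Z]+\|[a-zA-Z]+$/', "John|Jane"); // 1

// Letting preg_quote() escape the special characters.
// The second argument names the delimiter so that it is escaped too.
$literal = "John|Jane";
$quoted  = preg_quote($literal, '/');   // John\|Jane
$pattern = '/^' . $quoted . '$/';

$exact   = preg_match($pattern, "John|Jane"); // 1: the literal text
$notAlt  = preg_match($pattern, "John");      // 0: "|" no longer means "or"

var_dump($manual, $quoted, $exact, $notAlt);
```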
I hope you are starting to get a feel for how powerful regular expressions are. Now let's look at two examples of validating data in a string with a regular expression.
Regular Expression example
Example 1
Let's keep the first example fairly simple: checking a standard URL. A standard URL (without a port number) consists of three parts:
[protocol]://[domain name]
Let's start by matching the protocol part of the URL and only allowing http or ftp. We can use the following regular expression to do this:
^(http|ftp)
The ^ character anchors the match to the start of the string. By enclosing http and ftp in parentheses and separating them with the "or" symbol (|), we tell the regular expression engine that the string must start with either http or ftp.
A domain name usually takes the form www.somesite.com, but the www part is optional. To keep things simple, we will only accept .com, .net, and .org domain names. We can represent the domain name part of the regular expression like this:
(www\.)?.+\.(com|net|org)$
Putting everything together, our regular expression can be used to check a domain name, like this:
<?php
function isValidDomain($domainName)
{
    return ereg("^(http|ftp)://(www\.)?.+\.(com|net|org)$", $domainName);
}
// true
echo isValidDomain("http://www.somesite.com");
// true
echo isValidDomain("ftp://somesite.com");
// false
echo isValidDomain("ftp://www.somesite.fr");
// false
echo isValidDomain("www.somesite.com");
?>
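A version of this check for a PHP installation without ereg might look like the sketch below. It keeps the article's pattern as is and assumes `#` as the pattern delimiter so that the slashes in `://` do not need escaping.

```php
<?php
// Sketch: isValidDomain() rewritten with preg_match().
function isValidDomain(string $domainName): bool
{
    // "#" is used as the delimiter because the pattern contains "/".
    return preg_match('#^(http|ftp)://(www\.)?.+\.(com|net|org)$#', $domainName) === 1;
}

var_dump(isValidDomain("http://www.somesite.com")); // bool(true)
var_dump(isValidDomain("ftp://somesite.com"));      // bool(true)
var_dump(isValidDomain("ftp://www.somesite.fr"));   // bool(false)
var_dump(isValidDomain("www.somesite.com"));        // bool(false)
```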
Example 2
Because I live in Sydney, Australia, let's check a typical Australian international phone number. The format of an Australian international phone number is as follows:
+61x xxxx-xxxx
The first x is the area code, and the rest is the phone number. To check that the phone number starts with '+61' followed by an area code between 2 and 9, we use the following regular expression:
^\+61[2-9][[:space:]]
Note that in the search pattern above, the '+' character is escaped with '\' so that it is included in the search rather than being interpreted as regular expression syntax. [2-9] tells the regular expression engine that we need one digit between 2 and 9. [[:space:]] tells it to expect a whitespace character.
Here is the rest of the search pattern for the phone number:
[0-9]{4}-[0-9]{4}$
There is nothing unusual here. We simply tell the regular expression engine what the phone number itself must look like: a group of four digits, followed by a hyphen, followed by another group of four digits, and then the end of the string.
Putting the complete regular expression together and wrapping it in a function, we can use the following code to check some Australian international phone numbers:
<?php
function isValidPhone($phoneNum)
{
    // Return the result so that the caller can echo it
    return ereg("^\+61[2-9][[:space:]][0-9]{4}-[0-9]{4}$", $phoneNum);
}
// true
echo isValidPhone("+619 0000-0000");
// false
echo isValidPhone("+60 00000000");
// false
echo isValidPhone("+611 00000000");
?>
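The same phone number check can be sketched with preg_match, again assuming a PHP version where ereg is no longer available. The pattern is unchanged apart from the added delimiters.

```php
<?php
// Sketch: isValidPhone() rewritten with preg_match().
function isValidPhone(string $phoneNum): bool
{
    // \+ is a literal plus sign; [2-9] is the one-digit area code.
    return preg_match('/^\+61[2-9][[:space:]][0-9]{4}-[0-9]{4}$/', $phoneNum) === 1;
}

var_dump(isValidPhone("+619 0000-0000")); // bool(true)
var_dump(isValidPhone("+60 00000000"));   // bool(false)
var_dump(isValidPhone("+611 00000000"));  // bool(false)
```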
Summary
Regular expressions take the place of string-checking code that is tedious to write and repeat. Over the last few pages we have covered the basics of POSIX standard regular expressions, including characters, grouping, and the PHP ereg functions. We have also seen how to use regular expressions to check some simple strings in PHP.
Translator's note: my English is not very good, so the translation may be off in places. The "character classes" in this article are what we usually call character clusters.
Classic Regular Expression
Regular expressions are used for string processing, form validation, and similar tasks. They are practical and efficient, but I am never quite sure of the syntax when I need them, so I often end up looking things up online. I am collecting some frequently used expressions here for reference. This post is updated from time to time.
Regular expression matching Chinese characters: [\u4e00-\u9fa5]
Regular expression matching double-byte characters (including Chinese characters): [^\x00-\xff]
Application: compute the length of a string (a double-byte character counts as 2, an ASCII character counts as 1)
String.prototype.len = function () { return this.replace(/[^\x00-\xff]/g, "aa").length; }
Regular expression matching empty lines: \n[\s| ]*\r
Regular expression matching HTML tags: /<(.*)>.*<\/\1>|<(.*) \/>/
Regular expression matching leading and trailing whitespace: (^\s*)|(\s*$)
Application: JavaScript has no trim function like the one in VBScript, so we can use this expression to implement one, as shown below:
String.prototype.trim = function ()
{
    return this.replace(/(^\s*)|(\s*$)/g, "");
}
Use regular expressions to break down and convert IP addresses:
The following JavaScript function uses a regular expression to match an IP address and convert it to its numeric value:
function IP2V(ip)
{
    var re = /(\d+)\.(\d+)\.(\d+)\.(\d+)/g // Regular expression matching IP addresses
    if (re.test(ip))
    {
        // Each part of the address is one byte, so the base is 256
        return RegExp.$1 * Math.pow(256, 3) + RegExp.$2 * Math.pow(256, 2) + RegExp.$3 * 256 + RegExp.$4 * 1
    }
    else
    {
        throw new Error("Not a valid IP address!")
    }
}
However, if you do not need a regular expression, it may be simpler to separate the parts directly with the split function. The program is as follows:
var ip = "10.100.20.168"
ip = ip.split(".")
alert("The IP value is: " + (ip[0] * 256 * 256 * 256 + ip[1] * 256 * 256 + ip[2] * 256 + ip[3] * 1))
Regular expression matching an email address: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
Regular expression matching a URL: http://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?
A program that uses regular expressions to remove repeated characters from a string:
var s = "abacabefgeeii"
var s1 = s.replace(/(.).*\1/g, "$1")
var re = new RegExp("[" + s1 + "]", "g")
var s2 = s.replace(re, "")
alert(s1 + s2) // The result is "abeicfg": each character appears once, but not in the original order
I once posted on CSDN asking for an expression to remove repeated characters, but never found one. This is the simplest implementation I can think of. The idea is to use a backreference to pick out the repeated characters, then build a second expression from those repeated characters to get the non-repeated ones, and finally concatenate the two. This method may not be suitable for strings where the order of the characters matters.
A JavaScript program that uses a regular expression to extract the file name from a URL. The result below is page1.
s = "http://www.9499.net/page1.htm"
s = s.replace(/(.*\/){0,}([^.]+).*/ig, "$2")
alert(s)
Use regular expressions to restrict text box input in a web form:
Allow only Chinese characters: onkeyup="value=value.replace(/[^\u4E00-\u9FA5]/g,'')" onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[^\u4E00-\u9FA5]/g,''))"
Allow only full-width characters: onkeyup="value=value.replace(/[^\uFF00-\uFFFF]/g,'')" onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[^\uFF00-\uFFFF]/g,''))"
Allow only digits: onkeyup="value=value.replace(/[^\d]/g,'')" onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[^\d]/g,''))"
Allow only digits and English letters: onkeyup="value=value.replace(/[\W]/g,'')" onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[^\d]/g,''))"
How to use regular expressions to represent Chinese Characters
Because the byte codes of Chinese characters fall within a certain range, you can use the following regular expression to represent Chinese characters:
/^[chr(0xa1)-chr(0xff)]+$/
The following is an example:
$str = "Beyond PHP";
if (preg_match("/^[" . chr(0xa1) . "-" . chr(0xff) . "]+$/", $str)) {
    echo "This is a pure Chinese string";
} else {
    echo "This is not a pure Chinese string";
}
Regular Expression
If you have never used regular expressions, you may not be familiar with the term or the concept. However, they are not as novel as you might think.
Recall how you find files on your hard disk. You almost certainly use the ? and * characters to help find the files you are looking for. The ? character matches a single character in a file name, while * matches zero or more characters. A pattern such as 'data?.dat' finds the following files:
data1.dat
data2.dat
datax.dat
dataN.dat
Using the * character instead of the ? character expands the number of files found. 'data*.dat' matches all of the following file names:
data.dat
data1.dat
data2.dat
data12.dat
datax.dat
dataXYZ.dat
Although this way of searching for files is certainly useful, it is also very limited. The limited power of the ? and * wildcards gives you an idea of what regular expressions can do, but regular expressions are far more powerful and flexible.
--------------------------------------------------------------------------------
2.
Early Origins
The "ancestors" of regular expressions can be traced back to early research on how the human nervous system works. Warren McCulloch and Walter Pitts, two neurophysiologists, developed a mathematical way of describing these neural networks.
In 1956, an American mathematician named Stephen Kleene, building on the earlier work of McCulloch and Pitts, published a paper titled "Representation of Events in Nerve Nets" that introduced the concept of regular expressions. Regular expressions were expressions used to describe what he called "the algebra of regular sets", hence the term "regular expression".
Subsequently, this work was found to apply to some early research on computational search algorithms by Ken Thompson, the principal inventor of Unix. The first practical application of regular expressions was in the Unix editor qed.
And the rest, as they say, is history. Regular expressions have been an important part of text-based editors and search tools ever since.
--------------------------------------------------------------------------------
3.
Use Regular Expressions
In a typical search and replace operation, you must provide the exact text you are looking for. That technique may be adequate for simple search and replace tasks in static text, but it lacks flexibility and makes searching dynamic text difficult, if not impossible.
With regular expressions, you can:
Test for a pattern within a string. For example, you can test an input string to see whether a telephone number pattern or a credit card number pattern occurs in it. This is called data validation.
Replace text. You can use a regular expression to identify specific text in a document and either remove it completely or replace it with other text.
Extract a substring from a string based on a pattern match. You can find specific text within a document or input field.
For example, if you need to search an entire web site to remove some outdated material and replace some HTML formatting tags, you can use a regular expression to test each file and see whether the material or the HTML formatting tags exist in that file. That way, you can narrow down the affected files to those that contain the material to be removed or changed. You can then use a regular expression to remove the outdated material, and finally, use regular expressions to search for and replace the tags that need replacing.
Another example of where regular expressions are useful is in a language that is not known for its string-handling ability. VBScript, a subset of Visual Basic, has a rich set of string-handling functions. JScript, like C, does not. Regular expressions provide a significant improvement in string handling for JScript. However, regular expressions may also be more efficient to use in VBScript, because they let you perform multiple string manipulations in a single expression.
--------------------------------------------------------------------------------
Regular expression syntax
A regular expression is a pattern of text that consists of ordinary characters (for example, the letters a through z) and special characters, known as metacharacters. The pattern describes one or more strings to match when searching a body of text. The regular expression serves as a template for matching a character pattern against the string being searched.
Here are some examples of regular expressions you might encounter:
JScript    VBScript    Matches
/^[ \t]*$/    "^[ \t]*$"    Matches a blank line.
/\d{2}-\d{5}/    "\d{2}-\d{5}"    Validates an ID number consisting of two digits, a hyphen, and another five digits.
/<(.*)>.*<\/\1>/    "<(.*)>.*<\/\1>"    Matches an HTML tag.
The following table contains the complete list of metacharacters and their behavior in the context of regular expressions:
Character    Description
\    Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character "n". '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".
^    Matches the position at the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position following '\n' or '\r'.
$    Matches the position at the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position preceding '\n' or '\r'.
*    Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+    Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?    Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches the "do" in "do" or "does". ? is equivalent to {0,1}.
{n}    n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,}    n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}    m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that you cannot put a space between the comma and the numbers.
?    When this character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
.    Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
(pattern)    Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match parentheses characters, use '\(' or '\)'.
(?:pattern)    Matches pattern but does not capture the match, that is, it is a non-capturing match that is not stored for possible later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
(?=pattern)    Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not captured for possible later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1". Lookaheads do not consume characters; that is, after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)    Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not captured for possible later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but not "Windows" in "Windows 2000". Lookaheads do not consume characters; that is, after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y    Matches either x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]    A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]    A negative character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z]    A range of characters. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter in the range 'a' through 'z'.
[^a-z]    A negative range of characters. Matches any character not in the specified range. For example, '[^a-z]' matches any character not in the range 'a' through 'z'.
\b    Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never" but not the 'er' in "verb".
\B    Matches a non-word boundary. 'er\B' matches the 'er' in "verb" but not the 'er' in "never".
\cx    Matches the control character indicated by x. For example, \cM matches a Control-M or carriage return character. The value of x must be in the range A-Z or a-z. If not, c is treated as a literal 'c' character.
\d    Matches a digit character. Equivalent to [0-9].
\D    Matches a non-digit character. Equivalent to [^0-9].
\f    Matches a form feed character. Equivalent to \x0c and \cL.
\n    Matches a newline character. Equivalent to \x0a and \cJ.
\r    Matches a carriage return character. Equivalent to \x0d and \cM.
\s    Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
\S    Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t    Matches a tab character. Equivalent to \x09 and \cI.
\v    Matches a vertical tab character. Equivalent to \x0b and \cK.
\w    Matches any word character, including underscore. Equivalent to '[A-Za-z0-9_]'.
\W    Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn    Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num    Matches num, where num is a positive integer. A reference back to captured matches. For example, '(.)\1' matches two consecutive identical characters.
\n    Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, n is an octal escape value if n is an octal digit (0-7).
\nm    Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither of the preceding conditions holds, \nm matches the octal escape value nm when n and m are octal digits (0-7).
\nml    Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
\un    Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
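A few of the rows in the table are easier to grasp with a concrete run. The sketch below uses PHP's preg_match, whose PCRE syntax covers the greedy and non-greedy quantifiers, lookahead, and backreferences described above; the table itself documents the JScript and VBScript engine, so treat this as an illustration rather than an exact equivalent.

```php
<?php
// Greedy vs. non-greedy: 'o+' takes all the o's, 'o+?' takes only one.
preg_match('/o+/', "oooo", $greedy);
preg_match('/o+?/', "oooo", $lazy);

// Positive lookahead: the version number must follow but is not consumed.
$win2000 = preg_match('/Windows (?=95|98|NT|2000)/', "Windows 2000", $ahead);
$win31   = preg_match('/Windows (?=95|98|NT|2000)/', "Windows 3.1");

// Backreference: '(.)\1' finds two consecutive identical characters.
preg_match('/(.)\1/', "food", $doubled);

var_dump($greedy[0]);  // string(4) "oooo"
var_dump($lazy[0]);    // string(1) "o"
var_dump($ahead[0]);   // string(8) "Windows " (the match stops before "2000")
var_dump($win31);      // int(0)
var_dump($doubled[0]); // string(2) "oo"
```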
4.
Create a regular expression
A regular expression is constructed in the same way as a mathematical expression: a larger expression is built by combining smaller expressions with a variety of metacharacters and operators.
You construct a regular expression by placing the components of the expression pattern between a pair of delimiters. For JScript, the delimiters are a pair of forward slash (/) characters. For example:
/Expression/
For VBScript, a pair of quotation marks ("") are used to determine the boundary of the regular expression. For example:
"Expression"
In both examples above, the regular expression pattern (expression) is stored in the Pattern property of the RegExp object.
A component of a regular expression can be a single character, a set of characters, a range of characters, a choice between characters, or any combination of these.
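The delimiter idea carries over to PHP, with one difference worth showing: in PHP's preg_* functions the delimiters are part of the pattern string itself. The sketch below assumes the standard preg_match function; the subject strings are invented for illustration.

```php
<?php
// Illustrative only: subject strings are assumptions.

// The pattern is an ordinary PHP string that contains its own delimiters.
$pattern = '/Expression/';
$found = preg_match($pattern, 'An Expression in a sentence');

// A delimiter other than the slash can be used; '#' avoids having to
// escape the slashes in a path.
$isPath = preg_match('#^/usr/local/#', '/usr/local/bin/php');

// Modifiers follow the closing delimiter; 'i' makes the match case-insensitive.
$caseInsensitive = preg_match('/expression/i', 'EXPRESSION');
```

preg_match returns 1 when the pattern matches and 0 when it does not, which is why the results can be used directly in an if statement.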
--------------------------------------------------------------------------------
5.
Priority Order
Once constructed, a regular expression is evaluated much like a mathematical expression: from left to right, following an order of priority.
The following table lists the regular expression operators from the highest priority to the lowest:
Operator                        Description
\                               Escape character
(), (?:), (?=), []              Parentheses and square brackets
*, +, ?, {n}, {n,}, {n,m}       Qualifiers
^, $, \anymetacharacter         Positions and sequences
|                               "Or" operation
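The priority order in the table above can be observed directly. The following is a small sketch in PHP, assuming preg_match; the test strings are made up. It shows that a qualifier binds more tightly than the characters placed side by side, and that the "or" operation binds most loosely of all.

```php
<?php
// Illustrative only: test strings are assumptions.

// In /ab+/ the + applies to "b" alone, not to "ab".
$r1 = preg_match('/^ab+$/', 'abbb');    // matches: a followed by several b
$r2 = preg_match('/^ab+$/', 'abab');    // does not match
$r3 = preg_match('/^(ab)+$/', 'abab');  // parentheses make + apply to "ab"

// | has the lowest priority, so /^cat|dog$/ means (^cat) or (dog$).
$r4 = preg_match('/^cat|dog$/', 'catalog');    // matches: the string starts with "cat"
$r5 = preg_match('/^(cat|dog)$/', 'catalog');  // does not match: the whole string must be cat or dog
```

Adding parentheses is the usual way to override the default order, just as in arithmetic.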
--------------------------------------------------------------------------------
6.
Common characters
Common characters are all the printable and non-printable characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some symbols.
The simplest regular expression is a single common character that matches itself in the searched string. For example, the single-character pattern 'A' matches the letter 'A' wherever it appears in the searched string. Here are some examples of single-character regular expression patterns:
/A/
/7/
/M/
The equivalent single-character regular expressions in VBScript are:
"A"
"7"
"M"
You can combine several single characters to form a larger expression. For example, the following JScript regular expression is created by combining the single-character expressions 'A', '7', and 'M'.
/A7M/
The equivalent VBScript expression is:
"A7M"
Note that there is no join operator here. All you need to do is place one character after the other.
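The same pattern can be tried in PHP. This is a minimal sketch assuming preg_match and its PREG_OFFSET_CAPTURE flag; the subject strings are invented for illustration. It shows that the three characters must appear together, in order, and with the same letter case.

```php
<?php
// Illustrative only: subject strings are assumptions.
$pattern = '/A7M/';

// The characters match only as a consecutive run, anywhere in the string.
$hit = preg_match($pattern, 'serial: XA7M-001', $m, PREG_OFFSET_CAPTURE);
$matchedText = $m[0][0];   // the text that matched
$matchedAt   = $m[0][1];   // byte offset of the match in the subject

// Common characters are case-sensitive and order-sensitive.
$wrongCase  = preg_match($pattern, 'a7m');
$wrongOrder = preg_match($pattern, 'A M7');
```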
--------------------------------------------------------------------------------
Special characters
Many metacharacters need special handling when you want to match the characters themselves. To match one of these special characters literally, you must first escape it, that is, precede it with a backslash (\). The following table lists the special characters and their meanings:
Special character    Description
$ Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$.
() Mark the start and end positions of a subexpression. Subexpressions can be captured for later use. To match these characters, use \( and \).
* Matches the preceding subexpression zero or more times. To match the * character, use \*.
+ Matches the preceding subexpression one or more times. To match the + character, use \+.
. Matches any single character except the linefeed \n. To match ., use \.
[ Marks the start of a bracket expression. To match [, use \[.
? Matches the preceding subexpression zero or one time, or marks a qualifier as non-greedy. To match the ? character, use \?.
\ Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character 'n', while '\n' matches a linefeed. The sequence '\\' matches "\", and '\(' matches "(".
^ Matches the start position of the input string. When used inside a bracket expression, it negates the character set. To match the ^ character itself, use \^.
{ Marks the start of a qualifier expression. To match {, use \{.
| Specifies a choice between two items. To match |, use \|.
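The escaping rules above can be sketched in PHP as follows. The example assumes preg_match and PHP's built-in preg_quote function, which adds the backslashes for you; the sample strings are made up for illustration.

```php
<?php
// Illustrative only: sample strings are assumptions.

// Escaped, $ and . match themselves.
$price = preg_match('/\$5\.00/', 'Total: $5.00');

// Unescaped, $ means "end of string", so nothing can follow it and the match fails.
$loose = preg_match('/$5.00/', 'Total: $5.00');

// preg_quote escapes every special character in a literal string.
// The second argument names the delimiter so that it is escaped as well.
$literal = 'a+b(c)?';
$quoted  = preg_quote($literal, '/');
$exact   = preg_match('/' . $quoted . '/', 'x a+b(c)? y');
```

Using preg_quote is generally safer than escaping by hand when the text to match comes from user input, since no special character is overlooked.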
--------------------------------------------------------------------------------

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
