Regular Expressions in Detail
Brief introduction
Simply put, a regular expression is a powerful tool for pattern matching and substitution. It can be used to:
Test a string for a pattern. For example, you can test an input string to see whether it contains a phone number pattern or a credit card number pattern. This is called data validation.
Replace text. You can use a regular expression to identify specific text in a document, and then either delete it entirely or replace it with other text.
Extract a substring from a string based on a pattern match. This can be used to find specific text in a document or an input field.
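The three uses listed above can be sketched in a few lines of JavaScript. The sample sentence and the NNN-NNNN phone shape are assumptions made only for illustration; they are not a real validation rule.

```javascript
// Hypothetical input; the NNN-NNNN shape is an assumed phone format.
const input = "Call 555-1234 or 555-9876 today";

// 1. Test: does the string contain something shaped like NNN-NNNN?
const hasPhone = /\d{3}-\d{4}/.test(input);

// 2. Replace: mask every phone-like substring (the g flag replaces all of them).
const masked = input.replace(/\d{3}-\d{4}/g, "XXX-XXXX");

// 3. Extract: pull out the first phone-like substring, or null if there is none.
const found = input.match(/\d{3}-\d{4}/);
const firstPhone = found ? found[0] : null;

console.log(hasPhone, masked, firstPhone);
```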
Basic syntax
Now that we know what regular expressions are for, let's look at their syntax.
A regular expression generally takes the following form:
/love/
The part between the "/" delimiters is the pattern to be matched in the target object. The user only has to place the pattern to search for between the "/" delimiters. To let users define patterns more flexibly, regular expressions provide special "metacharacters". A metacharacter is a special character in a regular expression that specifies how its leading character (the character immediately before the metacharacter) may appear in the target object.
The most commonly used metacharacters are "+", "*", and "?".
The "+" metacharacter specifies that its leading character must appear one or more times in a row in the target object.
The "*" metacharacter specifies that its leading character must appear zero or more times in a row in the target object.
The "?" metacharacter specifies that its leading character must appear zero or one time in the target object.
Let's look at these metacharacters in use.
/fo+/
Because this regular expression contains the "+" metacharacter, it matches strings in the target object such as "fool", "fo", or "football", in which the letter f is followed by one or more letters o.
/eg*/
Because this regular expression contains the "*" metacharacter, it matches strings in the target object such as "easy", "ego", or "egg", in which the letter e is followed by zero or more letters g.
/wil?/
Because this regular expression contains the "?" metacharacter, it matches strings in the target object such as "win" or "wilson", in which the letter i is followed by zero or one letter l.
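The three quantifiers can be checked directly with the test method. This is a minimal sketch; the sample words come from the text above (plus "fun" as an assumed counterexample), and the variable names are only illustrative.

```javascript
// Quantifier patterns from the examples above.
const onePlus = /fo+/;    // "f" followed by one or more "o"
const zeroPlus = /eg*/;   // "e" followed by zero or more "g"
const optional = /wil?/;  // "wi" followed by zero or one "l"

const results = {
  fool: onePlus.test("fool"),       // "f" followed by "oo"
  fun: onePlus.test("fun"),         // no "o" directly after "f"
  easy: zeroPlus.test("easy"),      // zero "g" is allowed
  egg: zeroPlus.test("egg"),        // "e" followed by "gg"
  win: optional.test("win"),        // the "l" is optional
  wilson: optional.test("wilson"),  // "wi" followed by one "l"
};

console.log(results);
```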
Sometimes you don't know how many characters to match. To accommodate this uncertainty, regular expressions support qualifiers (quantifiers), which specify how many times a given component of a regular expression must occur for the match to succeed.
{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m} m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
With these qualifiers, users can specify exactly how many times a pattern may occur in the matching object. For example:
/jim{2,6}/
This regular expression specifies that the character m may appear 2 to 6 times in a row, so it matches strings such as "jimmy" or "jimmmmmy".
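The counted qualifiers can be tried out as follows. This is a sketch only: firstMatch is a hypothetical helper introduced here, and the test strings are built with repeat so that the number of o's is unambiguous.

```javascript
// Hypothetical helper: return the first substring matched by re in s, or null.
function firstMatch(re, s) {
  const m = s.match(re);
  return m ? m[0] : null;
}

const fiveOs = "f" + "o".repeat(5) + "d"; // "f", five o's, "d"
const sixOs = "f" + "o".repeat(6) + "d";  // "f", six o's, "d"

const exact = firstMatch(/o{2}/, "food");     // exactly two o's
const tooFew = firstMatch(/o{2}/, "Bob");     // only one "o", so no match
const atLeast = firstMatch(/o{2,}/, fiveOs);  // greedy: takes all five o's
const bounded = firstMatch(/o{1,3}/, sixOs);  // stops after three o's
const jimmy = /jim{2,6}/.test("jimmy");       // two m's satisfy {2,6}
const jim = /jim{2,6}/.test("jim");           // one m is too few

console.log(exact, tooFew, atLeast, bounded, jimmy, jim);
```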
Now let's look at several other important metacharacters.
\s: matches a single whitespace character, including tab and line break;
\S: matches any single character that is not whitespace;
\d: matches a digit from 0 to 9;
\w: matches a letter, a digit, or the underscore character;
\W: matches any character that \w does not match;
. : matches any character except a line break.
(Note: \s and \S, like \w and \W, are inverses of each other.)
The following examples show how to use these metacharacters in regular expressions.
/\s+/
This regular expression matches one or more whitespace characters in the target object.
/\d000/
Given a complex financial statement, this regular expression makes it easy to find all the amounts in the thousands.
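As a rough illustration of these character classes, the sketch below runs them against a small made-up statement line. The text and variable names are assumptions for the example only.

```javascript
// Made-up sample text containing a tab, spaces, and a line break.
const text = "Total:\t12000 dollars\nTax: 3000";

const whitespaceRuns = text.match(/\s+/g); // each run of whitespace
const nonSpaceRuns = text.match(/\S+/g);   // each run of non-whitespace
const thousands = text.match(/\d000/g);    // a digit followed by "000"
const words = text.match(/\w+/g);          // runs of letters, digits, "_"
const stripped = "a_b-c!".replace(/\W/g, ""); // drop every non-word character
const dotOnLetter = /a.c/.test("abc");     // "." matches "b"
const dotOnNewline = /a.c/.test("a\nc");   // "." does not match a line break

console.log(whitespaceRuns, nonSpaceRuns, thousands, words, stripped);
```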
Besides the metacharacters described above, regular expressions have another kind of special character: locators (anchors). A locator specifies where the matching pattern must appear in the target object. The most commonly used locators are "^", "$", "\b", and "\B".
The "^" locator specifies that the matching pattern must appear at the beginning of the target string.
The "$" locator specifies that the matching pattern must appear at the end of the target string.
The "\b" locator specifies that the matching pattern must lie on a word boundary, that is, at the beginning or end of a word in the target string.
The "\B" locator specifies that the matching pattern must not lie on a word boundary, that is, the position must be inside a word rather than at its beginning or end.
"^" and "$", like "\b" and "\B", can each be regarded as a pair of complementary locators. For example:
/^hell/
Because this regular expression contains the "^" locator, it matches strings in the target object that begin with "hell", such as "hello" or "hellhound".
/ar$/
Because this regular expression contains the "$" locator, it matches strings in the target object that end with "ar", such as "car", "bar", or "ar".
/\bbom/
Because this regular expression begins with the "\b" locator, it matches words in the target object that begin with "bom", such as "bomb" or "bom".
/man\b/
Because this regular expression ends with the "\b" locator, it matches words in the target object that end with "man", such as "human", "woman", or "man".
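The locators can be compared side by side with test. This is a small sketch; the counterexamples ("shell", "abomb", "manual") are assumptions added to show where each locator rejects a string.

```javascript
const startsWithHell = /^hell/.test("hello");   // "hell" is at the start
const notAtStart = /^hell/.test("shell");       // "hell" is not at the start
const endsWithAr = /ar$/.test("car");           // "ar" is at the end
const notAtEnd = /ar$/.test("cart");            // "ar" is followed by "t"
const wordStart = /\bbom/.test("a bomb");       // "bom" starts a word
const notWordStart = /\bbom/.test("abomb");     // "bom" is inside a word
const wordEnd = /man\b/.test("woman");          // "man" ends a word
const insideWord = /man\B/.test("manual");      // "man" is followed by a word character
const notInsideWord = /man\B/.test("woman");    // "man" is followed by the end of the string

console.log(startsWithHell, notAtStart, endsWithAr, notAtEnd);
console.log(wordStart, notWordStart, wordEnd, insideWord, notInsideWord);
```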
To make patterns easier to write, regular expressions let the user specify a range of characters instead of listing specific characters. For example:
/[A-Z]/
This regular expression matches any uppercase letter in the range A to Z.
/[a-z]/
This regular expression matches any lowercase letter in the range a to z.
/[0-9]/
This regular expression matches any digit in the range 0 to 9.
/([a-z][A-Z][0-9])+/
This regular expression matches any string made up of one or more repetitions of a lowercase letter, an uppercase letter, and a digit, such as "aB0".
Note that "()" can be used in a regular expression to group parts of a pattern together. The content enclosed in "()" must appear together in the target object. Therefore, the regular expression above does not match a string such as "abc", because the last character in "abc" is a letter rather than a digit.
To express an "or" operation similar to the one in programming logic, choosing one of several alternative patterns to match, use the pipe symbol "|". For example:
/to|too|2/
This regular expression matches "to", "too", or "2" in the target object.
Another commonly used operator is the negation "[^]". Unlike the locator "^" described earlier, the negation "[^]" specifies that the characters listed inside the brackets must not appear at that position in the target object. For example:
/[^A-C]/
This regular expression matches any character in the target object other than A, B, and C. In general, "^" is treated as the negation operator when it appears as the first character inside "[]", and as a locator when it appears outside "[]".
Finally, when a pattern needs to contain a metacharacter as a literal character, escape it with "\". For example:
/th\*/
This regular expression matches "th*" in the target object, not "the" and so on.
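Grouping, alternation, negation, and escaping can be sketched together as follows. The anchors added to the grouped pattern and the sample strings are assumptions for the example. One detail worth noting: JavaScript tries alternatives from left to right, so /to|too|2/ applied to "too" stops at "to".

```javascript
// Grouping: the whole group (lowercase, uppercase, digit) repeats as a unit.
const grouped = /^([a-z][A-Z][0-9])+$/;
const oneGroup = grouped.test("aB0");
const twoGroups = grouped.test("aB0cD1");
const noDigit = grouped.test("abc");

// Alternation: alternatives are tried left to right.
const firstAlternative = "too".match(/to|too|2/)[0];
const allAlternatives = "to 2 too".match(/too|to|2/g);

// Negated character class: anything except A, B, or C.
const notAtoC = "ABCDE".match(/[^A-C]/g);

// Escaping: \* is a literal asterisk, while a bare * is a qualifier.
const literalStar = /th\*/.test("th*");
const literalStarOnThe = /th\*/.test("the");
const qualifierStarOnThe = /th*/.test("the");

console.log(firstAlternative, allAlternatives, notAtoC);
```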
Once constructed, a regular expression is evaluated much like an arithmetic expression: from left to right, following an order of precedence. From highest to lowest, the precedence is:
1. \ : escape character
2. (), (?:), (?=), [] : parentheses and square brackets
3. *, +, ?, {n}, {n,}, {n,m} : qualifiers
4. ^, $, \anymetacharacter : positions and sequences
5. | : "or" operation
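The effect of this precedence shows up when parentheses are added or removed. The sketch below uses made-up strings: a qualifier binds more tightly than a sequence of characters, and "|" binds more loosely than the locators.

```javascript
// A qualifier applies only to the item immediately before it...
const plusOnB = "abab".match(/ab+/)[0];       // "+" applies to "b" alone
// ...unless parentheses group a longer sequence.
const plusOnGroup = "abab".match(/(ab)+/)[0]; // "+" applies to "ab"

// "|" has the lowest precedence, so /^cat|dog$/ means (^cat) or (dog$).
const looseAlternation = /^cat|dog$/.test("catalog");
// Parentheses make the locators apply to both alternatives.
const strictAlternation = /^(cat|dog)$/.test("catalog");
const strictOnDog = /^(cat|dog)$/.test("dog");

console.log(plusOnB, plusOnGroup, looseAlternation, strictAlternation, strictOnDog);
```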
Examples
JavaScript 1.2 provides a powerful RegExp object for performing matching operations with regular expressions. Its test() method checks whether the target object contains the matching pattern and returns true or false accordingly.
The following JavaScript verifies the validity of an e-mail address entered by the user.
--------------------------------------------------------
<script language="JavaScript1.2">
<!-- start hiding
function verifyAddress(obj)
{
  // Read the value of the form field named "email".
  var email = obj.email.value;
  // One or more name characters, "@", a domain label, then one or more ".x" parts.
  var pattern = /^([a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+/;
  var flag = pattern.test(email);
  if (flag)
  {
    alert("Your email address is correct!");
    return true;
  }
  else
  {
    alert("Please try again!");
    return false;
  }
}
// stop hiding -->
</script>
<body>
<form onsubmit="return verifyAddress(this);">
<input name="email" type="text">
<input type="submit">
</form>
</body>
Regular Expression object
This object holds a regular expression pattern together with flags that indicate how the pattern is applied.
Syntax 1: re = /pattern/[flags]
Syntax 2: re = new RegExp("pattern", ["flags"])
Parameters
re
Required. The name of the variable to which the regular expression pattern is assigned.
pattern
Required. The regular expression pattern to use. With Syntax 1, delimit the pattern with "/" characters. With Syntax 2, enclose the pattern in quotation marks.
flags
Optional. With Syntax 2, enclose the flags in quotation marks. Flags can be combined; the available flags are:
g (global search for all occurrences of pattern)
i (ignore case)
m (multiline search)
Example
The following example creates an object (re) that contains a regular expression pattern and its flags, to show how a regular expression object is used. The resulting object is then passed to the match method:
function MatchDemo()
{
  var r, re;                     // Declare variables.
  var s = "The rain in Spain falls mainly in the plain";
  re = new RegExp("ain", "g");   // Create the regular expression object.
  r = s.match(re);               // Find matches in the string s.
  return(r);
}
Return value: ain,ain,ain,ain
Properties: lastIndex property | source property
Methods: compile method | exec method | test method
Requirements: Version 3
See also: RegExp object | Regular expression syntax | String object
exec method
Runs a search on a string using a regular expression pattern and returns an array containing the results of that search.
rgExp.exec(str)
Parameters
rgExp
Required. A regular expression object that contains the regular expression pattern and applicable flags.
str
Required. The String object or string literal on which to perform the search.
Description
If the exec method does not find a match, it returns null. If it finds a match, exec returns an array and updates the properties of the global RegExp object to reflect the result. Element 0 of the array contains the entire match, while elements 1 to n contain any submatches that occurred within the match. This is equivalent to the behavior of the match method without the global flag (g) set.
If the global flag is set for the regular expression, exec starts searching at the position indicated by the value of lastIndex. If the global flag is not set, exec ignores lastIndex and searches from the beginning of the string.
In JScript, the array returned by the exec method has three properties: input, index, and lastIndex. The input property contains the entire searched string. The index property contains the position of the matched substring within the searched string. The lastIndex property contains the position following the last character of the match.
Example
The following example illustrates the use of the exec method:
function RegExpTest()
{
  var ver = Number(ScriptEngineMajorVersion() + "." + ScriptEngineMinorVersion());
  if (ver >= 5.5) {                // Test the JScript version.
    var src = "The rain in Spain falls mainly in the plain.";
    var re = /\w+/g;               // Create the regular expression pattern.
    var arr;
    while ((arr = re.exec(src)) != null)
      document.write(arr.index + "-" + arr.lastIndex + arr + "\t");
  }
  else {
    alert("Please use a newer version of JScript");
  }
}
Return value: 0-3The 4-8rain 9-11in 12-17Spain 18-23falls 24-30mainly 31-33in 34-37the 38-43plain
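The example above depends on JScript-specific features (ScriptEngineMajorVersion and a lastIndex property on the returned array). In standard JavaScript, lastIndex is a property of the regular expression object itself, so a comparable loop might look like the sketch below. The spans array is an assumption added so that the result can be inspected without document.write.

```javascript
const src = "The rain in Spain falls mainly in the plain.";
const re = /\w+/g; // the g flag makes exec resume from re.lastIndex
const spans = [];
let m;

while ((m = re.exec(src)) !== null) {
  // m.index is where the match starts; re.lastIndex is just past its end.
  spans.push(m.index + "-" + re.lastIndex + " " + m[0]);
}

console.log(spans.join("\t"));
```

Because the regular expression object carries lastIndex between calls, the same object should not be shared between unrelated loops without resetting lastIndex to 0.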
test method
Returns a Boolean value that indicates whether a pattern exists in the searched string.
rgExp.test(str)
Parameters
rgExp
Required. A regular expression object that contains the regular expression pattern and applicable flags.
str
Required. The string on which to perform the search.
Description
The test method checks whether a pattern exists in a string. It returns true if the pattern exists and false otherwise.
The properties of the global RegExp object are not modified by the test method.
Example
The following example illustrates the use of the test method:
function TestDemo(re, s)
{
  var s1;                          // Declare a variable.
  // Check whether the string contains the regular expression pattern.
  if (re.test(s))                  // Test for a match.
    s1 = " contains ";             // s contains the pattern.
  else
    s1 = " does not contain ";     // s does not contain the pattern.
  return("'" + s + "'" + s1 + "'" + re.source + "'");  // Return a string.
}
Function call: document.write(TestDemo(/ain+/, "The rain in Spain falls mainly in the plain."));
Return value: 'The rain in Spain falls mainly in the plain.' contains 'ain+'
match method
Runs a search on a string using a regular expression pattern and returns an array containing the results of that search.
stringObj.match(rgExp)
Parameters
stringObj
Required. The String object or string literal on which to perform the search.
rgExp
Required. A regular expression object that contains the regular expression pattern and applicable flags. It can also be a variable name or string literal that contains the regular expression pattern and flags.
Description
If the match method does not find a match, it returns null. If it finds a match, it returns an array and updates the properties of the global RegExp object to reflect the result.
In JScript, the array returned by the match method has three properties: input, index, and lastIndex. The input property contains the entire searched string. The index property contains the position of the matched substring within the searched string. The lastIndex property contains the position following the last character of the last match.
If the global flag (g) is not set, element 0 of the array contains the entire match, while elements 1 to n contain any submatches that occurred within the match. This is equivalent to the exec method without the global flag set. If the global flag is set, elements 0 to n contain all the matches that occurred.
Example
The following example illustrates the use of the match method:
function MatchDemo()
{
  var r, re;           // Declare variables.
  var s = "The rain in Spain falls mainly in the plain";
  re = /ain/i;         // Create the regular expression pattern.
  r = s.match(re);     // Attempt the match on the search string.
  return(r);           // Return the first occurrence of "ain".
}
Return value: ain
This example illustrates the use of the match method with the g flag set.
function MatchDemo()
{
  var r, re;           // Declare variables.
  var s = "The rain in Spain falls mainly in the plain";
  re = /ain/ig;        // Create the regular expression pattern.
  r = s.match(re);     // Attempt the match on the search string.
  return(r);           // The returned array contains all four occurrences of "ain".
}
Return value: ain,ain,ain,ain
The following lines of code show a string literal, rather than a regular expression object, used as the search pattern (here with the replace method):
var r, re = "Spain";
r = "The rain in Spain".replace(re, "Canada");
Value of r: The rain in Canada
search method
Returns the position of the first substring that matches the regular expression.
stringObj.search(rgExp)
Parameters
stringObj
Required. The String object or string literal on which to perform the search.
rgExp
Required. A regular expression object that contains the regular expression pattern and applicable flags.
Description
The search method indicates whether a match exists. If a match is found, search returns an integer giving the offset of the match from the beginning of the string. If no match is found, it returns -1.
Example
The following example illustrates the use of the search method.
function SearchDemo()
{
  var r, re;           // Declare variables.
  var s = "The rain in Spain falls mainly in the plain.";
  re = /falls/i;       // Create the regular expression pattern.
  r = s.search(re);    // Search the string.
  return(r);           // Return the offset of the match, or -1.
}
Return value: 18
Regular expression syntax
A regular expression is a text pattern made up of ordinary characters (for example, the letters a through z) and special characters known as metacharacters. The pattern describes one or more strings to match when searching a body of text. The regular expression serves as a template for matching a character pattern against the string being searched.
Here are some examples of regular expressions you might encounter:
JScript   VBScript   Matches
/^[ \t]*$/   "^[ \t]*$"   A blank line.
/\d{2}-\d{5}/   "\d{2}-\d{5}"   Validates an ID number consisting of 2 digits, a hyphen, and 5 digits.
/<(.*)>.*<\/\1>/   "<(.*)>.*<\/\1>"   An HTML tag.
The following table lists the metacharacters and their behavior in the context of regular expressions:
Character   Description
\   Marks the next character as a special character, a literal, a back reference, or an octal escape. For example, 'n' matches the character "n". '\n' matches a line break. The sequence '\\' matches "\" and '\(' matches "(".
^   Matches the position at the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position following '\n' or '\r'.
$   Matches the position at the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position preceding '\n' or '\r'.
*   Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+   Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?   Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" and "does". ? is equivalent to {0,1}.
{n}   n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,}   n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}   m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
?   When this character immediately follows any other qualifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of it as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
.   Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
(pattern)   Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $1...$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.
(?:pattern)   Matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
(?=pattern)   Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match; that is, the match is not captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000", but not "Windows" in "Windows 3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)   Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match; that is, the match is not captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1", but not "Windows" in "Windows 2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y   Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]   A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]   A negative character set. Matches any character that is not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z]   A character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter in the range 'a' to 'z'.
[^a-z]   A negative character range. Matches any character that is not in the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'.
\b   Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never", but not the 'er' in "verb".
\B   Matches a non-word boundary. 'er\B' matches the 'er' in "verb", but not the 'er' in "never".
\cx   Matches the control character indicated by x. For example, \cM matches Control-M, or carriage return. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\d   Matches a digit character. Equivalent to [0-9].
\D   Matches a non-digit character. Equivalent to [^0-9].
\f   Matches a form-feed character. Equivalent to \x0c and \cL.
\n   Matches a line break. Equivalent to \x0a and \cJ.
\r   Matches a carriage return character. Equivalent to \x0d and \cM.
\s   Matches any whitespace character, including space, tab, form-feed, and so on. Equivalent to [ \f\n\r\t\v].
\S   Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t   Matches a tab character. Equivalent to \x09 and \cI.
\v   Matches a vertical tab character. Equivalent to \x0b and \cK.
\w   Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W   Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn   Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num   Matches num, where num is a positive integer. A back reference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n   Identifies either an octal escape value or a back reference. If \n is preceded by at least n captured subexpressions, n is a back reference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm   Identifies either an octal escape value or a back reference. If \nm is preceded by at least nm captured subexpressions, nm is a back reference. If \nm is preceded by at least n captures, n is a back reference followed by the literal m. If neither condition holds and both n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml   Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
\un   Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
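A few of the less obvious entries in the table (non-greedy matching, lookahead, back references, and non-capturing groups) can be explored with the sketch below. The sample strings other than those quoted in the table ("balloon" and the two HTML fragments) are assumptions for illustration.

```javascript
// Greedy versus non-greedy matching on the same input.
const greedy = "oooo".match(/o+/)[0];
const lazy = "oooo".match(/o+?/)[0];

// Lookahead: the version is checked but is not part of the match.
// The match is "Windows " including the space written in the pattern.
const positive = "Windows 2000".match(/Windows (?=95|98|NT|2000)/);
const negativeOn31 = /Windows (?!95|98|NT|2000)/.test("Windows 3.1");
const negativeOn2000 = /Windows (?!95|98|NT|2000)/.test("Windows 2000");

// Back reference: \1 must repeat whatever group 1 captured.
const doubled = "balloon".match(/(.)\1/)[0];
const tagOk = /<(.*)>.*<\/\1>/.test("<b>bold</b>");
const tagBad = /<(.*)>.*<\/\1>/.test("<b>bold</i>");

// Non-capturing group: the alternation is grouped but nothing is stored,
// so the result array holds only the full match.
const nonCapturing = "industries".match(/industr(?:y|ies)/);

console.log(greedy, lazy, positive[0], doubled, nonCapturing.length);
```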
Order of precedence
Once constructed, a regular expression is evaluated much like an arithmetic expression: from left to right, following an order of precedence.
The following table lists the regular expression operators from highest to lowest precedence:
Operator   Description
\   Escape character
(), (?:), (?=), []   Parentheses and square brackets
*, +, ?, {n}, {n,}, {n,m}   Qualifiers
^, $, \anymetacharacter   Positions and sequences
|   "Or" operation
Ordinary characters
Ordinary characters consist of all printable and non-printable characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some symbols.
The simplest regular expression is a single ordinary character that matches itself in the searched string. For example, the single-character pattern 'A' matches the letter 'A' wherever it appears in the searched string. Here are some examples of single-character regular expression patterns:
/a/
/7/
/M/
The equivalent VBScript single-character regular expressions are:
"a"
"7"
"M"
You can combine several single characters to form a larger expression. For example, the following JScript regular expression is simply an expression created by combining the single-character expressions 'a', '7', and 'M'.
/a7M/
The equivalent VBScript expression is:
"a7M"
Note that there is no concatenation operator. All you need to do is place one character after another.
Common regular expressions
// Check that the string consists entirely of digits (1-20 of them).
function isDigit(s)
{
  var patrn = /^[0-9]{1,20}$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check a login name: 5-20 characters, beginning with a letter,
// and containing only letters, digits, "_", and ".".
function isRegisterUserName(s)
{
  var patrn = /^[a-zA-Z]{1}([a-zA-Z0-9]|[._]){4,19}$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check a user's real name: 1-30 letters.
function isTrueName(s)
{
  var patrn = /^[a-zA-Z]{1,30}$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check a password: 6-20 characters; letters, digits, and underscores only.
function isPasswd(s)
{
  var patrn = /^(\w){6,20}$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check a landline or fax number: may start with "+";
// besides digits, it may contain "-".
function isTel(s)
{
  // Earlier version of the pattern, which did not allow spaces between digits:
  // var patrn = /^[+]{0,1}(\d){1,3}[ ]?([-]?(\d){1,12})+$/;
  var patrn = /^[+]{0,1}(\d){1,3}[ ]?([-]?((\d)|[ ]){1,12})+$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check a mobile phone number: must start with a digit;
// besides digits, it may contain "-".
function isMobil(s)
{
  var patrn = /^[+]{0,1}(\d){1,3}[ ]?([-]?((\d)|[ ]){1,12})+$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check a postal code: 3-12 letters or digits.
function isPostalCode(s)
{
  var patrn = /^[a-zA-Z0-9]{3,12}$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check a search keyword: 1-20 characters, none of them a special symbol.
function isSearch(s)
{
  var patrn = /^[^`~!@#$%^&*()+=|\\\][\]\{\}:;'\,.<>\/?]{1}[^`~!@$%^&()+=|\\\][\]\{\}:;'\,.<>?]{0,19}$/;
  if (!patrn.exec(s)) return false;
  return true;
}
// Check an IP address: 1-20 characters, digits and dots only (by zergling).
function isIP(s)
{
  var patrn = /^[0-9.]{1,20}$/;
  if (!patrn.exec(s)) return false;
  return true;
}
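The IP pattern above only checks that the string is made of digits and dots, so a string such as "1...2" passes. A stricter check might look like the sketch below; isIPv4 is a hypothetical name, and the sketch assumes dotted-decimal IPv4 addresses with each part no larger than 255.

```javascript
// Hypothetical stricter check, assuming dotted-decimal IPv4 notation.
function isIPv4(s) {
  // Four groups of 1-3 digits separated by dots, and nothing else.
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(s);
  if (!m) return false;
  // m[1] to m[4] hold the four captured parts; each must be at most 255.
  return m.slice(1).every(function (part) {
    return Number(part) <= 255;
  });
}

// The looser digits-and-dots pattern, shown for comparison.
const loosePattern = /^[0-9.]{1,20}$/;

console.log(isIPv4("192.168.0.1"), isIPv4("256.1.1.1"), loosePattern.test("1...2"));
```

The range check is done in code rather than in the pattern because a pattern that enforces 0-255 directly is noticeably harder to read.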