Regular expression parsing in JavaScript
A regular expression is an object that describes a character pattern. JavaScript's RegExp objects and String objects define methods that use regular expressions to perform powerful pattern matching and text search-and-replace operations.
In JavaScript, a regular expression is represented by a RegExp object, and you can create a RegExp object in two ways:
One is to use the RegExp() constructor.
The other is the literal form, a special syntax added in JavaScript 1.2. Just as a string literal is defined as characters enclosed in quotation marks, a regular expression literal is defined as characters enclosed between a pair of slashes (/). So code that creates a RegExp object might look like this:
var pattern = /s$/; or var pattern = new RegExp("s$");
Either line creates a new RegExp object and assigns it to the variable pattern. This RegExp object matches any string that ends with the letter "s".
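As a minimal sketch of the two creation styles described above, the snippet below builds the same pattern both ways and checks that they behave alike. The helper name endsWithS is hypothetical and used only for illustration.

```javascript
// Two equivalent ways to build a pattern that matches strings ending in "s".
var literalPattern = /s$/;
var constructedPattern = new RegExp("s$");

// Hypothetical helper: true only when both patterns agree that text ends in "s".
function endsWithS(text) {
  return literalPattern.test(text) && constructedPattern.test(text);
}

console.log(endsWithS("cats")); // true
console.log(endsWithS("cat"));  // false
```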
Literal form or RegExp constructor
Literal form: /pattern/flags
RegExp constructor: new RegExp("pattern"[, "flags"]);
Parameter description:
pattern -- the text of the regular expression
flags -- if present, one of the following values:
g: global match
i: ignore case
gi: both of the above combined
m: multi-line matching
[Note] The parts of the literal form are not quoted, whereas the arguments passed to the constructor are strings and must be quoted.
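The following sketch illustrates the quoting difference from the note. One detail the note does not state: because the constructor takes ordinary strings, each backslash in the pattern must be doubled there. The variable names are illustrative only.

```javascript
// The same pattern written as a literal and as constructor arguments.
// In the string form every backslash must be doubled.
var digitsLiteral = /\d+/gi;
var digitsConstructed = new RegExp("\\d+", "gi");

console.log(digitsLiteral.source === digitsConstructed.source);      // true
console.log(digitsConstructed.global, digitsConstructed.ignoreCase); // true true
console.log("a1b22".match(digitsConstructed));                       // [ '1', '22' ]
```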
1 Literal characters
All alphabetic characters and digits in a regular expression match themselves literally. JavaScript regular expressions also support certain non-alphabetic characters through escape sequences that begin with a backslash (\).
For example, the sequence "\n" matches a literal line break in a string. Many punctuation marks also have special meanings in regular expressions. Here are the escape sequences and what they match:
\f form feed
\n line feed (newline)
\r carriage return
\t tab
\v vertical tab
\/ a literal /
\\ a literal \
\. a literal .
\* a literal *
\+ a literal +
\? a literal ?
\| a literal |
\( a literal (
\) a literal )
\[ a literal [
\] a literal ]
\{ a literal {
\} a literal }
\- a literal -
\xxx the character specified by the octal number xxx
\xnn the character specified by the hexadecimal number nn
\cX the control character ^X. For example, \cI is equivalent to \t, and \cJ is equivalent to \n
If you want to use one of these special punctuation characters literally in a regular expression, you must precede it with a "\".
^ matches the beginning of the string and, in multi-line mode, the beginning of a line
$ matches the end of the string and, in multi-line mode, the end of a line
\b matches a word boundary, that is, the position between a \w character and a \W character (note: [\b] matches a backspace)
\B matches a position that is not a word boundary
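A short sketch of escaped punctuation, anchors, and word boundaries in use. The sample strings and variable names are made up for illustration.

```javascript
// "$" and "." are escaped so that they match literally.
var price = /^\$\d+\.\d{2}$/;
console.log(price.test("$12.50")); // true
console.log(price.test("12x50"));  // false

// \b on both sides matches "java" only as a whole word.
var wholeWord = /\bjava\b/;
console.log(wholeWord.test("I like java"));       // true
console.log(wholeWord.test("I like javascript")); // false

// With the m flag, ^ also matches at the start of each line.
var lineStarts = /^\w+/gm;
console.log("one two\nthree four".match(lineStarts)); // [ 'one', 'three' ]
```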
2 Character classes
You can combine individual literal characters into a character class by placing them inside square brackets. A character class matches any one of the characters it contains, so the regular expression /[abc]/ matches any one of the letters "a", "b", or "c". You can also define a negated character class, which matches any character except those inside the brackets. A negated character class is written with a ^ as the first character after the left bracket.
Character classes for regular expressions:
[...] any one character inside the brackets, such as [abc]
[^...] any one character not inside the brackets, such as [^abc]
. any single character other than a line break (roughly [^\n\r])
\w any single word character, equivalent to [a-zA-Z0-9_]
\W any single non-word character, equivalent to [^a-zA-Z0-9_]
\s any whitespace character, including [ \t\n\r\f\v]
\S any non-whitespace character
\d any digit, equivalent to [0-9]
\D any character other than a digit, equivalent to [^0-9]
[\b] a literal backspace (special case)
| alternation ("or"). /t(a|e|i|oo)n/ matches tan, ten, tin, and toon.
(...) grouping. Combines several items into a single unit that can be used with *, +, ?, and |, and remembers the characters matched by the group so that later references can use them
\n a backreference that matches the same characters that the nth group matched. Groups are sub-expressions (possibly nested) in parentheses, and the group number is found by counting opening parentheses from left to right
To match any character, including line breaks, use the following expression:
([\s\S]*)
It can also be written as ([\d\D]*) or ([\w\W]*).
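The sketch below exercises alternation, a backreference, and the [\s\S] idiom from this section. The patterns and sample strings are illustrative assumptions, not part of the original text.

```javascript
// Alternation inside a group.
var vowelWord = /^t(a|e|i|oo)n$/;
console.log(vowelWord.test("toon")); // true
console.log(vowelWord.test("tun"));  // false

// \1 must match the same quote character that group 1 captured.
var quoted = /(['"])[^'"]*\1/;
console.log(quoted.test("say 'hi'"));   // true
console.log(quoted.test("say 'hi\""));  // false, the quotes do not pair up

// "." stops at a line break, while [\s\S] matches any character.
var dotPattern = /a.b/;
var anyCharPattern = /a[\s\S]b/;
console.log(dotPattern.test("a\nb"));     // false
console.log(anyCharPattern.test("a\nb")); // true
```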
3 Quantifiers (number of matches)
{n,m} matches the previous item at least n times but no more than m times
{n,} matches the previous item n or more times
{n} matches the previous item exactly n times
? matches the previous item 0 or 1 times, which means the previous item is optional. Equivalent to {0,1}
+ matches the previous item 1 or more times. Equivalent to {1,}
* matches the previous item 0 or more times. Equivalent to {0,}
For example:
/\d{2,4}/ matches a run of two to four digits.
/\w{3}\d?/ matches three word characters followed by an optional digit.
/\s+java\s+/ matches the string "java" with one or more whitespace characters before and after it.
/[^"]*/ matches zero or more characters that are not quotation marks.
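To see what the four example patterns actually capture, here is a small sketch that applies each of them to an invented sample string.

```javascript
// Runs of two to four digits; a six-digit run is split into 4 + 2.
var digitRuns = "7 42 2024 123456".match(/\d{2,4}/g);
console.log(digitRuns); // [ '42', '2024', '1234', '56' ]

// Three word characters followed by an optional digit.
var withDigit = /\w{3}\d?/.exec("abc1")[0];
var withoutDigit = /\w{3}\d?/.exec("abcd")[0];
console.log(withDigit, withoutDigit); // abc1 abc

// "java" surrounded by whitespace on both sides.
var spacedJava = /\s+java\s+/;
console.log(spacedJava.test("I use  java  daily")); // true
console.log(spacedJava.test("javascript"));         // false

// Everything up to (but not including) the first quotation mark.
var beforeQuote = 'say "hi"'.match(/[^"]*/)[0];
console.log(JSON.stringify(beforeQuote)); // "say "
```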
4 Regular expression flags
The last element of regular expression syntax is the flags, which specify rules for advanced pattern matching. Unlike the rest of the syntax, flags are written outside the slashes: they appear after the second slash rather than between the two. JavaScript 1.2 supports two flags; the m flag listed earlier was added in a later version.
The i flag specifies that pattern matching should be case-insensitive.
The g flag specifies that pattern matching should be global, that is, it finds all matches rather than stopping after the first one. The two flags can be combined to perform a global, case-insensitive match.
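A minimal sketch of how the i and g flags change the result of the same search. The sample text is invented for illustration.

```javascript
var text = "Java and JAVA and java";

// No flags: case-sensitive, first match only.
var plain = text.match(/java/);
// i: case-insensitive, still only the first match.
var insensitive = text.match(/java/i);
// gi: every match, regardless of case.
var all = text.match(/java/gi);

console.log(plain[0], plain.index);             // java 18
console.log(insensitive[0], insensitive.index); // Java 0
console.log(all);                               // [ 'Java', 'JAVA', 'java' ]
```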
Static properties
The descriptions below follow the static properties of the global RegExp object in older browsers. In standard JavaScript, index is reported on the array returned by exec() or match(), and lastIndex belongs to each RegExp instance, where it starts at 0.
The index property. The starting position of the first match for the current pattern, counting from 0. Its initial value is -1, and it changes each time a successful match is made.
The input property. The string the match was performed against, abbreviated $_. Its initial value is the empty string "".
The lastIndex property. The position following the last character of the first match for the current pattern, counting from 0. It is often used as the starting position when a search is resumed. Its initial value is -1, which means the search starts from the beginning, and it changes each time a successful match is made.
The lastMatch property. The last string matched by the current pattern, abbreviated $&. Its initial value is the empty string "", and it changes with each successful match.
5 Methods of regular expressions
test() method
Usage: regexpObject.test(string)
Return value: true if the string matches the pattern defined by the RegExp instance, otherwise false.
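A small sketch of test() in use. The second half shows a behavior not covered in the text above: with the g flag, test() also advances lastIndex, so repeated calls on the same object can give different answers. The helper name isAllDigits is hypothetical.

```javascript
// Hypothetical helper: true when value consists only of digits.
function isAllDigits(value) {
  return /^\d+$/.test(value);
}
console.log(isAllDigits("2024")); // true
console.log(isAllDigits("20a4")); // false

// With the g flag the RegExp object keeps state between calls.
var globalDigit = /\d/g;
console.log(globalDigit.test("a1")); // true, lastIndex is now 2
console.log(globalDigit.test("a1")); // false, the search started at index 2
console.log(globalDigit.lastIndex);  // 0, reset after the failed match
```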
exec() method
Usage: regexpObject.exec(string)
Return value: an array if exec() finds matching text, otherwise null.
Element 0 of the array is the text that matched the whole regular expression. If the regular expression contains no groups, the array holds only element 0. If it does contain groups, element 1 is the text matched by the 1st group (if any), element 2 is the text matched by the 2nd group (if any), and so on. In addition to the array elements and the length property, the array returned by exec() has two more properties: index gives the position of the first character of the matched text, and input holds the string that was searched. When exec() is called on a non-global RegExp object, the returned array is the same as the one returned by String.match().
When regexpObject is a global regular expression (g flag), the behavior of exec() is slightly more complex. It begins searching the string at the position given by the lastIndex property of regexpObject. When exec() finds text that matches, it sets lastIndex to the position following the last character of the matched text. This means you can iterate over all the matches in a string by calling exec() repeatedly. When exec() can find no more matches, it returns null and resets lastIndex to 0.
Note: If you want to search a new string after finding a match in another string (before exec() has returned null), you must manually reset lastIndex to 0.
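The repeated-exec() iteration described above can be sketched as follows. The function name findAll and the shape of its result objects are assumptions made for this example.

```javascript
// Hypothetical helper: collect every match of a global pattern using exec().
function findAll(pattern, text) {
  if (!pattern.global) {
    throw new Error("findAll needs a pattern with the g flag");
  }
  var results = [];
  var m;
  pattern.lastIndex = 0; // start from the beginning of the new string
  while ((m = pattern.exec(text)) !== null) {
    results.push({ text: m[0], group: m[1], index: m.index, next: pattern.lastIndex });
    if (m[0] === "") {
      pattern.lastIndex++; // step past empty matches to avoid an endless loop
    }
  }
  return results;
}

var found = findAll(/a(\d)/g, "a1 b2 a3");
console.log(found.length);                  // 2
console.log(found[0].text, found[0].index); // a1 0
console.log(found[1].text, found[1].index); // a3 6
```

Resetting lastIndex at the top of the helper is the manual reset mentioned in the note, so the same RegExp object can safely be reused on another string.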
String object methods:
match() method:
Usage: stringObject.match(regexpObject)
Return value: an array
The match() method searches stringObject for one or more pieces of text that match the regexp. Its behavior depends largely on whether the regexp has the g flag.
If the regexp does not have the g flag, match() performs only one match in stringObject. If no matching text is found, match() returns null. Otherwise, it returns an array holding information about the match. Element 0 holds the text matched by the whole regular expression; if the regexp contains groups, element 1 is the text matched by the 1st group (if any), element 2 is the text matched by the 2nd group (if any), and so on. In addition to these ordinary array elements, the returned array has two properties: index gives the position in stringObject of the first character of the matched text, and input is a reference to stringObject.
If the regexp has the g flag, match() performs a global search and finds all substrings of stringObject that match the regular expression. If no matching substring is found, null is returned. If one or more are found, an array is returned. However, the contents of this array are very different from the non-global case: its elements are all the matched substrings in stringObject, without the substrings matched by groups, and it has no index or input property.
Note: In global mode (g), match() provides neither the text matched by groups nor the position of each matched substring. If you need that information for a global search, use RegExp.exec().
The following example shows the use of the match() method in JavaScript:
function matchDemo() {
  var r, re; // Declare variables.
  var s = "The rain in Spain falls mainly in the plain";
  re = /ain/i; // Create the regular expression pattern.
  r = s.match(re); // Try to match the search string.
  return r; // Return the first occurrence of "ain".
}
This example shows the use of the match() method with the g flag set:
function matchDemo() {
  var r, re; // Declare variables.
  var s = "The rain in Spain falls mainly in the plain";
  re = /ain/ig; // Create the regular expression pattern.
  r = s.match(re); // Try to match the search string.
  return r; // The returned array contains all four
            // occurrences of "ain".
}
search() method:
Usage: stringObject.search(regexpObject)
Return value: the search() method indicates whether a match exists. If a match is found, it returns an integer giving the offset of the match from the beginning of the string. If no match is found, it returns -1.
Note: Only the position of the first substring that matches the regular expression is returned, so using the global flag g with search() has no effect.
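A brief sketch of search() return values, including the not-found case and the ignored g flag. The sample string is an assumption for illustration.

```javascript
var sentence = "The rain in Spain";

var firstAin = sentence.search(/ain/);    // offset of the first match
var upperAin = sentence.search(/AIN/i);   // case-insensitive search
var missing = sentence.search(/xyz/);     // no match
var withGlobal = sentence.search(/ain/g); // g makes no difference

console.log(firstAin, upperAin, missing, withGlobal); // 5 5 -1 5
```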
replace() method:
Usage: stringObject.replace(regexpObject | string, "replacement string")
Return value: the string after replacement. If the RegExp object has the global flag g, every match is replaced; otherwise only the first match is replaced.
Note: replace() also accepts a plain string as its first argument, but then only the first occurrence of that string is replaced.
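The three cases above can be sketched as follows. The last line additionally uses $1 and $2 in the replacement string to refer to captured groups, which the text above does not cover; the sample strings are invented.

```javascript
var pets = "cat, hat, cat";

var firstOnly = pets.replace(/cat/, "dog");   // no g flag: first match only
var everyMatch = pets.replace(/cat/g, "dog"); // g flag: all matches
var plainString = pets.replace("cat", "dog"); // string pattern: first occurrence only

console.log(firstOnly);   // dog, hat, cat
console.log(everyMatch);  // dog, hat, dog
console.log(plainString); // dog, hat, cat

// $1 and $2 in the replacement refer to the captured groups.
var swapped = "John Smith".replace(/(\w+)\s(\w+)/, "$2, $1");
console.log(swapped); // Smith, John
```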
*************************
How the exec() method differs from the match() method:
1 For regular expressions without groups
If the regular expression on which exec() is called has no groups (nothing enclosed in parentheses), then when there is a match it returns an array with a single element, the first string that matches the regular expression; it returns null when there is no match.
The following two alert calls display the same message:
var str = "cat,hat";
var p = /at/; // no g flag
alert(p.exec(str));
alert(str.match(p));
Both display "at". In this case exec() is equivalent to match().
But if the regular expression is global (g flag), the results of the same code differ:
var str = "cat,hat";
var p = /at/g; // note the g flag
alert(p.exec(str));
alert(str.match(p));
The results are, respectively:
"at"
"at,at"
This is because each call to exec() on a regular expression without groups returns only one match, whereas match() returns all the matching text when the regular expression has the g flag.
2 For regular expressions with groups
If exec() finds a match and the regular expression contains groups, the returned array contains more than one element: the first element is the string matched by the whole regular expression, and the following elements are the strings matched by the first, second, and subsequent groups (the captured groups).
The following code displays "cat2,at":
var str = "cat2,hat8";
var p = /c(at)\d/;
alert(p.exec(str));
The first element is the string "cat2", which matches /c(at)\d/, and the next element is "at", which matches the group (at) in parentheses.
var someText = "web2.0 .net2.0";
var pattern = /(\w+)(\d)\.(\d)/g;
var outcome_exec = pattern.exec(someText);
var outcome_matc = someText.match(pattern);
Analysis:
The value of outcome_exec: on this first call the g flag does not change what exec() returns, so exec() matches the first matching substring, "web2.0", and stores it as the first element of the returned array. Because pattern contains three groups ((\w+), (\d), (\d)), the array also contains three more elements, "web", "2", and "0", so the final result of exec() is ["web2.0", "web", "2", "0"].
The value of outcome_matc: because pattern is global, match() returns all the substrings that match the regular expression, so the resulting array is ["web2.0", "net2.0"]. If pattern did not have the g flag, the result would be the same as outcome_exec.
*************************
6 Common JavaScript regular expressions
Commonly used when validating forms in JavaScript. Patterns shown in quotation marks are the pattern text; when one is written as a string literal for new RegExp(), each backslash must be doubled.
"^-[0-9]*[1-9][0-9]*$" // negative integer
/^[0-9]*[1-9][0-9]*$/ // positive integer
/^[1-9]{1}[0-9]{0,1}$/ // 1~99
"^-?\d+$" // integer
"^\d+(\.\d+)?$" // non-negative floating-point number (positive floating-point number + 0)
"^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$" // positive floating-point number
"^((-\d+(\.\d+)?)|(0+(\.0+)?))$" // non-positive floating-point number (negative floating-point number + 0)
"^(-(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*)))$" // negative floating-point number
"^(-?\d+)(\.\d+)?$" // floating-point number
"^[A-Za-z]+$" // string consisting of the 26 English letters
"^[A-Z]+$" // string consisting of the 26 uppercase English letters
"^[a-z]+$" // string consisting of the 26 lowercase English letters
"^[A-Za-z0-9]+$" // string consisting of digits and the 26 English letters
"^\w+$" // string consisting of digits, the 26 English letters, or underscores
"^[a-zA-Z]+://(\w+(-\w+)*)(\.(\w+(-\w+)*))*(\?\S*)?$" // URL
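One possible way to use such patterns for form validation is sketched below, with a few of the expressions from the list written as regular expression literals. The validators table and the check function are hypothetical names introduced for this example.

```javascript
// A few of the listed patterns, written as literals.
var validators = {
  integer: /^-?\d+$/,
  positiveInteger: /^[0-9]*[1-9][0-9]*$/,
  nonNegativeFloat: /^\d+(\.\d+)?$/,
  letters: /^[A-Za-z]+$/
};

// Hypothetical helper: test a value against one of the named patterns.
function check(kind, value) {
  return validators[kind].test(value);
}

console.log(check("integer", "-42"));           // true
console.log(check("integer", "4.2"));           // false
console.log(check("positiveInteger", "007"));   // true
console.log(check("positiveInteger", "0"));     // false
console.log(check("nonNegativeFloat", "3.14")); // true
console.log(check("letters", "abc1"));          // false
```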
Check an ordinary telephone or fax number: it may start with "+" (the country code) and, apart from digits, may contain "-".
var patrn = /^([+]{0,1}(\d){1,2}[-]?)?((\d){3,4}([-]?(\d){6,9}))$/;
Regular expression for the YYYY/MM/DD HH:MM format
/^((((1[6-9]|[2-9]\d)\d{2})\/(0?[13578]|1[02])\/(0?[1-9]|[12]\d|3[01]))|(((1[6-9]|[2-9]\d)\d{2})\/(0?[13456789]|1[012])\/(0?[1-9]|[12]\d|30))|(((1[6-9]|[2-9]\d)\d{2})\/0?2\/(0?[1-9]|1\d|2[0-8]))|(((1[6-9]|[2-9]\d)(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00))\/0?2\/29))\s(0\d{1}|1\d{1}|2[0-3]):([0-5]\d{1})$/
Regular expression for the YYYY/MM/DD format
/^((((1[6-9]|[2-9]\d)\d{2})\/(0?[13578]|1[02])\/(0?[1-9]|[12]\d|3[01]))|(((1[6-9]|[2-9]\d)\d{2})\/(0?[13456789]|1[012])\/(0?[1-9]|[12]\d|30))|(((1[6-9]|[2-9]\d)\d{2})\/0?2\/(0?[1-9]|1\d|2[0-8]))|(((1[6-9]|[2-9]\d)(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00))\/0?2\/29))$/
Regular expression for the YYYY-MM-DD format
/^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)-02-29)$/
Regular expression for the YYYY-MM-DD HH:MM:SS format
/^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)-02-29)\s+([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$/
Years 0001-9999, in the format YYYY-MM-DD or YYYY-M-D; the separator may be absent or one of "-", "/", ".".
/^(?:(?!0000)[0-9]{4}([-/.]?)(?:(?:0?[1-9]|1[0-2])\1(?:0?[1-9]|1[0-9]|2[0-8])|(?:0?[13-9]|1[0-2])\1(?:29|30)|(?:0?[13578]|1[02])\1(?:31))|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)([-/.]?)0?2\2(?:29))$/
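As a rough check of what a calendar-aware date pattern accepts, the sketch below wraps a YYYY-MM-DD expression of the kind listed above in a helper and tries a few edge cases such as leap days. The function name isValidDate is hypothetical.

```javascript
// YYYY-MM-DD with month lengths and leap years taken into account.
var isoDate = /^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)-02-29)$/;

// Hypothetical helper: true when text is a real calendar date.
function isValidDate(text) {
  return isoDate.test(text);
}

console.log(isValidDate("2024-02-29")); // true, 2024 is a leap year
console.log(isValidDate("2023-02-29")); // false
console.log(isValidDate("1900-02-29")); // false, 1900 is not a leap year
console.log(isValidDate("2000-02-29")); // true
console.log(isValidDate("2023-04-31")); // false, April has 30 days
console.log(isValidDate("2023-12-31")); // true
```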
Time format HH
patternsdict.time_h = /^(0\d{1}|1\d{1}|2[0-3])$/;
Time format HH:MM
patternsdict.time_hm = /^(0\d{1}|1\d{1}|2[0-3]):([0-5]\d{1})$/;
Time format HH:MM:SS
patternsdict.time_hms = /^(0\d{1}|1\d{1}|2[0-3]):[0-5]\d{1}:([0-5]\d{1})$/;
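Because the HH:MM pattern groups the hour and minute parts, exec() can be used to pull them out as well as validate. A minimal sketch, in which parseTime is a hypothetical helper name:

```javascript
// Same shape as the HH:MM pattern above, with \d{1} shortened to \d.
var timeHm = /^(0\d|1\d|2[0-3]):([0-5]\d)$/;

// Hypothetical helper: return the parsed parts, or null for invalid input.
function parseTime(text) {
  var m = timeHm.exec(text);
  if (m === null) {
    return null;
  }
  return { hours: Number(m[1]), minutes: Number(m[2]) };
}

console.log(parseTime("09:30")); // { hours: 9, minutes: 30 }
console.log(parseTime("24:00")); // null, hours stop at 23
console.log(parseTime("9:30"));  // null, two hour digits are required
```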
Match only "show" and "show1" through "show9"; strings such as "showddd" are not matched.
^show[1-9]?$
Regular expression for validating an IP address
var ipAddress = "20.26.43.111";
var re = /^([0-9]|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.([0-9]|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.([0-9]|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.([0-9]|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])$/;
if (!re.test(ipAddress)) {
  alert("IP address format is incorrect, please modify");
  return false;
}
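The four identical octet groups can also be built once and repeated. The sketch below is one way to do that with the RegExp constructor; isValidIp and the variable names are assumptions for this example, and note the doubled backslashes required inside the strings.

```javascript
// One octet: 0-9, 10-99, 100-199, 200-249, or 250-255.
var octet = "(?:[0-9]|[1-9]\\d|1\\d\\d|2[0-4]\\d|25[0-5])";
// An octet followed by three more, each preceded by a literal dot.
var ipPattern = new RegExp("^" + octet + "(?:\\." + octet + "){3}$");

// Hypothetical helper: true when text is a dotted-quad IPv4 address.
function isValidIp(text) {
  return ipPattern.test(text);
}

console.log(isValidIp("20.26.43.111")); // true
console.log(isValidIp("256.1.1.1"));    // false, 256 is out of range
console.log(isValidIp("1.2.3"));        // false, only three parts
console.log(isValidIp("01.2.3.4"));     // false, leading zero
```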
Regular expression validation of the email format
function testMail(mail, mess) {
  // Validate the email field whose element id is mail; mess is the error message.
  var m = jQuery.trim($("#" + mail).val());
  // Host name without _ or -:
  // var vm = new RegExp("^[a-zA-Z0-9]{1,}@{1}[a-zA-Z0-9]{1,}\\.{1}[a-zA-Z0-9]{3,}$", "i");
  // Host name with _ or -:
  var vm = new RegExp("^[a-zA-Z0-9]{1,}@{1}[a-zA-Z0-9]{1,}[a-zA-Z0-9_\\-]*(\\.{1}[a-zA-Z0-9]{2,})+$", "i");
  if (m.match(vm)) {
    $("#" + mail).next().remove();
  } else {
    $("#" + mail).next().remove();
    $("#" + mail).after("<span>" + mess + "</span>");
  }
}
Regular expression validation of the URL format:
function checkUrl(tt) {
  var url = "http://" + $(tt).val(); // prepend the scheme to the entered site name
  alert(url);
  var reg = /^http:\/\/(\w+)([_|.|\-|\/|\\]?(\w+))*$/;
  if (url.match(reg)) {
    alert("true");
    return true;
  } else {
    alert("KKKK");
    $(this).parent().find(":last").empty();
    $(this).parent().find(":last").append("Please fill in the correct site name!");
    return false;
  }
}
Regular expression check for Chinese characters (true if the input contains Chinese characters)
function checkHanzi(tt) {
  var hanzi = $(tt).val();
  // Match Chinese characters (any character outside the range \x00-\xff)
  var reg = /[^\x00-\xff]/;
  if (hanzi.match(reg)) {
    alert("true");
    return true;
  } else {
    alert("KKKK");
    $(this).parent().find(":last").empty();
    $(this).parent().find(":last").append("Please fill in the correct site name!");
    return false;
  }
}
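The class [^\x00-\xff] matches any character outside the single-byte range, which is broader than Chinese characters alone. The sketch below contrasts it with the \u4e00-\u9fa5 range used elsewhere in this article; the variable names and sample strings are chosen for illustration.

```javascript
var nonLatin1 = /[^\x00-\xff]/; // anything outside \x00-\xff
var cjk = /[\u4e00-\u9fa5]/;    // the CJK ideograph range used in this article

console.log(nonLatin1.test("abc"));  // false
console.log(nonLatin1.test("中文")); // true
console.log(cjk.test("中文"));       // true

// Japanese kana are outside \x00-\xff but not in the CJK range.
console.log(nonLatin1.test("かな")); // true
console.log(cjk.test("かな"));       // false
```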
Regular expression check that a keyword contains no special characters
// (function header omitted in the source; url holds the keyword being checked)
var reg = /^[^@#!%$¥&^*+\-,.?\s=|]+$/;
if (url.match(reg)) {
  alert("does not contain special characters such as spaces");
  return true;
} else {
  alert("KKKK");
  $(this).parent().find(":last").empty();
  $(this).parent().find(":last").append("contains special characters!");
  return false;
}
}
Regular expression check that keywords contain only Chinese characters, letters, and digits, separated by commas:
// (function header omitted in the source; url holds the keywords being checked)
var reg = /^([\w\d\u4e00-\u9fa5],?)+$/;
if (url.match(reg)) {
  alert("true");
  return true;
} else {
  alert("KKKK");
  $(this).parent().find(":last").empty();
  $(this).parent().find(":last").append("Please separate with commas!");
  return false;
}
}
Only numbers can be entered
function onlyNumber(e) {
  e = e || window.event;
  var k = e.keyCode || e.which;
  // Allowed keys: decimal point (110, 190), Delete (46), Backspace (8), digits (48-57),
  // numeric keypad digits (96-105), arrow keys (37-40), and minus (189).
  if ((k == 110) || (k == 190) || (k == 46) || (k == 8) || (k >= 48 && k <= 57) || (k >= 96 && k <= 105) || (k >= 37 && k <= 40) || (k == 189)) {
    return;
  }
  // Block every other key.
  if (e.preventDefault) {
    e.preventDefault(); // standard browsers such as Firefox
  } else {
    e.returnValue = false; // old Internet Explorer
  }
}
<input type="text" onkeydown="onlyNumber(event)" />
/**
 * Rounding
 * v: the value
 * e: the number of decimal places to keep
 */
function roundDigits(v, e) {
  var t = 1;
  for (; e > 0; t *= 10, e--);
  for (; e < 0; t /= 10, e++);
  return Math.round(v * t) / t;
}
alert(roundDigits(12.5656565, 2));