JS Regular Expressions: Syntax Explained in Detail (II)

Source: Internet
Author: User

1. Defining regular expressions

1) There are two ways to define a regular expression: the literal form and the constructor form.
2) Literal form: var reg = /expression/flags;
Expression: a string that describes a rule. Special characters can be used in it to express special rules; they are described in detail later.
Flags (the additional parameters): extend the meaning of the expression. There are three main flags:
g: performs a global match.
i: performs a case-insensitive match.
m: performs a multiline match.
The three flags can be combined in any way to express a compound meaning, and they can also be omitted entirely.
Example:
var reg = /a*b/;
var reg = /abc+f/g;
3) Constructor form: var reg = new RegExp("expression", "flags");
The expression and the flags have the same meaning as in the literal form above.
Example:
var reg = new RegExp("a*b");
var reg = new RegExp("abc+f", "g");
4) The difference between the literal form and the constructor form
In the literal form the expression must be a constant. In the constructor form the expression can be a constant string or a JS variable, for example an expression built from the user's input:
var reg = new RegExp(document.forms[0].exprfiled.value, "g");
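
The two forms can be compared side by side in a minimal sketch. The variable names and the `userInput` value are made up for illustration. One detail worth noting is that a backslash inside the constructor's string argument has to be doubled.

```javascript
// Literal form: the pattern is fixed when the script is written.
const literal = /a\d+b/g;

// Constructor form: the pattern is a string, so "\d" must be written "\\d".
const built = new RegExp("a\\d+b", "g");

// Both forms describe the same pattern text.
const sameSource = literal.source === built.source;

// Hypothetical user input used to build a pattern at run time.
const userInput = "b+";
const fromInput = new RegExp("a" + userInput, "i");
const inputMatch = fromInput.exec("xxABBBc")[0];

console.log(sameSource, inputMatch);
```

Text taken from a user is interpreted as pattern syntax, so characters such as + or ( in the input keep their special meaning.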

2. Expression patterns

1) The expression pattern is the way the expression itself is written, that is, how the "expression" part of var reg = /expression/flags is described.
2) Patterns fall into two kinds: simple patterns and compound patterns.
3) Simple pattern: a pattern made up only of ordinary characters, for example:
var reg = /abc0d/;
A simple pattern can only express one specific match.
4) Compound pattern: a pattern that contains special characters (metacharacters), for example:
var reg = /a+b?\w/;
Here +, ? and \w are special characters with a particular meaning, so a compound pattern can express more abstract logic.
The rest of this section focuses on the meaning and use of the special characters in compound patterns.
5) Explanation of the special characters in compound patterns:

1>\: Used as the escape character, as in many programming languages. In general:
When \ is followed by an ordinary character c, \c takes on a special meaning. For example, n stands for the character n, but \n stands for a newline.
When \ is followed by a special character c, \c stands for the ordinary character. For example, \ is normally the escape character, but \\ stands for the ordinary character \.
JavaScript regular expressions use \ in the same way, although the set of special characters may differ from one programming language to another.

2>^: Matches the start of the input string. In a multiline match (the flags include m), it also matches immediately after a newline character.
Example: /^b/ matches the first b in "bab bc"
Example 2: /^b/gm applied to
"badd b
cdaf
b dsfb"
matches the first b in the first line and the first b in the third line.

3>$: Matches the end of the input string. In a multiline match (the flags include m), it also matches immediately before a newline character.
It is the counterpart of ^.
Example: /t$/ matches the t in "bat", but does not match the t in "hate"
Example 2: /t$/gm applied to
"tag at
bat"
matches the last t in the first line and the t in the second line.
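
The effect of the m flag on ^ and $ can be checked with a short sketch that reuses the two sample strings, written with \n for the line breaks. The variable names are only illustrative.

```javascript
const lines = "badd b\ncdaf\nb dsfb";

// Without m, ^ only matches at the very start of the string.
const startNoM = lines.match(/^b/g);
// With m, ^ also matches right after each newline.
const startWithM = lines.match(/^b/gm);

const twoLines = "tag at\nbat";
// Without m, $ only matches at the very end of the string.
const endNoM = twoLines.match(/t$/g);
// With m, $ also matches right before each newline.
const endWithM = twoLines.match(/t$/gm);

console.log(startNoM.length, startWithM.length, endNoM.length, endWithM.length);
// prints: 1 2 1 2
```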

4>*: Matches the preceding character 0 or more times.
Example: /ab*/ matches "abbbb" in "dddabbbbc" and also matches "a" in "ddda"

5>+: Matches the preceding character 1 or more times.
Example: /ab+/ matches "abbbb" in "dddabbbbc", but does not match "ddda"
It is equivalent to {1,} (general form: {n,}), described below.

6>?: This character is special. It matches the preceding character 0 or 1 times, but it also has two other uses:
When it immediately follows *, +, ? or {}, it makes that quantifier match as few times as possible (a non-greedy match). For example:
/ba*/ matches all of "baaaa", but /ba*?/ matches only "b" in "baaaa" (because * means 0 or more times, and the added ? asks for the minimum number of times, that is, 0).
Similarly, /ba+?/ matches "ba" in "baaaa".
It is also used as a syntax marker in lookahead assertions, that is, the x(?=y) and x(?!y) forms described later.
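
A minimal sketch of greedy versus non-greedy matching on one sample string follows. The `{2,4}?` case is an extra illustration that does not come from the text above.

```javascript
const sample = "baaaa";

const greedyStar = /ba*/.exec(sample)[0];      // takes as many a's as possible
const lazyStar = /ba*?/.exec(sample)[0];       // takes as few a's as possible (zero)
const lazyPlus = /ba+?/.exec(sample)[0];       // at least one a, but no more
const lazyRange = /ba{2,4}?/.exec(sample)[0];  // at least two a's, but no more

console.log(greedyStar, lazyStar, lazyPlus, lazyRange);
// prints: baaaa b ba baa
```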

7>.: The "." (dot) character matches any single character except line terminators.
For example: /a.b/ matches "acb" in "acbaa", but does not match "ab", because "." must consume exactly one character between a and b.

8>(x): Matches x (x here stands for any sub-pattern, not specifically the character x) and remembers the match. In the syntax these () are called "capturing parentheses".
The match is remembered because some of the functions that work with expressions, such as exec(), return an array that holds all the matched strings.
Note also that what is remembered is the text that x matched.
Example 1:
var regx = /a(b)c/;
var rs = regx.exec("abcddd");
Here /a(b)c/ matches "abc" in "abcddd", and because b is also remembered, the array returned in rs is:
{abc,b}
Example 2:
var regx = /a(b)c/;
var rs = regx.exec("acbcddd");
rs is null because /a(b)c/ does not match "acbcddd", so the b in () is not remembered (even though the string contains b).

9>(?:x): Matches x but does not remember it. In this form the () are called "non-capturing parentheses".
Example:
var regx = /a(?:b)c/;
var rs = regx.exec("abcddd");
Here /a(?:b)c/ matches "abc" in "abcddd", but because of (?:) the b is not remembered, so the array returned in rs is:
{abc}
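
The difference between the two kinds of parentheses shows up in the array that exec returns. The sketch below copies the results into plain arrays with Array.from so they are easy to compare; the variable names are illustrative.

```javascript
// Capturing parentheses: the text matched by (b) is stored at index 1.
const captured = Array.from(/a(b)c/.exec("abcddd"));

// Non-capturing parentheses: only the whole match is stored.
const uncaptured = Array.from(/a(?:b)c/.exec("abcddd"));

// No match at all: exec returns null, so nothing is remembered.
const noMatch = /a(b)c/.exec("acbcddd");

console.log(captured, uncaptured, noMatch);
```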

10>x(?=y): Matches x only if it is followed by y. When the match succeeds, only x is remembered; y is not.
Example:
var regx = /user(?=name)/;
var rs = regx.exec("The username is Mary");
Result: the match succeeds, and rs is {user}

11>x(?!y): Matches x only if it is not followed by y. When the match succeeds, only x is remembered; y is not.
Example:
var regx = /user(?!name)/;
var rs = regx.exec("The user name is Mary");
Result: the match succeeds, and rs is {user}
Example 2:
var regx = /\d+(?!\.)/;
var rs = regx.exec("54.235");
Result: the match succeeds, and rs is {5}. 54 is not matched because 54 is followed by "."; 235 also satisfies the expression, but because of the way exec behaves, 235 is not returned.
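
A minimal sketch of both lookahead forms, using the sample strings from the examples, is given below. The last line is an added counter-example showing a negative lookahead that blocks the match.

```javascript
// Positive lookahead: "user" must be followed by "name", which is not consumed.
const positive = /user(?=name)/.exec("The username is Mary");

// Negative lookahead: "user" must not be followed by "name".
const negative = /user(?!name)/.exec("The user name is Mary");

// Digits not followed by a dot: "54" fails, so the engine backs off to "5".
const digits = /\d+(?!\.)/.exec("54.235");

// Added counter-example: here "user" is followed by "name", so there is no match.
const blocked = /user(?!name)/.exec("The username is Mary");

console.log(positive[0], positive.index, negative[0], digits[0], blocked);
// prints: user 4 user 5 null
```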

12>x|y: Matches x or y. Note that if the string contains matches for both x and y, exec only returns the one that occurs first.
Example:
var regx = /beijing|shanghai/;
var rs = regx.exec("I love beijing and shanghai");
Result: the match succeeds, and rs is {beijing}. shanghai also matches, but it is not returned.

13>{n}: Matches exactly n occurrences of the preceding character.
n must be a non-negative integer. A negative or fractional value does not raise a syntax error, but it is then not treated as a repetition count either.
Example:
var regx = /ab{2}c/;
var rs = regx.exec("abbcd");
Result: the match succeeds, and rs is {abbc}.

14>{n,}: Matches at least n occurrences of the preceding character.
Example:
var regx = /ab{2,}c/;
var rs = regx.exec("abbcdabbbc");
Result: the match succeeds, and rs is {abbc}. abbbc also satisfies the expression; the reason it is not returned has to do with the behavior of the exec method, which is explained later.

15>{n,m}: Matches at least n and at most m occurrences of the preceding character.
As long as n and m are numbers and m >= n, no syntax error is reported.
Example:
var regx = /ab{2,5}c/;
var rs = regx.exec("abbbcd");
Result: the match succeeds, and rs is {abbbc}.
Example 2:
var regx = /ab{2,2}c/;
var rs = regx.exec("abbcd");
Result: the match succeeds, and rs is {abbc}.
Example 3:
var regx = /ab{2,5}/;
var rs = regx.exec("abbbbbbbbbb");
Result: the match succeeds, and rs is {abbbbb}. This shows that if the preceding character occurs more than m times, only m occurrences are matched. On the other hand:
var regx = /ab{2,5}c/;
var rs = regx.exec("abbbbbbbbbbc");
Result: the match fails, and rs is null. There are more than 5 b's, so b{2,5} can match at most the first 5 of them. The expression /ab{2,5}c/ then requires a c, but the character after those 5 b's is still a b, so the match fails.
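
The three counted quantifiers can be exercised together in a short sketch that reuses the sample strings from the examples. The variable names are illustrative.

```javascript
const exact = /ab{2}c/.exec("abbcd")[0];           // exactly two b's
const atLeast = /ab{2,}c/.exec("abbcdabbbc")[0];   // two or more b's, first hit wins
const capped = /ab{2,5}/.exec("abbbbbbbbbb")[0];   // stops after five b's
const tooMany = /ab{2,5}c/.exec("abbbbbbbbbbc");   // ten b's before c: no match

console.log(exact, atLeast, capped, tooMany);
// prints: abbc abbc abbbbb null
```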

16>[xyz]: xyz stands for a set of characters; the pattern matches any one character listed inside []. The form [xyz] is equivalent to [x-z].
Example:
var regx = /a[bc]d/;
var rs = regx.exec("abddgg");
Result: the match succeeds, and rs is {abd}
Example 2:
var regx = /a[bc]d/;
var rs = regx.exec("abcd");
Result: the match fails, and rs is null. It fails because [bc] matches one character, either b or c, but not both in a row.

17>[^xyz]: Matches any one character that is not listed inside []. The form [^xyz] is equivalent to [^x-z].
Example:
var regx = /a[^bc]d/;
var rs = regx.exec("afddgg");
Result: the match succeeds, and rs is {afd}
Example 2:
var regx = /a[^bc]d/;
var rs = regx.exec("abd");
Result: the match fails, and rs is null.
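
Character sets and negated sets can be compared in a minimal sketch that uses the sample strings from the examples. The last two lines are an added check that the list form and the range form accept the same input.

```javascript
const inSet = /a[bc]d/.exec("abddgg")[0];     // b is in the set
const twoChars = /a[bc]d/.exec("abcd");       // the set matches one character only
const notInSet = /a[^bc]d/.exec("afddgg")[0]; // f is outside the set
const excluded = /a[^bc]d/.exec("abd");       // b is excluded

// Added check: a list and the equivalent range behave the same.
const listForm = /a[xyz]d/.test("ayd");
const rangeForm = /a[x-z]d/.test("ayd");

console.log(inSet, twoChars, notInSet, excluded, listForm, rangeForm);
// prints: abd null afd null true true
```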

18>[\b]: Matches a backspace character.

19>\b: Matches a word boundary, that is, the position between a word character and a non-word character such as a space or a newline (or the start or end of the string). It matches a position rather than a character, and it does not require the m flag.
Example:
var regx = /\bc./;
var rs = regx.exec("Beijing is a beautiful city");
Result: the match succeeds, and rs is {ci}. Note that the space in front of c is not part of the result; a result that included the space would be incorrect.

20>\B: Matches a position that is not a word boundary.
Example:
var regx = /\Bi./;
var rs = regx.exec("Beijing is a beautiful city");
Result: the match succeeds, and rs is {ij}, which is the ij in Beijing.
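
Because \b and \B match positions rather than characters, the index of the match is as informative as its text. The sketch below reuses the sample sentence; the variable names are illustrative.

```javascript
const sentence = "Beijing is a beautiful city";

// \b: "c" must start a word. The preceding space is not part of the match.
const atBoundary = /\bc./.exec(sentence);

// \B: "i" must be inside a word, so the i of "is" is skipped in favor of "Beijing".
const insideWord = /\Bi./.exec(sentence);

console.log(atBoundary[0], atBoundary.index, insideWord[0], insideWord.index);
// prints: ci 23 ij 2
```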

21>\cX: Matches a control character. For example, \cM matches Control-M, the carriage return character. X must be one of A-Z or a-z; otherwise c is treated as a literal 'c' character. (A practical example still needs to be added.)

21>\d: Matches a digit character, equivalent to [0-9].
Example:
var regx = /user\d/;
var rs = regx.exec("user1");
Result: the match succeeds, and rs is {user1}

22>\D: Matches a non-digit character, equivalent to [^0-9].
Example:
var regx = /user\D/;
var rs = regx.exec("userA");
Result: the match succeeds, and rs is {userA}

23>\f: Matches a form feed character.

24>\n: Matches a newline (line feed) character. The m flag is not required for this, because m only affects ^ and $; it is harmless in the example below.
Example:
var regx = /a\nbc/m;
var str = "a\nbc";
var rs = regx.exec(str);
Result: the match succeeds, and rs is {a\nbc}. If the expression is /a\n\rbc/ there is no match, because a line break is never stored as "line feed + carriage return". Text entered with the Enter key contains either "\n" or "\r\n" ("carriage return + line feed"), depending on the environment.

25>\r: Matches a carriage return character.

26>\s: Matches a whitespace character, equivalent to [ \f\n\r\t\v\u00a0\u2028\u2029].
Example:
var regx = /\si/;
var rs = regx.exec("Beijing is a city");
Result: the match succeeds, and rs is { i} (a space followed by i)

27>\S: Matches a non-whitespace character, equivalent to [^ \f\n\r\t\v\u00a0\u2028\u2029].
Example:
var regx = /\Si/;
var rs = regx.exec("Beijing is a city");
Result: the match succeeds, and rs is {ei}

28>\t: Matches a tab character.
Example:
var regx = /a\tb/;
var rs = regx.exec("a\tbc");
Result: the match succeeds, and rs is {a\tb}

29>\v: Matches a vertical tab character.

30>\w: Matches a digit, an underscore or a letter, that is, [a-zA-Z0-9_].
Example:
var regx = /\w/;
var rs = regx.exec("$25.23");
Result: the match succeeds, and rs is {2}

31>\W: Matches a character that is not a digit, an underscore or a letter, that is, [^a-zA-Z0-9_].
Example:
var regx = /\W/;
var rs = regx.exec("$25.23");
Result: the match succeeds, and rs is {$}
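
The character-class escapes above can be tried out in one short sketch. The strings come from the examples; the \d+ line is an added illustration.

```javascript
const price = "$25.23";
const firstWordChar = /\w/.exec(price)[0];    // first letter, digit or underscore
const firstNonWord = /\W/.exec(price)[0];     // first character outside that set
const firstNumber = /\d+/.exec(price)[0];     // added: first run of digits

const sentence = "Beijing is a city";
const spaceThenI = /\si/.exec(sentence)[0];   // whitespace followed by i
const nonSpaceThenI = /\Si/.exec(sentence)[0]; // non-whitespace followed by i

const tabbed = /a\tb/.exec("a\tbc")[0];       // a, a tab character, then b

console.log(JSON.stringify([firstWordChar, firstNonWord, firstNumber,
                            spaceThenI, nonSpaceThenI, tabbed]));
// prints: ["2","$","25"," i","ei","a\tb"]
```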

32>\n (a backslash followed by a number, not the newline escape \n): n is a positive integer, and the pattern matches the same text that the nth pair of capturing parentheses matched.
Example:
var regx = /user([,-])group\1role/;
var rs = regx.exec("user-group-role");
Result: the match succeeds, and rs is {user-group-role,-}. Matching "user,group,role" succeeds in the same way, but a string such as "user-group,role" does not match.
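
A minimal sketch of the backreference example: \1 must repeat whatever separator the first group captured, so mixed separators are rejected. The variable names are illustrative.

```javascript
const sameSeparator = /user([,-])group\1role/;

// The group captures "-", so \1 also has to be "-".
const dashed = Array.from(sameSeparator.exec("user-group-role"));

const commas = sameSeparator.test("user,group,role"); // both separators are ","
const mixed = sameSeparator.test("user-group,role");  // "-" then ",": rejected

console.log(dashed, commas, mixed);
```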

33>\0: Matches a NUL character.

34>\xhh: Matches the character whose code is given by two hexadecimal digits.

35>\uhhhh: Matches the character whose code is given by four hexadecimal digits.


3. Expression operations

1) Expression operations here means the methods associated with expressions. Six methods are introduced.
2) Methods of the expression object (RegExp):

1>exec(str): Returns the first substring of str that matches the expression, in the form of an array. If the expression contains capturing parentheses, the returned array also contains the strings matched inside (). For example:
var regx = /\d+/;
var rs = regx.exec("3432ddf53");
The value of rs is: {3432}
var regx2 = new RegExp("ab(\\d+)c");
var rs2 = regx2.exec("ab234c44");
The value of rs2 is: {ab234c,234}
In addition, if the string contains several matches, the first call to exec returns the first match, and further calls return the second match, the third match, and so on. For example:
var regx = /user\d/g;
var rs = regx.exec("ddduser1dsfuser2dd");
var rs1 = regx.exec("ddduser1dsfuser2dd");
The value of rs is {user1} and the value of rs1 is {user2}. Note that the g flag in regx is required; without it, every call to exec returns the first match, however many times it is called. This behavior is explained further in the summary.
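
Repeated calls to exec on a global expression are usually written as a loop that stops when exec returns null. A minimal sketch, with illustrative variable names:

```javascript
const userPattern = /user\d/g;
const input = "ddduser1dsfuser2dd";

const found = [];
let match;
// Each call resumes where the previous match ended; null ends the loop.
while ((match = userPattern.exec(input)) !== null) {
  found.push(match[0] + "@" + match.index);
}

console.log(found); // prints: [ 'user1@3', 'user2@11' ]
```

The loop relies on the g flag. Without it, exec would return the first match forever and the loop would never end.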

2>test(str): Determines whether the string str contains a match for the expression and returns a Boolean value. For example:
var regx = /user\d+/g;
var flag = regx.test("user12dd");
The value of flag is true.

3) Methods of the String object:

1>match(expr): Returns an array of the strings that match expr. Without the g flag it returns only the first match; with the g flag it returns all matches.
Example:
var regx = /user\d/g;
var str = "user13userddduser345";
var rs = str.match(regx);
The value of rs is: {user1,user3}

2>search(expr): Returns the index of the first match for expr in the string.
Example:
var regx = /user\d/g;
var str = "user13userddduser345";
var rs = str.search(regx);
The value of rs is: 0

3>replace(expr, str): Replaces the parts of the string that match expr with str. In addition, str can contain the variable symbol $ in the form $n, which stands for the nth remembered match (recall that capturing parentheses are what remember matches).
Example:
var regx = /user\d/g;
var str = "user13userddduser345";
var rs = str.replace(regx, "00");
The value of rs is: 003userddd0045
Example 2:
var regx = /u(se)r\d/g;
var str = "user13userddduser345";
var rs = str.replace(regx, "$1");
The value of rs is: se3userdddse45
A special note on replace(expr, str): if expr is an expression object, all matches are replaced provided the expression has the g flag (otherwise only the first match is replaced). If expr is a string, only the first matching part is replaced. For example:
var regx = "user";
var str = "user13userddduser345";
var rs = str.replace(regx, "00");
The value of rs is: 0013userddduser345
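
The replace variants can be compared on the same input in a short sketch. The variable names are illustrative, and the `firstOnly` line is an added case for an expression without g.

```javascript
const source = "user13userddduser345";

const allReplaced = source.replace(/user\d/g, "00");   // every match
const firstOnly = source.replace(/user\d/, "00");      // added: no g, first match only
const withGroup = source.replace(/u(se)r\d/g, "$1");   // $1 is the text captured by (se)
const plainString = source.replace("user", "00");      // string pattern: first occurrence only

console.log(allReplaced, firstOnly, withGroup, plainString);
```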

4>split(expr): Splits the string at the parts that match expr and returns an array. Whether the expression has the g flag makes no difference; the result is the same.
Example:
var regx = /user\d/g;
var str = "user13userddduser345";
var rs = str.split(regx);
The value of rs is: {"",3userddd,45} (the leading empty string is the text before the first separator; some older browsers dropped it)
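
As a sketch of match, search and split on one string: note that in current engines split keeps the empty string in front of a separator that sits at the very start of the input. The variable names are illustrative.

```javascript
const data = "user13userddduser345";

const allMatches = data.match(/user\d/g);          // every match, as plain strings
const firstMatch = data.match(/user\d/);           // first match, with index and input
const position = data.search(/user\d/);            // index of the first match
const pieces = data.split(/user\d/);               // text between the matches
const piecesGlobal = data.split(/user\d/g);        // g makes no difference for split

console.log(allMatches, firstMatch[0], position, pieces);
```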

4. Expression-related properties

1) Expression-related properties are the properties associated with an expression, as in the following form:
var regx = /myexpr/;
var rs = regx.exec(str);
Two of the properties belong to the expression object regx itself, and three belong to the match result rs. They are described below.
2) The two properties of the expression itself:

1>lastIndex: Returns the position at which the next match will start. Note that lastIndex only advances for a global match (the expression has the g flag). Without g, standards-compliant engines leave it at 0; some older browsers, such as early versions of Internet Explorer, reported the end of the first match instead. For example:
var regx = /user\d/;
var rs = regx.exec("sdsfuser1dfsfuser2");
var lastIndex1 = regx.lastIndex;
rs = regx.exec("sdsfuser1dfsfuser2");
var lastIndex2 = regx.lastIndex;
rs = regx.exec("sdsfuser1dfsfuser2");
var lastIndex3 = regx.lastIndex;
Without g, lastIndex1, lastIndex2 and lastIndex3 are all 0 in a standards-compliant engine (all 9 in the older browsers mentioned above). If regx = /user\d/g, the first is 9, the second is 18, and the third is 0.
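
The way lastIndex moves can be observed directly. The sketch below records it after each of three exec calls on a global expression, then repeats the calls without g. The values in the comment are what a standards-compliant engine such as Node.js reports.

```javascript
const target = "sdsfuser1dfsfuser2";

const globalPattern = /user\d/g;
const positions = [];
for (let i = 0; i < 3; i++) {
  globalPattern.exec(target);
  positions.push(globalPattern.lastIndex); // end of the match, or 0 after a failed search
}

const plainPattern = /user\d/;
plainPattern.exec(target);
plainPattern.exec(target);
const plainLastIndex = plainPattern.lastIndex; // not updated without g

console.log(positions, plainLastIndex); // prints: [ 9, 18, 0 ] 0
```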

2>source: Returns the text of the expression itself. For example:
var regx = /user\d/;
var rs = regx.exec("sdsfuser1dfsfuser2");
var source = regx.source;
The value of source is user\d
3) The three properties of the match result:

1>index: Returns the position of the current match. For example:
var regx = /user\d/;
var rs = regx.exec("sdsfuser1dfsfuser2");
var index1 = rs.index;
rs = regx.exec("sdsfuser1dfsfuser2");
var index2 = rs.index;
rs = regx.exec("sdsfuser1dfsfuser2");
var index3 = rs.index;
index1, index2 and index3 are all 4. If the g flag is added, index1 is 4 and index2 is 13; the third exec returns null, so reading rs.index raises an error (rs is null, not an object).

2>input: The string that was matched against. For example:
var regx = /user\d/;
var rs = regx.exec("sdsfuser1dfsfuser2");
var input = rs.input;
The value of input is sdsfuser1dfsfuser2.

3>[0]: Returns the full text of the match. The result array may hold more than one value: besides [0] there can be [1], [2], and so on, which hold the text remembered by capturing parentheses. For example:
var regx = /user\d/g;
var rs = regx.exec("sdsfuser1dfsfuser2");
var value1 = rs[0];
rs = regx.exec("sdsfuser1dfsfuser2");
var value2 = rs[0];
value1 is user1 and value2 is user2 (the g flag is needed for the second call to move on to user2).
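
The three result properties can be read from the same exec result. In the sketch below the capturing group (\d) is an addition for illustration, so that index [1] has something to show, and the null check guards against the error described for index.

```javascript
const subject = "sdsfuser1dfsfuser2";
const numbered = /user(\d)/g; // (\d) added here so that [1] holds the digit

const summaries = [];
let result;
while ((result = numbered.exec(subject)) !== null) {
  // result[0]: whole match, result[1]: first group, plus index and input.
  summaries.push([result[0], result[1], result.index, result.input === subject]);
}

console.log(summaries);
// prints: [ [ 'user1', '1', 4, true ], [ 'user2', '2', 13, true ] ]
```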

5. Practical applications

1) Practical application one
Description: A form contains a "username" input field.
Requirement: The value must consist of Chinese characters only, at least 2 and at most 4 of them.
Implementation:
<script>
// Validate the username field: 2 to 4 Chinese characters only.
function checkForm(obj) {
  var username = obj.username.value;
  // No g flag here: test() on a global expression keeps state in lastIndex.
  var regx = /^[\u4e00-\u9fa5]{2,4}$/;
  if (!regx.test(username)) {
    alert("Invalid username!");
    return false;
  }
  return true;
}
</script>
<form name="MyForm" onsubmit="return checkForm(this)">
<input type="text" name="username"/>
<input type="submit" value="Submit"/>
</form>
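
The validation rule can also be tried outside the browser. The sketch below pulls the check into a stand-alone function; the name `isValidUsername` is hypothetical and not part of the form code. The sample names are written with \u escapes so that the file encoding does not matter.

```javascript
// Returns true when the name is 2 to 4 characters from the range \u4e00-\u9fa5.
function isValidUsername(name) {
  return /^[\u4e00-\u9fa5]{2,4}$/.test(name);
}

const twoChars = isValidUsername("\u5f20\u4e09");                          // 2 characters
const oneChar = isValidUsername("\u5f20");                                 // too short
const fiveChars = isValidUsername("\u5f20\u4e09\u674e\u56db\u738b");       // too long
const latin = isValidUsername("abc");                                      // wrong character set

console.log(twoChars, oneChar, fiveChars, latin);
// prints: true false false false
```

The range \u4e00-\u9fa5 covers the common CJK ideographs. Rarer characters outside that block would be rejected by this rule.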
2) Practical application two
Description: Given a string that contains HTML tags, remove the HTML tags.
Implementation:
<script>
// Remove every HTML tag (opening and closing) from the string.
function toPlainText(htmlStr) {
  var regx = /<[^>]*>|<\/[^>]*>/gm;
  var str = htmlStr.replace(regx, "");
  return str;
}
</script>
<form name="MyForm">
<textarea id="htmlInput"></textarea>
<input type="button" value="Submit" onclick="toPlainText(document.getElementById('htmlInput').value)"/>
</form>
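
The tag-stripping expression can be checked without a page. The sketch below runs the original alternation and a shorter form on a made-up sample string. The `stripTags` name is hypothetical; the shorter pattern works because <[^>]*> already covers closing tags.

```javascript
// Hypothetical stand-alone version: remove anything that looks like <...>.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

const sampleHtml = "<p>Hello <b>world</b></p>";
const stripped = stripTags(sampleHtml);

// The original expression with both alternatives gives the same result.
const strippedOriginal = sampleHtml.replace(/<[^>]*>|<\/[^>]*>/gm, "");

console.log(stripped, stripped === strippedOriginal); // prints: Hello world true
```

This is only a text clean-up sketch. It is not a safe way to sanitize untrusted HTML.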

III. Summary

1. In my view, not many ordinary programmers use JavaScript regular expressions, because the pages we deal with are usually not very complex, and complex logic is generally handled on the back end. But that trend is reversing: rich clients are being accepted by more and more people, and JavaScript is their key technology. For complex client-side logic regular expressions play a critical role, and they are one of the important techniques that a JavaScript expert must master.

2. To give you a more complete and deeper understanding of the material above, I summarize below some of the key points and the places that are easy to confuse. This part is very important.
Summary 1: Usage of the additional parameter g
Adding the g flag to an expression means that a global match can be made; note the word "can". In detail:
1) For the exec method of the expression object: without g, only the first match is returned, however many times exec is called. With g, the first call returns the first match, the next call returns the second match, and so on. For example:
var regx = /user\d/;
var str = "user18dsdfuser2dsfsd";
var rs = regx.exec(str); // rs is {user1}
var rs2 = regx.exec(str); // rs2 is still {user1}
If regx = /user\d/g, then rs is {user1} and rs2 is {user2}.
This example shows that for exec, adding g does not make a single call return all matches. It means that all matches can be obtained in some way, and for exec that "way" is to call the method repeatedly.
2) For the test method of the expression object, a single call gives the same result with or without g. Be aware, though, that with g the expression remembers lastIndex between calls, so repeated test calls on the same expression object continue from the end of the previous match.
3) For the match method of a String object: without g, match always returns just the first match; with g, it returns all matches at once. (This differs from the exec method of the expression object, which does not return all matches at once even when g is added.) For example:
var regx = /user\d/;
var str = "user1sdfsffuser2dfsdf";
var rs = str.match(regx); // rs is {user1}
var rs2 = str.match(regx); // rs2 is still {user1}
If regx = /user\d/g, then rs is {user1,user2} and rs2 is also {user1,user2}.
4) For the replace method of a String object: without g, only the first match is replaced; with g, all matches are replaced. (The replace examples given earlier illustrate this well.)
5) For the split method of a String object, the result is the same with or without g, namely:
var sep = /user\d/;
var array = "user1dfsfuser2dfsf".split(sep);
The value of array is {"",dfsf,dfsf} (the leading empty string is the text before the first separator; some older browsers dropped it).
With sep = /user\d/g, the return value is the same.
6) For the search method of a String object, the result is the same with or without g.
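
One place where g does change what test returns is when the same expression object is reused, because the expression keeps its position in lastIndex. A minimal sketch, with illustrative variable names:

```javascript
const statefulPattern = /user\d/g;
const firstCall = statefulPattern.test("user1");   // matches at 0, lastIndex becomes 5
const indexAfterFirst = statefulPattern.lastIndex;
const secondCall = statefulPattern.test("user1");  // search starts at 5: nothing left

const statelessPattern = /user\d/;
const plainFirst = statelessPattern.test("user1");
const plainSecond = statelessPattern.test("user1"); // always starts from 0

// search ignores both the g flag and lastIndex.
const searchPosition = "xxuser1".search(/user\d/g);

console.log(firstCall, indexAfterFirst, secondCall, plainFirst, plainSecond, searchPosition);
// prints: true 5 false true true 2
```

For a simple yes-or-no check it is usually safer to leave g off the expression.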
Summary 2: Usage of the additional parameter m
The m flag means that a multiline match can be made, but it only has an effect on the ^ and $ patterns. Other patterns match across multiple lines with or without m (a multiline string is, after all, just an ordinary string). Some examples:
1) An example using ^
var regx = /^b./g;
var str = "bd76 dfsdf\nsdfsdfs dffs\nb76dsf sdfsdf";
var rs = str.match(regx);
Without m, only the first match {bd} is returned, whether or not g is present. If regx = /^b./gm, all matches {bd,b7} are returned. Note that if regx = /^b./m, only the first match is returned. So m means that matching can span lines, g means that matching can be global, and the two combined give a global multiline match.
2) An example using another pattern:
var regx = /user\d/;
var str = "sdfsfsdfsdf\nsdfsuser3 dffs\nb76dsf user6";
var rs = str.match(regx);
Without g this returns {user3}; with g it returns {user3,user6}. Adding or omitting m makes no difference.
3) So be clear about what m does: it only affects ^ and $. For those two patterns, without m a match is only possible at the start (for ^) or the end (for $) of the whole string; with m it is possible at the start or end of every line. One more example with ^:
var regx = /^b./;
var str = "ret76 dfsdf\nbjfsdfs dffs\nb76dsf sdfsdf";
var rs = str.match(regx);
Here rs is null, and it is still null if g is added. If m is added, rs is {bj} (no match is found in the first line, but because of m the search continues on the following lines). If both m and g are added, the result is {bj,b7}. (With m alone, matching can span lines but stops at the first match; adding g returns all the matches across the lines. This is how match behaves; with exec, several calls are needed to get them all.)
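
The four flag combinations from the last example can be run side by side. The sketch writes the three-line string with \n escapes; the variable names are illustrative.

```javascript
const threeLines = "ret76 dfsdf\nbjfsdfs dffs\nb76dsf sdfsdf";

const noFlags = threeLines.match(/^b./);     // the string does not start with b
const onlyG = threeLines.match(/^b./g);      // g alone does not help
const onlyM = threeLines.match(/^b./m);      // first line that starts with b
const bothGM = threeLines.match(/^b./gm);    // every line that starts with b

console.log(noFlags, onlyG, onlyM[0], bothGM);
// prints: null null bj [ 'bj', 'b7' ]
```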
Summary 3: In an HTML textarea input field, pressing the Enter key was stored as the control characters "\r\n" ("carriage return + line feed") in the browsers this article was written for, never as "\n\r" ("line feed + carriage return"). Current browsers report a single "\n" in the textarea's value, so it is safest to allow for both. Here is the earlier example again:
var regx = /a\r\nbc/;
var str = "a\r\nbc";
var rs = regx.exec(str);
Result: the match succeeds, and rs is {a\r\nbc}. If the expression is /a\n\rbc/ there is no match, so an Enter key press stands for "carriage return + line feed" and not "line feed + carriage return", at least in a textarea field.
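
Since the line break may arrive as "\r\n" or as "\n" depending on the environment, a pattern can allow for both with an optional \r. A minimal sketch, with illustrative variable names:

```javascript
const windowsStyle = "a\r\nbc"; // carriage return + line feed
const unixStyle = "a\nbc";      // line feed only

const crlfOnly = /a\r\nbc/;
const wrongOrder = /a\n\rbc/;
const eitherBreak = /a\r?\nbc/; // \r is optional, \n is required

console.log(
  crlfOnly.test(windowsStyle),    // true
  crlfOnly.test(unixStyle),       // false
  wrongOrder.test(windowsStyle),  // false
  eitherBreak.test(windowsStyle), // true
  eitherBreak.test(unixStyle)     // true
);
```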
