JavaScript Advanced Programming (Third Edition) Study Notes (1): Regular Expressions

Source: Internet
Author: User

1. Create a regular expression
Method 1: use a regular expression literal. Note that the literal must not be wrapped in single or double quotation marks, as shown below:
var pattern1 = /[abc]/i; // matches the first "a", "b", or "c", case-insensitive
Method 2: use the RegExp constructor. The constructor takes two parameters, both strings. Because the pattern is a string, pay special attention to the "\" character: every escaped metacharacter (metacharacters are described below) needs a double backslash, as shown below:
var patt1 = new RegExp("[abc]", "gi"); // equivalent to var patt1 = /[abc]/gi;
alert("[abc]".match(patt1)); // returns a,b,c
var patt2 = new RegExp("\\[abc\\]", "gi"); // equivalent to var patt2 = /\[abc\]/gi; inside the quotation marks, "\" must be written as "\\"
alert("[abc]".match(patt2)); // returns [abc]

Two questions are worth discussing:
A. If the regular expression is assembled dynamically from a string and a variable, how can it be created with the first method?
With the second method there is clearly no problem, because the first parameter is already a string. With the first method (literal syntax), the eval() function is needed, as shown below:
var str = "abc"; // this may be a dynamic variable
var patt1 = eval("/\\[" + str + "\\]/"); // equivalent to var patt1 = /\[abc\]/;
alert("[abc]".match(patt1)); // returns [abc]
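
A dynamic pattern can also be built without eval() by handing the assembled string to the RegExp constructor. The following is only a sketch: escapeRegExp is a hypothetical helper written for this illustration (it is not a built-in function and does not appear in the notes), and console.log stands in for alert so that the code also runs outside a browser.

```javascript
// Hypothetical helper: put a backslash in front of every metacharacter
// so that the text is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var str = "abc"; // may be a dynamic value

// The brackets are escaped by the helper, so this is equivalent to /\[abc\]/.
var patt = new RegExp(escapeRegExp("[" + str + "]"));

console.log(patt.source);          // \[abc\]
console.log(patt.test("x[abc]y")); // true
console.log(patt.test("a"));       // false
```

Escaping matters mainly when the dynamic part comes from user input; if the variable is meant to contain pattern syntax, it can be passed to the constructor unescaped.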

B. What are the differences between the two creation methods?
According to JavaScript Advanced Programming (Third Edition), the difference is that a literal always shares a single RegExp instance, whereas the constructor creates a new instance on every call. My own tests gave a somewhat different result: both creation methods returned true every time (you can test this yourself). This agrees with ECMAScript 5, under which each evaluation of a literal also creates a new instance, so in current browsers the two methods do not differ in this respect.
var re = null, i;
for (i = 0; i < 3; i++) {
    re = /cat/g; // book: instance properties are not reset, so the results should be true, false, true
    alert(re.test("catasdfdfdf"));
}
for (i = 0; i < 3; i++) {
    re = new RegExp("cat", "g"); // book: instance properties are reset, so the results are all true
    alert(re.test("catasdfdfdf"));
}
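
The comparison above can be checked more directly by recording whether the literal ever yields the same object twice. This is a small sketch under the assumption that the engine follows ECMAScript 5 or later; console.log replaces alert so that it also runs outside a browser.

```javascript
var results = [];
var previous = null;
var sameObject = false;
var i, re;

for (i = 0; i < 3; i++) {
    re = /cat/g; // the literal is evaluated again on every iteration
    if (previous !== null && previous === re) {
        sameObject = true; // would indicate a shared instance
    }
    previous = re;
    results.push(re.test("catasdfdfdf"));
}

console.log(results.join(",")); // true,true,true in an ES5+ engine
console.log(sameObject);        // false in an ES5+ engine
```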

While we are here, the matching pattern supports three flags: g, i, and m.
g: global mode. The pattern is applied to the whole string instead of stopping at the first match. After each match, the pattern's lastIndex moves to the position just past the match, and the next application of the pattern starts searching from lastIndex. When no further match is found, lastIndex is reset to 0.
i: case-insensitive mode.
m: multiline mode. Matching continues past the end of a line of text; in practice, ^ and $ then also match at the start and end of each line rather than only at the start and end of the whole string.
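
The effect of each flag can be seen in a short sketch. The sample string is made up for the illustration, and console.log stands in for alert.

```javascript
var text = "Cat\ncat";

// i: case-insensitive
console.log(/cat/.test("CAT"));   // false
console.log(/cat/i.test("CAT"));  // true

// m: ^ and $ also match at the start and end of each line
console.log(/^cat$/.test(text));  // false
console.log(/^cat$/m.test(text)); // true

// g: lastIndex advances after every successful match
var re = /cat/gi;
var positions = [];
while (re.test(text)) {
    positions.push(re.lastIndex);
}
console.log(positions.join(",")); // 3,7
console.log(re.lastIndex);        // 0, reset after the search that failed
```
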
2. Metacharacters: ( [ { \ ^ $ | ) ? * + . ] }
These characters have one or more special meanings in regular expressions, so to match one of them literally you must escape it. For example:
var pattern1 = /\[abc\]/i; // matches the first "[abc]", case-insensitive
var pattern2 = /[abc]/i; // matches the first "a", "b", or "c", case-insensitive

3. RegExp instance properties
The global, ignoreCase, multiline, lastIndex, and source properties are rarely needed in practice, although lastIndex can help with debugging. A simple example (dw() is the author's output helper, presumably a wrapper around document.write):
var patt1 = /cat/g;
patt1.test("catasdfdfdf");
dw(patt1.global); // whether g (global mode) is set // true
dw(patt1.ignoreCase); // whether i (case-insensitive) is set // false
dw(patt1.multiline); // whether m (multiline mode) is set // false
dw(patt1.lastIndex); // position at which the next search starts; initially 0 // 3
dw(patt1.source); // the pattern text of the regular expression // cat

4. Ranges and sets: [] ^ |
[abc] matches any one of the characters a, b, or c
[a-z], [A-Z], [0-9] match a lowercase letter, an uppercase letter, and a digit from 0 to 9
[^a-z], [^A-Z], [^0-9] match a character that is not a lowercase letter, not an uppercase letter, and not a digit
abc|def matches either "abc" or "def". Note that inside brackets | is an ordinary character: [abc|def] matches any single one of a, b, c, d, e, f, or |
alert(/[abc]/.test("a")); // true
alert(/[abc]/.test("gg")); // false
alert(/[^abc]/.test("a")); // false
alert(/[^abc]/.test("gg")); // true
alert(/[a-z]/.test("A")); // false
alert(/[A-Z]/.test("A")); // true
alert(/[abc|def]/.test("def")); // true, because "d" is in the set
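
The difference between a bracketed set and alternation is easy to miss, so here is a short sketch that anchors both patterns to the whole string. The variable names are chosen for the illustration only.

```javascript
// Inside brackets, | is just one more character in the set.
var charClass = /^[abc|def]$/;
// Outside brackets, | separates alternatives.
var alternation = /^(abc|def)$/;

console.log(charClass.test("|"));     // true
console.log(charClass.test("abc"));   // false: a set matches exactly one character
console.log(alternation.test("abc")); // true
console.log(alternation.test("def")); // true
console.log(alternation.test("a"));   // false

// A negated range: remove everything that is not a digit.
var digitsOnly = "a1b2c3".replace(/[^0-9]/g, "");
console.log(digitsOnly);              // 123
```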

5. Quantifiers: ? * + {m} {m,n} {m,}
? 0 or 1 time; placed after another quantifier, it makes the match non-greedy, as explained later
* 0 or more times
+ 1 or more times
{m} exactly m times
{m,n} at least m times, at most n times
{m,} at least m times
alert(/a?/.test("a")); // true
alert(/a?/.test("b")); // true, because 0 occurrences are allowed
alert(/a*/.test("a")); // true
alert(/a*/.test("b")); // true, because 0 occurrences are allowed
alert(/a+/.test("a")); // true
alert(/a+/.test("b")); // false
alert(/a{3}/.test("aaaaa")); // true
alert(/a{3}/.test("bbbbb")); // false. Some articles online say {3} means 0 or 3 times; I tested several browsers, and 0 occurrences does not match.
alert(/a{3,5}/.test("aaaaa")); // true
alert(/a{3,5}/.test("bbbbbb")); // false
alert(/a{3,}/.test("aaaaa")); // true
alert(/a{3,}/.test("bbbbbbbb")); // false
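
Because test() succeeds as soon as the pattern matches anywhere in the string, the exact meaning of each quantifier is clearer when the pattern is anchored with ^ and $. The sketch below does that for strings of 0 to 6 "a" characters; the matching() helper and the sample list are made up for the illustration. It also shows that the braces must not contain spaces, otherwise they are treated as literal text.

```javascript
var samples = ["", "a", "aa", "aaa", "aaaa", "aaaaaa"];

// Return the lengths of the samples that the pattern matches completely.
function matching(pattern) {
    var out = [];
    for (var i = 0; i < samples.length; i++) {
        if (pattern.test(samples[i])) {
            out.push(samples[i].length);
        }
    }
    return out.join(",");
}

console.log(matching(/^a?$/));     // 0,1
console.log(matching(/^a*$/));     // 0,1,2,3,4,6
console.log(matching(/^a+$/));     // 1,2,3,4,6
console.log(matching(/^a{3}$/));   // 3
console.log(matching(/^a{3,5}$/)); // 3,4
console.log(matching(/^a{3,}$/));  // 3,4,6

// With a space inside the braces the quantifier is read as literal characters.
console.log(/a{3, 5}/.test("aaaaa"));   // false
console.log(/a{3, 5}/.test("a{3, 5}")); // true
```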

6. Boundaries: ^ $ \b \B
^ marks the start of the string. Note that it must not come right after a left bracket, as in [^A-Z], where it means negation instead
$ marks the end of the string
\b marks a word boundary: an invisible position where one side is a word character and the other side is a non-word character (punctuation, whitespace, Chinese characters, and so on) or the start or end of the string
\B marks a position that is not a word boundary
alert(/^a$/.test("a")); // true: exactly one "a"
alert(/^a$/.test("aa")); // false: more than one "a"
alert(/\b啊/.test("a啊")); // true. Why does this differ from the next line?
alert(/\b啊/.test("-啊")); // false
alert(/\B啊/.test("a啊")); // false
alert(/\B啊/.test("-啊")); // true

In alert(/\b啊/.test("a啊")), there is a \b between "a" and "啊": its left side is the word character "a" and its right side is the Chinese character "啊", which counts as a non-word character, so the pattern matches and the result is true.
In contrast, in alert(/\b啊/.test("-啊")), the character to the left of "啊" is "-", which is not a word character either, so there is no \b at that position and the result is false.
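
The same boundary rules can be seen with plain English words. The sentence below is made up for this sketch, and console.log stands in for alert.

```javascript
var sentence = "cat concat cat-like category";

// \bcat\b : "cat" as a whole word (a hyphen is a non-word character)
var wholeWords = sentence.match(/\bcat\b/g);
console.log(wholeWords.join("|")); // cat|cat

// \Bcat : "cat" preceded by another word character (inside "concat")
var inside = sentence.match(/\Bcat/g);
console.log(inside.length);        // 1

// \bcat\B : "cat" at the start of a longer word ("category")
var prefix = sentence.match(/\bcat\B/g);
console.log(prefix.length);        // 1

// ^ and $ tie the pattern to the whole string.
console.log(/^cat$/.test("cat"));    // true
console.log(/^cat$/.test("concat")); // false
```
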
7. Predefined classes: \d \D \s \S \w \W .
\d matches a digit from 0 to 9, equivalent to [0-9]
\D matches a character that is not a digit, equivalent to [^0-9]
alert(/\d/.test("1")); // true
alert(/\D/.test("1")); // false

\s matches a whitespace character, roughly equivalent to [ \n\r\f\t\x0B]. Note that the space character is included.
\S matches a non-whitespace character, roughly equivalent to [^ \n\r\f\t\x0B]
alert(/\s/.test(" ")); // true, a space also counts
alert(/\S/.test("\n\r\f\t\x0B")); // false
alert(/\S/.test("\n\r\f\t\x0B\\")); // true, the trailing backslash is not whitespace

\w matches a word character, equivalent to [a-zA-Z0-9_]
\W matches a non-word character, equivalent to [^a-zA-Z0-9_]
alert(/\w/.test("afdas")); // true
alert(/\W/.test("afdas")); // false

. (dot) matches any character except line terminators such as \n and \r, roughly equivalent to [^\n\r]
alert(/./.test("\n\r")); // false, these two characters are exactly the ones it cannot match
alert(/./.test("a")); // true
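
A short sketch of the predefined classes applied to one line of text. The sample line is invented for the illustration, and console.log stands in for alert.

```javascript
var line = "  order 66:\t3 items,  total 42_00 ";

// \d+ : runs of digits (the underscore is not a digit)
var digits = line.match(/\d+/g).join(",");
console.log(digits);    // 66,3,42,00

// \w+ : runs of word characters (letters, digits, underscore)
var words = line.match(/\w+/g).join(",");
console.log(words);     // order,66,3,items,total,42_00

// \s+ : collapse every run of whitespace into a single space
var collapsed = line.replace(/\s+/g, " ");
console.log(collapsed); // " order 66: 3 items, total 42_00 "

// The dot does not match a line break; [\s\S] is a common way to match any character.
console.log(/^.$/.test("\n"));      // false
console.log(/^[\s\S]$/.test("\n")); // true
```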

8. RegExp instance methods exec() and test(), and the String method match()
exec(): returns an array with information about the first match, or null if nothing matches. Usage: pattern.exec(str). Note that exec() returns a single match per call whether or not g is set; with g, however, the search starts from lastIndex.
var re1 = /([a-z]*)bbb/; // greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.exec("abbbaabbb1234") + "<br/>"); // abbbaabbb,abbbaa. The greedy match is abbbaabbb and $1 is abbbaa.
var re1 = /([a-z]*)bbb/g; // greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.exec("abbbaabbb1234") + "<br/>"); // null. Because g is set and the match is greedy, test() above has already consumed abbbaabbb; only 1234 remains, so nothing matches and null is returned.

test(): convenient when you only need to know whether the pattern matches and do not need the matched text. Usage: pattern.test(str);
var re1 = /([a-z]*)bbb/; // greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
var re1 = /([a-z]*)bbb/g; // greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.test("abbbaabbb1234") + "<br/>"); // false. Because g is set, this search starts from lastIndex (after the first match), where only 1234 remains; lastIndex is then reset to 0.
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true

match(): this function is special, because it behaves completely differently depending on whether g is set. Without g, it returns the same result as exec(); with g, it returns an array of all matched strings. Usage: str.match(pattern)
var re1 = /([a-z]*)bbb/; // greedy
document.write("abbbaabbb1234".match(re1) + "<br/>"); // abbbaabbb,abbbaa. Here abbbaabbb is the matched string and abbbaa is the string matched by the first parentheses.
var re1 = /([a-z]*)bbb/g; // greedy
document.write("abbbaabbb1234".match(re1) + "<br/>"); // abbbaabbb. With g set, match() returns all matched strings.

Finally, for exec() and for match() without g, the first element of the returned array is the whole matched string. If the pattern contains parentheses, the second element is the content matched by the first pair of parentheses, the third element is the content matched by the second pair, and so on. For example:
var re1 = /(a(b(c)))d/;
var str = "abcdd";
var matches = str.match(re1);
alert(matches[0]); // abcd // the whole matched string
alert(matches[1]); // abc // the first parentheses
alert(matches[2]); // bc // the second parentheses
alert(matches[3]); // c // the third parentheses
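
When both all matches and their captured groups are needed, exec() can be called in a loop on a pattern that has g set. The following is a sketch of that idiom using the same test string as above; the output format built in found is arbitrary, and console.log stands in for document.write.

```javascript
var text = "abbbaabbb1234";
var re = /([a-z]*?)bbb/g; // non-greedy, global
var found = [];
var m;

// exec() returns one match per call and resumes from re.lastIndex.
while ((m = re.exec(text)) !== null) {
    found.push(m[0] + ":" + m[1] + "@" + m.index);
}

console.log(found.join(" "));          // abbb:a@0 aabbb:aa@4
console.log(text.match(re).join(",")); // abbb,aabbb (whole matches only, no groups)
console.log(re.lastIndex);             // 0
```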

9. Greedy and non-greedy matching
Greedy matching: after a match is found, keep matching as far as possible toward the end of the string and take the longest result. For example, matching /a+/ against the string "aaaaaab" gives "aaaaaa" rather than "a".
Non-greedy matching: stop as soon as a match is found. For example, matching /a+?/ against the string "aaaaaab" gives "a" rather than "aaaaaa". To make a quantifier non-greedy, add "?" after it.
var re1 = /a+/;
var str = "aaaaaaa";
alert(str.match(re1)); // aaaaaaa
var re1 = /a+?/;
var str = "aaaaaaa";
alert(str.match(re1)); // a
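
A common place where the difference shows up is text between delimiters. The sketch below uses a made-up sentence with two quoted words; console.log stands in for alert.

```javascript
var sentence = 'say "one" and "two"';

// Greedy: .* runs to the last quotation mark in the string.
var greedy = sentence.match(/".*"/)[0];
console.log(greedy); // "one" and "two"

// Non-greedy: .*? stops at the first closing quotation mark.
var lazy = sentence.match(/".*?"/)[0];
console.log(lazy);   // "one"

// Non-greedy with g: every quoted word separately.
var all = sentence.match(/".*?"/g).join(",");
console.log(all);    // "one","two"
```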

Here is a comprehensive example showing the differences among greedy and non-greedy matching, the global flag g, exec(), and match():
var re1 = /([a-z]*)bbb/; // greedy
var re2 = /([a-z]*?)bbb/; // non-greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.exec("abbbaabbb1234") + "<br/>"); // abbbaabbb,abbbaa. The greedy match is abbbaabbb and $1 is abbbaa.
document.write("abbbaabbb1234".match(re1) + "<br/>"); // abbbaabbb,abbbaa
document.write(re2.test("abbbaabbb1234") + "<br/>"); // true
document.write(re2.exec("abbbaabbb1234") + "<br/>"); // abbb,a. The non-greedy match is abbb and $1 is a.
document.write("abbbaabbb1234".match(re2) + "<br/>"); // abbb,a
var re3 = /([a-z]*)bbb/g; // greedy
var re4 = /([a-z]*?)bbb/g; // non-greedy
document.write(re3.test("abbbaabbb1234") + "<br/>"); // true
document.write(re3.exec("abbbaabbb1234") + "<br/>"); // null. Because g is set and the match is greedy, test() above has already consumed abbbaabbb; only 1234 remains, so nothing matches and null is returned.
document.write("abbbaabbb1234".match(re3) + "<br/>"); // abbbaabbb
document.write(re4.test("abbbaabbb1234") + "<br/>"); // true
document.write(re4.exec("abbbaabbb1234") + "<br/>"); // aabbb,aa. Because g is set and the match is non-greedy, test() above has consumed only abbb; aabbb1234 remains, so aabbb is matched here and $1 is aa.
document.write("abbbaabbb1234".match(re4) + "<br/>"); // abbb,aabbb

10. Backreferences
A backreference refers to the substring captured by a group in the regular expression. Each backreference is identified by a number or name and is referenced inside the pattern with the \number notation.
/(\w+)/.test("hello-world");
dwl(RegExp.$1); // hello
dwl(/(a)\1/.test("aa")); // true; \1 refers to the content of the first parentheses
dwl("aa bbb cccc".replace(/(\w{2,}) (\w{2,}) (\w{2,})/, "$3 $2 $1")); // cccc bbb aa
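
Backreferences appear in three places: inside the pattern as \1, in a replacement string as $1, and in the array returned by exec(). A short sketch with made-up sample strings follows; console.log stands in for the dwl() helper used above.

```javascript
// \1 inside the pattern: a word that is immediately repeated.
var doubled = /\b(\w+) \1\b/;
console.log(doubled.test("this is is a test")); // true
console.log(doubled.test("this is a test"));    // false

// $1 and $2 in the replacement string: swap the two captured words.
var swapped = "Doe, John".replace(/(\w+), (\w+)/, "$2 $1");
console.log(swapped);                           // John Doe

// The array returned by exec() holds the same captures.
var m = /(\d{4})-(\d{2})/.exec("date: 2012-03");
console.log(m[1] + "/" + m[2]);                 // 2012/03
```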

11. Non-capturing groups: ?:
Not every pair of parentheses has to be captured for backreferences. Add "?:" at the start of a group to make it non-capturing.
/(?:\w+)-(\w+)/.test("hello-world");
alert(RegExp.$0); // undefined
alert(RegExp.$1); // world: "hello" is matched but not captured, so the second pair of parentheses becomes group 1
alert(RegExp.$2); // "": there is no second capturing group
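
The effect on group numbering is easiest to see by comparing the arrays returned by exec() for a capturing and a non-capturing version of the same pattern. This is a small sketch; console.log stands in for alert.

```javascript
var capturing = /(\w+)-(\w+)/.exec("hello-world");
var nonCapturing = /(?:\w+)-(\w+)/.exec("hello-world");

console.log(capturing.length);    // 3: the whole match plus two groups
console.log(capturing[1]);        // hello
console.log(capturing[2]);        // world

console.log(nonCapturing.length); // 2: the whole match plus one group
console.log(nonCapturing[0]);     // hello-world
console.log(nonCapturing[1]);     // world
```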

12. Positive lookahead ?= and negative lookahead ?!
(?=str) matches a position that is followed by str. For example, /he(?=llo)/ matches the "he" in the string "hello".
(?!str) matches a position that is not followed by str. For example, /he(?!llo)/ does not match the string "hello".
// Think of (?=str) or (?!str) as a condition: match the rest of the pattern first, then check the condition at the point where that match ends.
dwl("he-looworld".match(/(\w+)(?=world)/g)); // loo. (\w+) finds two blocks, he and looworld. he is not followed by world; within looworld, loo is followed by world, so loo is matched.
dwl("he-looworld".match(/(\w+)(?!world)/g)); // he,looworld. (\w+) finds two blocks, he and looworld. Neither he nor the whole of looworld is followed by world, so both satisfy (?!world).
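
Two typical uses of lookahead are sketched below. Both functions and file names are made up for the illustration: withThousands is a hypothetical helper that assumes its input is a string of digits only, and notMinified is just one possible way to express "ends in .js but not in .min.js".

```javascript
// The text inside a lookahead is checked but not consumed.
console.log("hello".match(/he(?=llo)/)[0]); // he
console.log(/he(?!llo)/.test("hello"));     // false
console.log(/he(?!llo)/.test("help"));      // true

// Positive lookahead: insert a comma at every position that is followed
// by one or more complete groups of three digits up to the end.
function withThousands(digits) {
    return digits.replace(/\B(?=(\d{3})+$)/g, ",");
}
console.log(withThousands("1234567"));      // 1,234,567
console.log(withThousands("123"));          // 123

// Negative lookahead: names ending in ".js" but not in ".min.js".
var notMinified = /^(?!.*\.min\.js$).*\.js$/;
console.log(notMinified.test("app.js"));     // true
console.log(notMinified.test("app.min.js")); // false
```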

13. A few practical examples
A. Capitalize the first letter of every English word in a string.
var str = "hello, hello woRld, I love you";
var str = str.toLowerCase().replace(/\b\w|\s\w/g, function (s) {
    return s.toUpperCase();
});
alert(str);
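
The same idea can be wrapped in a reusable function. This is one possible variant, not the author's code: capitalizeWords is a hypothetical name, and it relies on \b alone, which means that a letter following an apostrophe (as in "don't") would also be capitalized.

```javascript
// Lowercase everything, then uppercase the first letter of each word.
function capitalizeWords(text) {
    return text.toLowerCase().replace(/\b[a-z]/g, function (ch) {
        return ch.toUpperCase();
    });
}

console.log(capitalizeWords("hello woRld, I love you")); // Hello World, I Love You
console.log(capitalizeWords("ONE two"));                 // One Two
```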

B. Remove all tags from HTML code except the <a> tag.
var str = "<p><a href='http://www.jb51.net/'>residential buildings</a></p>";
var str = str.replace(/<(?!(\/?a))(.|\s)*?>/g, ""); // negative lookahead is used here
alert(str); // <a href='http://www.jb51.net/'>residential buildings</a>
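
One limitation of the pattern above is that it also keeps any tag whose name merely starts with "a", such as <abbr>. A possible refinement, sketched here under the assumption that the input is simple, well-formed markup, adds \b after the tag name. stripTagsExceptA is a hypothetical name, and regular expressions are in general a fragile tool for processing HTML.

```javascript
function stripTagsExceptA(html) {
    // <(?!\/?a\b) : a "<" that does not start an <a ...> or </a> tag
    // [^>]*>      : the rest of the tag up to the closing ">"
    return html.replace(/<(?!\/?a\b)[^>]*>/gi, "");
}

var input = "<p><a href='http://www.jb51.net/'>link</a> <abbr>x</abbr></p>";
var output = stripTagsExceptA(input);
console.log(output); // <a href='http://www.jb51.net/'>link</a> x
```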

Next time I will go over commonly used regular expressions and summarize them. Writing all this up took me almost a whole day...
