JavaScript Advanced Programming (Third Edition) study notes (1): regular expressions

1. Creating Regular Expressions
The first way is a regular expression literal. Note that the literal is not wrapped in single or double quotes:
var pattern1 = /[abc]/i; // matches the first "a", "b", or "c", case-insensitive
The second way is the RegExp constructor. Both of its parameters are strings, so pay special attention to the "\" character: every backslash in the pattern has to be doubled, which means an escaped metacharacter (see the metacharacters below) needs a double escape:
var patt1 = new RegExp("[abc]", "gi"); // equivalent to var patt1 = /[abc]/gi;
alert("[abc]".match(patt1)); // a,b,c (with g set, match() returns every match)
var patt2 = new RegExp("\\[abc\\]", "gi"); // equivalent to var patt2 = /\[abc\]/gi; inside the quotes "\" must be written as "\\"
alert("[abc]".match(patt2)); // [abc]

Here are two questions to consider:
a. If the regular expression has to be assembled dynamically from strings and a variable, how can it be created the first way?
The second way obviously has no problem, because its first argument is already a string. To create the pattern in literal form you need the eval() function:
var str = "abc"; // this could be a dynamic variable
var patt1 = eval("/\\[" + str + "\\]/"); // equivalent to var patt1 = /\[abc\]/;
alert("[abc]".match(patt1)); // [abc]
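The same dynamic pattern can also be built with the constructor, without eval(). The following is only a sketch: escapeRegExp is a hypothetical helper (not from the book) that escapes metacharacters in the dynamic part so they are matched literally.

```javascript
// Hypothetical helper: put a backslash in front of every regex metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const dynamicPart = "abc"; // could come from user input
const bracketed = new RegExp("\\[" + escapeRegExp(dynamicPart) + "\\]");
console.log(bracketed.source);        // \[abc\]
console.log(bracketed.test("[abc]")); // true

// Without escaping, the "." would match any character.
const literalDot = new RegExp(escapeRegExp("a.b"));
console.log(literalDot.test("a.b"), literalDot.test("axb")); // true false
```

Escaping matters as soon as the dynamic part may contain characters such as "." or "(".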

b. What is the difference between the two ways of creating a pattern?
"JavaScript Advanced Programming (Third Edition)" says the difference is whether the instance is shared: a literal shares a single RegExp instance, while the constructor creates a new instance on every call. In my own tests the results did not show this: both loops below print true true true, and you can test it yourself. This agrees with ECMAScript 5, which requires a literal to create a new instance each time it is evaluated, just like the constructor. So note here: in current browsers the two ways of creating a pattern make no difference in this respect.
var re = null, i;
for (i = 0; i < 3; i++) {
    re = /cat/g; // book: with a shared instance, lastIndex is not reset and the results would be true false true
    alert(re.test("catasdfdfdf"));
}
for (i = 0; i < 3; i++) {
    re = new RegExp("cat", "g"); // book: a new instance each time, lastIndex starts at 0, so the results are true true true
    alert(re.test("catasdfdfdf"));
}
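A small check of the point above, assuming an ECMAScript 5 or later engine: each evaluation of a literal yields a separate RegExp object with its own lastIndex. The function name makeLiteral is only illustrative.

```javascript
// Each call evaluates the literal again and returns a fresh object.
function makeLiteral() {
  return /cat/g;
}

const first = makeLiteral();
const second = makeLiteral();
first.test("catalog"); // advances first.lastIndex to 3

console.log(first === second);                   // false
console.log(first.lastIndex, second.lastIndex);  // 3 0
```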

While we are here, the three flags for matching modes: g, i, m
g: global mode. The pattern is applied to the whole string instead of stopping at the first match. After a match, the pattern's lastIndex moves to the position just past that match, and the next application of the pattern starts from lastIndex; when no further match is found, lastIndex is reset to 0.
i: case-insensitive matching.
m: multiline mode. ^ and $ also match at the start and end of each line, so matching continues on the next line after the end of a line of text is reached.
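The behavior of the three flags can be observed with a short sketch like the following (the sample strings are made up for illustration):

```javascript
// g: each exec() call resumes from lastIndex.
const globalPattern = /an/g;
const seen = [];
let hit;
while ((hit = globalPattern.exec("banana")) !== null) {
  seen.push(hit.index + ":" + globalPattern.lastIndex);
}
console.log(seen);                    // [ '1:3', '3:5' ]
console.log(globalPattern.lastIndex); // 0, reset after the failed search

// i and m: case-insensitive, and ^ / $ match at line breaks.
const lines = "cat\nCAT\ndog";
const withM = lines.match(/^cat$/gim);
const withoutM = lines.match(/^cat$/gi);
console.log(withM);    // [ 'cat', 'CAT' ]
console.log(withoutM); // null
```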
2. Metacharacters: ( [ { \ ^ $ | ) ? * + . ] }
These characters have one or more special uses in a pattern, so to match them literally you need to escape them with a backslash. For example:
var pattern1 = /\[abc\]/i; // matches the first "[abc]", case-insensitive
var pattern2 = /[abc]/i; // matches the first "a", "b", or "c", case-insensitive

3. RegExp Instance Properties
global, ignoreCase, multiline, lastIndex, source. These properties are rarely needed in practice, although lastIndex can be useful for debugging. A simple example (dw is the author's output helper, whose definition is not shown here):
var patt1 = /cat/g;
patt1.test("catasdfdfdf");
dw(patt1.global); // true: the g flag is set (global mode)
dw(patt1.ignoreCase); // false: the i flag (case-insensitive) is not set
dw(patt1.multiline); // false: the m flag (multiline matching) is not set
dw(patt1.lastIndex); // 3: where the search for the next match starts; it is 0 before the first match
dw(patt1.source); // cat: the source text of the pattern

4. Ranges and sets: [] ^ |
[abc]: any one of the characters a, b, or c
[a-z], [A-Z], [0-9]: a lowercase letter, an uppercase letter, a digit from 0 to 9
[^a-z], [^A-Z], [^0-9]: a character that is not a lowercase letter, not an uppercase letter, not a digit from 0 to 9
(abc|def): either "abc" or "def". Inside square brackets | is a literal character, so [abc|def] matches any single one of a, b, c, |, d, e, f
alert(/[abc]/.test("a")); // true
alert(/[abc]/.test("gg")); // false
alert(/[^abc]/.test("a")); // false
alert(/[^abc]/.test("gg")); // true
alert(/[A-Z]/.test("a")); // false
alert(/[A-Z]/.test("A")); // true
alert(/(abc|def)/.test("def")); // true
alert(/[abc|def]/.test("|")); // true: inside [] the | is literal
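To make the difference between alternation in parentheses and a character class concrete, here is a small sketch; the anchors ^ and $ are added so that the whole input has to match.

```javascript
const alternation = /^(abc|def)$/; // the whole string "abc" or "def"
const charClass = /^[abc|def]$/;   // exactly one of a b c | d e f

console.log(alternation.test("def")); // true
console.log(alternation.test("d"));   // false
console.log(charClass.test("d"));     // true
console.log(charClass.test("|"));     // true
console.log(charClass.test("def"));   // false, three characters
```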

5. Quantifiers: ? * + {m} {m,n} {m,}
? 0 or 1 times; placed after another quantifier, it makes the match non-greedy (explained later)
* 0 or more times
+ 1 or more times
{m} exactly m times
{m,n} at least m times, at most n times
{m,} at least m times
alert(/a?/.test("a")); // true
alert(/a?/.test("b")); // true, 0 occurrences are allowed
alert(/a*/.test("a")); // true
alert(/a*/.test("b")); // true, 0 occurrences are allowed
alert(/a+/.test("a")); // true
alert(/a+/.test("b")); // false
alert(/a{3}/.test("aaaaa")); // true
alert(/a{3}/.test("bbbbb")); // false. Some articles online say {3} means 0 or 3 times; I tested several browsers and 0 occurrences do not match
alert(/a{3,5}/.test("aaaaa")); // true
alert(/a{3,5}/.test("bbbbbb")); // false
alert(/a{3,}/.test("aaaaa")); // true
alert(/a{3,}/.test("bbbbbb")); // false
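Without anchors, test() only checks whether the quantified run occurs somewhere in the string, which is why /a{3}/ accepts "aaaaa". The following sketch anchors the patterns with ^ and $ so the counts apply to the whole input; the sample strings are made up.

```javascript
const exactlyThree = /^a{3}$/;
const threeToFive = /^a{3,5}$/;
const atLeastThree = /^a{3,}$/;

const samples = ["aa", "aaa", "aaaaa", "aaaaaa"];
const table = samples.map(function (s) {
  return [s.length, exactlyThree.test(s), threeToFive.test(s), atLeastThree.test(s)];
});
console.log(table);
// length 2: false false false
// length 3: true  true  true
// length 5: false true  true
// length 6: false false true
```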

6. Boundaries: ^ $ \b \B
^ matches the start of the input. Note that right after a left square bracket, as in [^a-z], it means negation instead
$ matches the end of the input
\b matches a word boundary: the invisible position between words, with a word character on one side and a non-word character on the other (non-word characters include punctuation, whitespace, and Chinese characters)
\B matches a position that is not a word boundary
alert(/^家$/.test("家")); // true: exactly one 家
alert(/^家$/.test("家家")); // false
alert(/\b啊/.test("a啊")); // true. Think about it: why does this differ from the next result?
alert(/\b啊/.test("-啊")); // false
alert(/\B啊/.test("a啊")); // false
alert(/\B啊/.test("-啊")); // true

In alert(/\b啊/.test("a啊")); there is a \b between "a" and "啊": the left side of the \b is the word character "a" and the right side is the non-word character "啊", so the pattern matches: true.
In contrast, in alert(/\b啊/.test("-啊")); the character to the left of "啊" is "-", which is not a word character either, so there is no \b at that position and the pattern cannot match: false.
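The same boundary rules can be checked with ordinary English words. This is a small sketch with made-up sample strings:

```javascript
const wholeWord = /\bcat\b/;  // "cat" as a separate word
const insideWord = /\Bcat\B/; // "cat" with word characters on both sides

console.log(wholeWord.test("a cat sat"));    // true
console.log(wholeWord.test("concatenate"));  // false
console.log(insideWord.test("concatenate")); // true

// A Chinese character is a non-word character for \b.
console.log(/\b啊/.test("a啊")); // true
console.log(/\b啊/.test("-啊")); // false

// ^ and $ pin the match to the whole input.
console.log(/^cat$/.test("cat"), /^cat$/.test("cats")); // true false
```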
7. Predefined classes: \d \D \s \S \w \W .
\d: a digit character, equivalent to [0-9]
\D: a non-digit character, equivalent to [^0-9]
alert(/\d/.test("1")); // true
alert(/\D/.test("1")); // false

\s: a whitespace character, roughly [ \n\r\f\t\x0b]; note that the space counts too
\S: a non-whitespace character, roughly [^ \n\r\f\t\x0b]
alert(/\s/.test(" ")); // true, the space counts
alert(/\S/.test("\n\r\f\t\x0b")); // false
alert(/\S/.test("\n\r\f\t\x0b \\")); // true, the backslash is not whitespace

\w: a word character, equivalent to [a-zA-Z0-9_]
\W: a non-word character, equivalent to [^a-zA-Z0-9_]
alert(/\w/.test("afdas")); // true
alert(/\W/.test("afdas")); // false

. (dot): any character except line terminators such as \n and \r, roughly [^\n\r]
alert(/./.test("\n\r")); // false: these two characters do not match, other characters do
alert(/./.test(" ")); // true
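A few typical uses of the predefined classes, as a sketch; the helper names isDigits, isIdentifier, and splitWords are made up for this example.

```javascript
// Hypothetical helpers built on \d, \w and \s.
function isDigits(s) {
  return /^\d+$/.test(s);
}
function isIdentifier(s) {
  return /^[A-Za-z_]\w*$/.test(s);
}
function splitWords(s) {
  return s.trim().split(/\s+/);
}

console.log(isDigits("2024"), isDigits("20a4"));             // true false
console.log(isIdentifier("_tmp1"), isIdentifier("1tmp"));    // true false
console.log(splitWords("  a \t b\nc "));                     // [ 'a', 'b', 'c' ]
console.log(/a.b/.test("a-b"), /a.b/.test("a\nb"));          // true false
```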

8. RegExp instance methods: exec() test() match()
exec(): returns an array with information about the first match, or null when there is no match. Usage: pattern.exec(str). Note that the behavior differs depending on whether g is set:
var re1 = /([a-z]*)bbb/; // greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.exec("abbbaabbb1234") + "<br/>"); // abbbaabbb,abbbaa: the greedy match is abbbaabbb and $1 is abbbaa
re1 = /([a-z]*)bbb/g; // greedy, global
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.exec("abbbaabbb1234") + "<br/>"); // null: g is set and the greedy test() above already consumed abbbaabbb, so only 1234 is left and nothing matches

test(): use this when you only need to know whether there is a match and do not care what text matched; it is the more convenient choice. Usage: pattern.test(str)
var re1 = /([a-z]*)bbb/; // greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
re1 = /([a-z]*)bbb/g; // greedy, global
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.test("abbbaabbb1234") + "<br/>"); // false: g is set, so this search starts at lastIndex (9), where only 1234 is left; the failure resets lastIndex to 0
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true

match(): this method (called on the string) is special: it behaves completely differently with and without g. Without g it behaves like exec(); with g it returns an array of all matched values. Usage: str.match(pattern)
var re1 = /([a-z]*)bbb/; // greedy
document.write("abbbaabbb1234".match(re1) + "<br/>"); // abbbaabbb,abbbaa: abbbaabbb is the whole matched string and abbbaa is the string matched by the first parentheses
re1 = /([a-z]*)bbb/g; // greedy, global
document.write("abbbaabbb1234".match(re1) + "<br/>"); // abbbaabbb: with g set, match() returns all matched values (only one here)

Finally, for exec() and for match() without g, the first element of the returned array is the whole matched string; if the pattern contains parentheses, the second element is the match of the first parentheses, and so on for the third, the fourth, and further. For example:
var re1 = /(a(b(c)))d/;
var str = "abcdd";
var matches = str.match(re1);
alert(matches[0]); // abcd: the whole match
alert(matches[1]); // abc: the first parentheses
alert(matches[2]); // bc: the second parentheses
alert(matches[3]); // c: the third parentheses
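When both all matches and their groups are needed, exec() can be called in a loop with a global pattern. The following is a sketch; allMatches is a hypothetical helper, and it copies the pattern so that the caller's lastIndex is not disturbed.

```javascript
// Hypothetical helper: collect every match together with its captured groups.
function allMatches(pattern, text) {
  // Work on a global copy of the pattern.
  const flags = pattern.flags.indexOf("g") === -1 ? pattern.flags + "g" : pattern.flags;
  const re = new RegExp(pattern.source, flags);
  const results = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    results.push({ index: m.index, match: m[0], groups: m.slice(1) });
    if (m[0] === "") re.lastIndex++; // avoid looping forever on empty matches
  }
  return results;
}

const found = allMatches(/([a-z]*?)bbb/, "abbbaabbb1234");
console.log(found);
// [ { index: 0, match: 'abbb', groups: [ 'a' ] },
//   { index: 4, match: 'aabbb', groups: [ 'aa' ] } ]
```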

9. Greedy and non-greedy matching
Greedy matching: after a match is found, keep matching as far as possible toward the end of the string and take the longest result. For example, for the string "aaaaaab", the pattern /a+/ matches "aaaaaa" rather than a single "a".
Non-greedy matching: stop as soon as a match is found. For example, for the string "aaaaaab", the pattern /a+?/ matches "a", not "aaaaaa". To use it, add "?" after the quantifier.
var re1 = /a+/;
var str = "aaaaaaa";
alert(str.match(re1)); // aaaaaaa
re1 = /a+?/;
str = "aaaaaaa";
alert(str.match(re1)); // a

Here is a comprehensive example showing the differences between greedy and non-greedy matching, the global flag g, exec(), and match():
var re1 = /([a-z]*)bbb/; // greedy
var re2 = /([a-z]*?)bbb/; // non-greedy
document.write(re1.test("abbbaabbb1234") + "<br/>"); // true
document.write(re1.exec("abbbaabbb1234") + "<br/>"); // abbbaabbb,abbbaa: the greedy match is abbbaabbb and $1 is abbbaa
document.write("abbbaabbb1234".match(re1) + "<br/>"); // abbbaabbb,abbbaa: same as exec() when g is not set
document.write(re2.test("abbbaabbb1234") + "<br/>"); // true
document.write(re2.exec("abbbaabbb1234") + "<br/>"); // abbb,a: the non-greedy match is abbb and $1 is a
document.write("abbbaabbb1234".match(re2) + "<br/>"); // abbb,a: same as exec() when g is not set
var re3 = /([a-z]*)bbb/g; // greedy, global
var re4 = /([a-z]*?)bbb/g; // non-greedy, global
document.write(re3.test("abbbaabbb1234") + "<br/>"); // true
document.write(re3.exec("abbbaabbb1234") + "<br/>"); // null: g is set and the greedy test() above already consumed abbbaabbb, so only 1234 is left and nothing matches
document.write("abbbaabbb1234".match(re3) + "<br/>"); // abbbaabbb: match() with g searches from the start and returns all matches
document.write(re4.test("abbbaabbb1234") + "<br/>"); // true
document.write(re4.exec("abbbaabbb1234") + "<br/>"); // aabbb,aa: g is set and the non-greedy test() above consumed only abbb, so aabbb1234 is left; the match is aabbb and $1 is aa
document.write("abbbaabbb1234".match(re4) + "<br/>"); // abbb,aabbb: match() with g searches from the start and returns all matches
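A common place where the greedy and non-greedy forms give visibly different results is text between delimiters. The sample sentence below is made up for illustration.

```javascript
const sentence = 'say "hi" and "bye"';

// Greedy: .* runs to the last quote in the string.
const greedyQuote = sentence.match(/".*"/)[0];
// Non-greedy: .*? stops at the nearest closing quote.
const lazyQuotes = sentence.match(/".*?"/g);

console.log(greedyQuote); // "hi" and "bye"
console.log(lazyQuotes);  // [ '"hi"', '"bye"' ]
```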

10. Backreferences
A backreference refers to the substring captured by a group in the regular expression. Each backreference is identified by a number or name and is referenced in the pattern with the "\number" notation (dwl below is the author's output helper, whose definition is not shown here).
/(\w+)/.test("hello-world");
dwl(RegExp.$1); // hello
dwl(/(家)\1/.test("家家")); // true: the \1 here stands for the content of the first parentheses
dwl("aa bbb cccc".replace(/(\w{2,}) (\w{2,}) (\w{2,})/, "$3 $2 $1")); // cccc bbb aa
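Two typical uses of backreferences, as a sketch with made-up sample strings: \1 inside the pattern to detect a repeated word, and $1, $2 in the replacement string to reorder captured parts.

```javascript
// \1 must repeat exactly what the first group captured.
const doubledWord = /\b(\w+)\s+\1\b/;
console.log(doubledWord.test("this is is a test")); // true
console.log(doubledWord.test("this is a test"));    // false

// $1 and $2 refer to the captured groups in the replacement.
const reordered = "John Smith".replace(/(\w+) (\w+)/, "$2, $1");
console.log(reordered); // Smith, John
```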

11. Non-capturing groups: ?:
Not every pair of parentheses has to be captured for backreferences. Adding "?:" right after the opening parenthesis creates a non-capturing group.
/(?:\w+)-(\w+)/.test("hello-world");
alert(RegExp.$0); // undefined: there is no $0
alert(RegExp.$1); // world: (?:\w+) is not captured, so the first captured group is the second pair of parentheses
alert(RegExp.$2); // "": there is no second captured group
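The effect of "?:" is easiest to see by comparing the arrays returned by match() for the capturing and non-capturing versions of the same pattern; a small sketch:

```javascript
const withCapture = "hello-world".match(/(\w+)-(\w+)/);
const withoutCapture = "hello-world".match(/(?:\w+)-(\w+)/);

console.log(Array.from(withCapture));    // [ 'hello-world', 'hello', 'world' ]
console.log(Array.from(withoutCapture)); // [ 'hello-world', 'world' ]

// In a replacement, $1 now refers to the only captured group.
const wrapped = "hello-world".replace(/(?:\w+)-(\w+)/, "[$1]");
console.log(wrapped); // [world]
```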

12. Positive lookahead ?= and negative lookahead ?!
(?=str) matches only when followed by str. For example, the pattern /he(?=llo)/ matches the "he" in the string "hello"
(?!str) matches only when not followed by str. For example, the pattern /he(?!llo)/ matches the "he" in the string "help" but finds nothing in "hello"
One way to understand this: treat (?=str) or (?!str) as a condition. First match the other parts of the pattern, then check each candidate against the condition to see whether it satisfies it.
dwl("he-lloworld".match(/(\w+)(?=world)/g)); // llo: (\w+) first finds two chunks, he and lloworld. he is not followed by world, so it fails (?=world). Within lloworld, llo is directly followed by world, which satisfies (?=world), so the match is llo
dwl("he-lloworld".match(/(\w+)(?!world)/g)); // he,lloworld: he is not followed by world, so it satisfies (?!world). lloworld taken whole is not followed by world either, so it also satisfies (?!world); the result is he,lloworld
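A well-known use of both lookaheads together is inserting thousands separators. This is a sketch; the function name addThousandsSeparators is made up, and the input is assumed to be a plain string of digits.

```javascript
// Insert "," at every non-boundary position that is followed by
// groups of exactly three digits up to the end of the digit run.
function addThousandsSeparators(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

console.log(addThousandsSeparators("1234567")); // 1,234,567
console.log(addThousandsSeparators("123"));     // 123

// The lookahead is a condition only: it is not part of the match.
const positive = /he(?=llo)/.exec("hello help");
const negative = /he(?!llo)/.exec("hello help");
console.log(positive[0], positive.index); // he 0
console.log(negative[0], negative.index); // he 6
```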

13. A few example problems
a. Capitalize the first letter of every English word in a string
var str = "Hi Hello world, I am Loving You";
str = str.toLowerCase().replace(/\b\w|\s\w/g, function (s) {
    return s.toUpperCase();
});
alert(str); // Hi Hello World, I Am Loving You

b. Remove all tags from a piece of HTML code, except a tags
var str = "<p><a href='http://www.jb51.net/'>Habitat Habitat</a></p>";
str = str.replace(/<(?!(\/?a))(.|\s)*?>/g, ""); // uses a negative lookahead
alert(str); // <a href='http://www.jb51.net/'>Habitat Habitat</a>
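The same idea can be wrapped in a reusable function. The following is only a sketch: stripTagsExcept is a hypothetical helper, the tag name is assumed to be a plain word, and a \b is added after the name so that tags such as abbr are not kept by accident. A regular expression like this is not a full HTML parser.

```javascript
// Hypothetical helper: remove every tag except opening/closing `keep` tags.
function stripTagsExcept(html, keep) {
  const pattern = new RegExp("<(?!\\/?" + keep + "\\b)[^>]*>", "gi");
  return html.replace(pattern, "");
}

const input = "<p><a href='x'>t</a> <b>by</b> <abbr>q</abbr></p>";
const output = stripTagsExcept(input, "a");
console.log(output); // <a href='x'>t</a> by q
```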

The next part will cover commonly used regular expressions and a summary. Time for a sip of water; other things came up, so this post took almost a day to write...