Understanding JavaScript Regular Expressions and the RegExp Type

Source: Internet
Author: User
Tags character classes expression engine

Learn about the RegExp type:

ECMAScript supports regular expressions through the RegExp type. A regular expression literal has the form: var expression = /pattern/flags;

Pattern:

The pattern can be any simple or complex regular expression, and can contain character classes, quantifiers, grouping, lookahead, and backreferences. For the meanings of special characters (such as \, ^, $, \w, \b) in a regular expression, see the MDN regular expression reference on special characters. Here we briefly introduce lookahead and backreferences.

Lookahead: A lookahead checks the characters that follow the current position without consuming them. There are two kinds: positive lookahead, written x(?=y), and negative lookahead, written x(?!y).
Backreference: Matches the same text that an earlier capture group matched, which makes it possible to detect repeated characters or substrings. A numbered backreference \number refers to the capture group at that position in the regular expression.
1. The expressions \1 through \9 are interpreted as backreferences rather than octal codes when a capture group with that number exists. /\b(\w+)\s\1/.exec('s_ s_'); // ["s_ s_", "s_"]
2. If the first digit of a multi-digit expression is 8 or 9 (for example, \80 or \91), the expression is interpreted as literal text. /\b(\w+)\s\80/.exec('s_ 800'); // ["s_ 80", "s_"]
3. For expressions numbered \10 or greater, the expression is treated as a backreference if a capture group with that number exists. Otherwise it is interpreted as an octal code.

/(1)(2)(3)(4)(5)(6)(7)(8)(9)(10)xx\10/.exec('12345678910xx10'); // ["12345678910xx10", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
/(1)(2)(3)(4)(5)(6)(7)(8)(9)(10)xx\11/.exec('12345678910xx10'); // null

4. If capture groups are nested, they are numbered from outer to inner and from left to right, that is, in the order of their opening parentheses. Let's take a look at the code.
/\b(\w+x(x))\s(\1)/.exec('s_xx s_xxSTOP'); // ["s_xx s_xx", "s_xx", "x", "s_xx"]
5. If the regular expression contains a backreference to a group that does not exist, some engines report a parse error; depending on the language, the engine may throw an exception such as ArgumentException. JavaScript (without the u flag) does not throw: \2 below is read as an octal escape, and exec() returns null. /\b(\w+)\s\2/.exec('s_ 8'); // null
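The numbering and backreference rules above can be tried out with a short sketch. The variable names (nested, doubled) and the sample strings are illustrative assumptions, not part of the original article.

```javascript
// Nested groups are numbered by the position of their opening parenthesis:
// group 1 = ((a)(b(c))), group 2 = (a), group 3 = (b(c)), group 4 = (c).
const nested = /((a)(b(c)))/.exec('abc');
// nested -> ["abc", "abc", "a", "bc", "c"]

// A backreference can find a word that is immediately repeated.
const doubled = /\b(\w+)\s+\1\b/.exec('this is is a test');
// doubled[0] -> "is is", doubled[1] -> "is", doubled.index -> 5
```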
Backreference example code: The content captured by a capture group can be referenced by the program outside the regular expression (RegExp.$n) and also inside the regular expression (\number; this form of reference is a backreference).

// Three identical lowercase letters in a row; {2} applies to \1
/([a-z])\1{2}/.exec('aaa'); // ["aaa", "a"]

// An interesting regular expression question
/(\w)((?=\1\1\1)(\1))/.exec('aa bbbb'); // ["bb", "b", "b", "b"]
/* There are three capture groups: $1 is the content of (\w), $2 is the content of ((?=\1\1\1)(\1)), and $3 is the content of (\1). Note that (?=\1\1\1) is a lookahead condition, not a capture group: x(?=y) matches x only if it is followed by y, and y is not part of the match result. So $2 is the content of (\1), that is, 'b', and $3 is also 'b'. The first 'b' in the returned match "bb" is the first 'b' in 'aa bbbb', and the second 'b' is the second 'b' in 'aa bbbb'. */

/(\w)(x(?=\1\1\1)(\1))/.exec('aa bxbbbcb'); // ["bxb", "b", "xb", "b"]
// Here $2 is the content of (x(?=\1\1\1)(\1)), that is, x followed by \1

// The two patterns above can be simplified to /(\w)(?=\1\1\1)(\1)/, which matches a \w only when it is followed by three copies of \1; the match is that \w followed by one \1. Similarly, /(\w)x(?=\1\1\1)(\1)/.

/(\w)((?=\1\1\1)(\2))/.exec('aa bbbbv'); // ["b", "b", "", ""]
/* Capture group $2 is ((?=\1\1\1)(\2)). Here \2 matches "" because group $2 has not finished matching when \2 is evaluated. $3 is the content of \2, which is also "". So this pattern matches a \w that is followed by three copies of itself and returns \w + '', that is, only 'b'. */
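To make the difference between positive and negative lookahead concrete, here is a minimal sketch. The sample string and the variable names (sizes, px, notPx, consumed) are assumptions chosen for illustration.

```javascript
const sizes = '12px 30em 8px';

// Positive lookahead: digits only when they are followed by "px".
// The "px" is checked but is not part of the match.
const px = sizes.match(/\d+(?=px)/g);
// px -> ["12", "8"]

// Negative lookahead: digits that are not followed by "px".
// The extra \d alternative stops the engine from matching a shorter
// prefix such as the "1" in "12px".
const notPx = sizes.match(/\d+(?!\d|px)/g);
// notPx -> ["30"]

// The lookahead does not consume characters: only "a" is returned.
const consumed = /a(?=b)/.exec('ab')[0];
// consumed -> "a"
```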

The flags part of the regular expression:

It can contain one or more flags that indicate the behavior of the regular expression.

1. g: global mode. The pattern is applied to the whole string instead of stopping when the first match is found. 'cat mat bat'.replace(/.(?=at)/g, 'A'); // "Aat Aat Aat"
2. i: case-insensitive mode. The case of the pattern and the string is ignored when determining a match. 'cat mat bat'.replace(/a/gi, 'B'); // "cBt mBt bBt"
3. m: multiline mode. The anchors ^ and $ match at the start and end of every line of text, not only at the start and end of the whole string, so matching continues on the next line after the end of a line is reached.

var str = 'cat\nmat\nbat';
str.replace(/at$/gm, 'AB'); // "cAB\nmAB\nbAB" (without m, only the last line would change)
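The effect of each flag is easiest to see by running the same input through patterns that differ only in their flags. This is a small sketch; the input string and variable names are assumptions made for the example.

```javascript
const lines = 'Cat\ncat\nCAT';

// No flags: only the first case-sensitive match is replaced.
const firstOnly = lines.replace(/cat/, 'dog');
// firstOnly -> "Cat\ndog\nCAT"

// g + i: every match is replaced, regardless of case.
const everyCase = lines.replace(/cat/gi, 'dog');
// everyCase -> "dog\ndog\ndog"

// m: ^ and $ match at line boundaries, so each line can match on its own.
const perLine = lines.match(/^cat$/gim);
// perLine -> ["Cat", "cat", "CAT"]

// Without m, ^ and $ only match at the ends of the whole string.
const wholeOnly = lines.match(/^cat$/gi);
// wholeOnly -> null
```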

Metacharacters in Regular Expressions:

The following characters have a special meaning in a pattern. To match them literally in a string, they must be escaped with a backslash.

( [ { \ ^ $ | ) ? * + . ] }
// Match "[bc]at"
/\[bc\]at/.exec("xx[bc]at"); // ["[bc]at"]
// Match ".at"
/\.at/.exec("xx.at"); // [".at"]
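When the text to match comes from a variable, the metacharacters can be escaped programmatically before building the pattern. The helper below is a hypothetical sketch (escapeRegExp is not a built-in function); it uses a widely used replacement idiom in which "$&" stands for the matched character.

```javascript
// Hypothetical helper: put a backslash in front of every metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const escaped = escapeRegExp('[bc]at');
// escaped is the string \[bc\]at

const hit = new RegExp(escaped).exec('xx[bc]at');
// hit[0] -> "[bc]at"

// Without escaping, [bc]at is a character class followed by "at".
const unescapedHit = new RegExp('[bc]at').exec('xx[bc]at');
// unescapedHit -> null, because neither "bat" nor "cat" occurs in the string
```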

Create a regular expression:

Literal form: for example, var expression = /pattern/flags;
RegExp constructor: takes two arguments, the string pattern to match and an optional flags string. The pattern is normally passed as a string rather than as a regular expression literal, although passing a literal does not raise an error. Any expression that can be defined with a literal can also be defined with the constructor, as follows:

var p = /[bc]at/;
new RegExp('[bc]at'); // /[bc]at/

1. new RegExp(); // /(?:)/ and new RegExp(''); // /(?:)/ both create an empty pattern. It is displayed as (?:), an empty non-capturing group (the syntax for matching x without remembering the match is (?:x)), because // would be parsed as a comment. The empty pattern matches the empty string "", so exec() returns [""] for any string.
2. Because the constructor's pattern argument is a string, characters that are already escaped in the literal form usually need to be double-escaped (the single escape of the literal form plus one more escape for the string). A single escape is consumed by the string itself: new RegExp('\w'); // /w/. Note that '\' is special in strings and must itself be escaped as '\\'.

var p = /\\n/; // matches a backslash followed by "n"; the character "\" must itself be escaped as "\\"
p.exec("\\nxx"); // ["\\n"]
var p = new RegExp("\\\\n"); // /\\n/; to obtain the literal /\\n/, each backslash is doubled again in the string
p.exec('\\nxx'); // ["\\n"]; note that the backslash in the matched string '\\nxx' is escaped as well
new RegExp('\\n').exec("\n"); // ["\n"]; new RegExp('\\n') returns /\n/, which matches a line break
new RegExp('\n').exec("\n"); // ["\n"]; the string is not escaped here, so the pattern contains a literal line break character, which also matches a line break

3. For reference when comparing single- and double-escaped patterns: a single escape is the escape written in the literal form, and a double escape adds one more escape on top of it for the string passed to the constructor.
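As an illustration of single versus double escaping, the sketch below pairs a few literal patterns with the strings that produce the same pattern through the constructor. The pairs are examples chosen for this sketch, not a reproduction of the original reference table.

```javascript
// Each entry: [literal form, equivalent constructor string].
const pairs = [
  [/\[bc\]at/, '\\[bc\\]at'],
  [/\.at/, '\\.at'],
  [/name\/age/, 'name\\/age'],
  [/\d.\d{1,2}/, '\\d.\\d{1,2}'],
  [/\w\\hello\\123/, '\\w\\\\hello\\\\123'],
];

// The constructor string yields the same pattern text as the literal.
const allEqual = pairs.every(
  ([literal, str]) => new RegExp(str).source === literal.source
);
// allEqual -> true
```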

RegExp instance attributes:

Various information about the pattern can be obtained through instance properties.

global: Boolean value indicating whether the g flag is set.
ignoreCase: Boolean value indicating whether the i flag is set.
multiline: Boolean value indicating whether the m flag is set.
lastIndex: an integer indicating the character position where the search for the next match starts, counting from 0. It is useful only when the g flag is set.
source: the string representation of the regular expression. It is returned in literal form rather than in the form of the string passed to the constructor.

new RegExp('\\\\w'); // /\\w/, shown in literal form
new RegExp('\\\\w').source; // a string holding the characters \\w, the pattern as written between the slashes of the literal
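The instance properties can be inspected directly. The following sketch uses an assumed pattern and input string to show how lastIndex moves while a global pattern is applied repeatedly.

```javascript
const re = /\[bc\]at/gi;

const flags = [re.global, re.ignoreCase, re.multiline];
// flags -> [true, true, false]

const sourceText = re.source;
// sourceText holds the characters \[bc\]at

const input = 'x[bc]at [BC]AT';
const positions = [re.lastIndex];   // starts at 0
re.exec(input);
positions.push(re.lastIndex);       // 7, just after "[bc]at"
re.exec(input);
positions.push(re.lastIndex);       // 14, just after "[BC]AT"
re.exec(input);                     // no more matches: returns null
positions.push(re.lastIndex);       // reset to 0
// positions -> [0, 7, 14, 0]
```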

RegExp instance methods:

exec(): This method is designed for capture groups. Its parameter is the string to match, and it returns an array containing the first match and any capture groups, or null if there is no match. (The result is an Array instance with two additional properties: index is the position of the match in the string, and input is the string the regular expression was applied to.)

var arr = new RegExp('\\\\(w)').exec('\\w'); // ["\\w", "w"]
arr; // ["\\w", "w"]
arr.index; // 0
arr.input; // "\\w", the string passed to exec()

The difference between exec () and match () methods:

1. exec() returns only one match at a time, even if the global flag is set in the pattern. With the g flag set, the string method match() returns all matches at once, without capture groups, and the returned array has no index or input properties.

2. exec() always returns capture groups, but match() returns capture groups only when the global g flag is not set. In that case, the array returned by match() has the index and input properties.

// Global match comparison
var arr = 'ababcdab'.match(/ab/g); // ["ab", "ab", "ab"]
arr.index; // undefined
arr.input; // undefined
/ab/g.exec('ababcdab'); // ["ab"]
// Capture group comparison; the result of match() depends on the g flag
'ababcdab'.match(/a(b)/g); // ["ab", "ab", "ab"]
var arr = 'ababcdab'.match(/a(b)/); // ["ab", "b"]
arr.index; // 0
arr.input; // "ababcdab"
/a(b)/g.exec('ababcdab'); // ["ab", "b"]

3. When choosing a method, first consider which of its features you need. If the global flag is not set, calling exec() multiple times on the same string always returns the first match. If the global flag is set, each call to exec() searches for a new match starting after the position where the previous search ended.

// Without the global flag
var p = /a/;
var str = 'ababa';
var a = p.exec(str); // ["a"]
var b = p.exec(str); // ["a"]
a == b; // false (two separate array objects)
a.index == b.index; // true
// With the global flag, the search for a new match continues after the previous position
var p = /a/g;
var str = 'ababa';
var a = p.exec(str); // ["a"]
a.index; // 0
var b = p.exec(str); // ["a"]
b.index; // 2
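Because exec() with the g flag resumes after the previous match, it can be called in a loop to collect every match together with its capture groups, which match() with g does not provide. A minimal sketch, with an assumed pattern and input:

```javascript
const pattern = /(\w)at/g;
const source = 'cat mat bat';
const found = [];

let m;
// exec() returns null once no further match exists, which ends the loop.
while ((m = pattern.exec(source)) !== null) {
  found.push({ match: m[0], group: m[1], index: m.index });
}
// found -> [
//   { match: "cat", group: "c", index: 0 },
//   { match: "mat", group: "m", index: 4 },
//   { match: "bat", group: "b", index: 8 }
// ]
```

The g flag is essential here: without it, exec() would return the first match on every call and the loop would never end.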

test(): takes a string argument. It returns true if the pattern matches the string and false otherwise. It is often used as the condition of an if statement.

var text = "000-000-000";
var p = /((\d{3})-)\1*\2/;
if (p.test(text)) { console.log('matched successfully'); }
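One detail worth keeping in mind when test() is combined with the g flag: like exec(), it advances lastIndex, so repeated calls on the same pattern object do not always give the same answer. The sketch below uses assumed variable names to show the effect.

```javascript
const globalRe = /a/g;
const plainRe = /a/;

// With g, the second call starts searching at lastIndex (1) and fails.
const withGlobal = [globalRe.test('a'), globalRe.test('a'), globalRe.test('a')];
// withGlobal -> [true, false, true]

// Without g, every call starts from the beginning of the string.
const withoutGlobal = [plainRe.test('a'), plainRe.test('a'), plainRe.test('a')];
// withoutGlobal -> [true, true, true]
```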

The toLocaleString() and toString() methods that RegExp instances inherit both return the literal form of the regular expression as a string, regardless of how the regular expression was created. valueOf() returns the regular expression itself.

var p = /\[new\]bi/;
p.toLocaleString(); // "/\[new\]bi/"
p.toString(); // "/\[new\]bi/"
p.valueOf(); // /\[new\]bi/
var p = new RegExp('\\[new\\]bi');
p.toLocaleString(); // "/\[new\]bi/"
p.toString(); // "/\[new\]bi/"
p.valueOf(); // /\[new\]bi/

RegExp constructor attributes:

The constructor itself has some properties (static properties) that apply to all regular expressions in scope and change based on the most recently executed regular expression operation. Each has a long property name (used in the code below) and a short property name (the $-prefixed form). Because most of the short names are not valid ECMAScript identifiers, they cannot be accessed with dot notation on the RegExp constructor; square bracket syntax must be used instead.

/(.)hort/g.exec('this is a short day'); // ["short", "s"]
// The string most recently matched against
RegExp.input; // "this is a short day", or access it as RegExp["$_"]
// The most recent match
RegExp.lastMatch; // "short", or access it as RegExp["$&"]
// The text before the most recent match in the input string
RegExp.leftContext; // "this is a ", or access it as RegExp["$`"]
// The text after the most recent match in the input string
RegExp.rightContext; // " day", or access it as RegExp["$'"]
// The most recently matched capture group
RegExp.lastParen; // "s", or access it as RegExp["$+"]

Capture group properties: there are also nine constructor properties that store capture groups. The access syntax is RegExp.$n, where n is 1 to 9, and each returns the nth matched capture group. These properties are filled in automatically when a regular expression method such as exec(), test(), or the string method match() is called.

var text = "this is a short summer";
var pattern = /(..)or(.)/g;
if (pattern.test(text)) {
  console.log(RegExp.$1); // sh
  console.log(RegExp.$2); // t
}
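The same capture groups are also available on the array returned by exec(), which avoids relying on state shared by every regular expression in the program. The comparison below is a sketch; it assumes an engine that still exposes the static RegExp.$n properties, as current browsers and Node.js do.

```javascript
const sentence = 'this is a short summer';
const groups = /(..)or(.)/.exec(sentence);

// The result array carries the groups of this particular match.
const fromArray = [groups[1], groups[2]];
// fromArray -> ["sh", "t"]

// The static properties reflect the most recent match made anywhere.
const fromStatic = [RegExp.$1, RegExp.$2];
// fromStatic -> ["sh", "t"]
```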

Limitations of the pattern:

ECMAScript 5 regular expressions lack some advanced features, such as lookbehind and named capture groups (a named group is referenced inside the pattern with \k<name>). Both features were added to the language in ECMAScript 2018.
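Engines that implement ECMAScript 2018 accept both lookbehind and named capture groups. The sketch below assumes such an engine (for example, a current version of Node.js); the sample strings and variable names are illustrative.

```javascript
// Named capture groups are exposed on the groups property of the result.
const date = /(?<year>\d{4})-(?<month>\d{2})/.exec('2018-06');
// date.groups.year -> "2018", date.groups.month -> "06"

// \k<name> is the named counterpart of a numbered backreference.
const repeated = /(?<ch>\w)\k<ch>/.exec('hello');
// repeated[0] -> "ll"

// Lookbehind: digits only when they are preceded by a dollar sign.
const price = /(?<=\$)\d+/.exec('cost: $42');
// price[0] -> "42"
```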
