JavaScript Regular Expression Supplement

Source: Internet
Author: User

Definition

A JavaScript regular expression can be defined in two ways. Both examples below define an expression that matches strings of the form <% XXX %>.

1. Constructor

var reg=new RegExp('<%[^%>]+%>','g');

2. Literal

var reg=/<%[^%>]+%>/g;
  • g: global. Search the full text. By default the search stops at the first match.
  • i: ignore case. Matching is case sensitive by default.
  • m: multiline. ^ and $ also match at the start and end of each line.

    Metacharacters

    One major reason regular expressions put people off is the large number of escape sequences and the many ways they combine. At their core, though, are the metacharacters: characters that have a special meaning inside a regular expression and that can specify how the characters before them are matched.

    Metacharacters: ( [ { \ ^ $ | ) ? * + .

    A metacharacter does not have one fixed meaning. Its meaning depends on the combination it appears in. Let's go through them by category.
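Because these characters are special, matching one of them literally needs a backslash escape, and the constructor form needs the backslash doubled because the pattern passes through a string first. The following is a minimal sketch of that idea. The helper name escapeRegExp is hypothetical and is not defined by the article or by the language.

```javascript
// Hypothetical helper: escape every metacharacter so the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// In a literal, one backslash escapes the dot.
const literalDot = /1\.5/;
// In a constructor string, the backslash itself must be escaped.
const builtDot = new RegExp('1\\.5');
// Unescaped, the dot is a metacharacter and matches any character.
const looseDot = /1.5/;

// Build a pattern from arbitrary text that should be matched as-is.
const fromText = new RegExp(escapeRegExp('1.5 (approx)'));

console.log(literalDot.test('1x5'));                // false
console.log(builtDot.test('1.5'));                  // true
console.log(looseDot.test('1x5'));                  // true
console.log(fromText.test('about 1.5 (approx)'));   // true
```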

    Predefined special characters
    Character   Description
    \t          Horizontal tab
    \r          Carriage return
    \n          Line feed
    \f          Form feed
    \cX         Control character corresponding to X (Ctrl + X)
    \v          Vertical tab
    \0          NULL character

    Character class

    Normally, each character in a regular expression (an escape sequence counts as one character) corresponds to one character in the string. The expression ab, for example, means the character a followed by the character b.

    We can, however, use the metacharacters [ ] to build a simple class. A class here means anything that has a certain feature: a general category rather than one specific character. The expression [abc] puts the characters a, b, and c into one class, and it matches any one of them.

    Besides creating a class with [ ], we can use the metacharacter ^ to create a negated (inverse) class, which matches anything that does not belong to the class. The expression [^abc] matches a character that is not a, b, or c.
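As a small illustration of a class and its negation, the sketch below tests a few strings. The variable names are only for this example.

```javascript
// Matches if the string contains at least one of a, b, or c.
const inClass = /[abc]/;
// Matches if the string contains at least one character that is NOT a, b, or c.
const notInClass = /[^abc]/;

console.log(inClass.test('b'));       // true
console.log(inClass.test('xyz'));     // false
console.log(notInClass.test('abc'));  // false: every character is a, b, or c
console.log(notInClass.test('abd'));  // true: d is outside the class
```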

    Range

    By the description above, matching a single digit takes an expression like this:

    [0123456789]

    For letters this gets tedious, so regular expressions also provide range classes. We can write x-y to join two characters and represent any character from x to y. The interval is closed, meaning it includes x and y themselves. Matching a lowercase letter is then easy:

    [a-z]

    What about matching all letters? Ranges inside [ ] can be chained, so we can write [a-zA-Z].
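A range follows character codes, so the case of the endpoints matters. The sketch below compares [a-z], [a-zA-Z], and the easy-to-mistype [A-z], which also covers the punctuation that sits between Z and a in ASCII (including the underscore).

```javascript
const lowerOnly = /[a-z]/;
const anyLetter = /[a-zA-Z]/;
// Common slip: A-z spans codes 65 to 122, which includes [ \ ] ^ _ and `
const tooWide = /[A-z]/;

console.log(lowerOnly.test('q'));   // true
console.log(lowerOnly.test('Q'));   // false
console.log(anyLetter.test('Q'));   // true
console.log(anyLetter.test('_'));   // false
console.log(tooWide.test('_'));     // true: the underscore falls inside A-z
```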

    Predefined classes

    We have just built classes for digits, letters, and so on, but writing them out is tedious. Regular expressions provide several predefined classes for common characters.

    Character   Equivalent class    Description
    .           [^\r\n]             Any character except carriage return and line feed
    \d          [0-9]               Digit
    \D          [^0-9]              Non-digit
    \s          [ \t\n\x0B\f\r]     Whitespace character
    \S          [^ \t\n\x0B\f\r]    Non-whitespace character
    \w          [a-zA-Z_0-9]        Word character (letter, digit, or underscore)
    \W          [^a-zA-Z_0-9]       Non-word character

    The equivalent classes shown are the ASCII versions. In JavaScript, . also excludes the line separators \u2028 and \u2029, and \s also covers other Unicode whitespace.

    With these predefined classes, some regular expressions become very convenient to write. For example, to match the string ab followed by a digit followed by any character, we can write ab\d.
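The expression ab\d. can be tried out as follows. This is a small sketch, and the sample strings are made up for illustration.

```javascript
// ab, then one digit, then any one character except a line break
const abDigitAny = /ab\d./;

console.log(abDigitAny.test('ab1x'));    // true
console.log(abDigitAny.test('abcx'));    // false: c is not a digit
console.log(abDigitAny.test('ab1'));     // false: nothing follows the digit
console.log(abDigitAny.test('ab1\n'));   // false: . does not match a line feed

// \s finds whitespace; \W finds anything that is not a letter, digit, or underscore
console.log(/\s/.test('a b'));           // true
console.log(/\W/.test('abc_123'));       // false
```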

    Boundary

    Regular Expressions also provide several common boundary matching characters.

    Character   Description
    ^           Starts with xx
    $           Ends with xx
    \b          Word boundary: the position between a word character [a-zA-Z_0-9] and a non-word character
    \B          Non-word boundary

    Here is a deliberately sloppy email regular expression (do not imitate it; the parentheses are explained later under groups): \w+@\w+\.(com)$
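The boundary characters can be checked with a few short tests. The sketch below also shows one reason the email expression is sloppy: without ^ it accepts junk in front of the address. The sample addresses are invented for the example.

```javascript
console.log(/^Hi/.test('Hi Byron'));            // true: the string starts with Hi
console.log(/^Byron/.test('Hi Byron'));         // false
console.log(/Byron$/.test('Hi Byron'));         // true: the string ends with Byron

console.log(/\bcat\b/.test('a cat sat'));       // true: cat stands alone as a word
console.log(/\bcat\b/.test('concatenate'));     // false: cat is inside a word
console.log(/\Bcat\B/.test('concatenate'));     // true: no boundary on either side

const sloppyEmail = /\w+@\w+\.(com)$/;
console.log(sloppyEmail.test('byron@example.com'));     // true
console.log(sloppyEmail.test('byron@example.org'));     // false
console.log(sloppyEmail.test('!!!byron@example.com'));  // true: nothing anchors the start
```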

    Quantifiers

    Everything introduced so far matches one character at a time. To match a string of 20 consecutive digits, do we have to write this?

    \d\d\d\d...

    To avoid that, regular expressions provide quantifiers.

    Character   Description
    ?           Zero or one time (at most once)
    +           One or more times (at least once)
    *           Zero or more times (any number of times)
    {n}         Exactly n times
    {n,m}       n to m times
    {n,}        At least n times
    {0,m}       At most m times (JavaScript does not accept {,m} as a quantifier)

    Let's look at several examples of using quantifiers.

    \w+\b Byron matches a word, a word boundary, a space, and then Byron.

    (/\w+\b Byron/).test('Hi Byron'); //true
    (/\w+\b Byron/).test('Welcome Byron'); //true
    (/\w+\b Byron/).test('HiByron'); //false

    \d+\.\d{3} matches a number with three decimal places.
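A few more quantifier checks, written as a sketch with made-up sample strings. The last two lines show that JavaScript (without the u flag) treats {,2} as literal text rather than as an "at most 2" quantifier, so {0,2} is the form to use.

```javascript
// Exactly 20 digits, instead of writing \d twenty times
const twentyDigits = /^\d{20}$/;
console.log(twentyDigits.test('12345678901234567890'));  // true
console.log(twentyDigits.test('123'));                   // false

// ? makes the preceding character optional
console.log(/^colou?r$/.test('color'));    // true
console.log(/^colou?r$/.test('colour'));   // true
console.log(/^colou?r$/.test('colouur'));  // false

// One or more digits, a literal dot, then exactly three digits
console.log(/^\d+\.\d{3}$/.test('3.142')); // true
console.log(/^\d+\.\d{3}$/.test('3.14'));  // false

// {,2} is not a quantifier here: the braces are matched literally
console.log(/^a{,2}$/.test('aa'));         // false
console.log(/^a{,2}$/.test('a{,2}'));      // true
console.log(/^a{0,2}$/.test('aa'));        // true
```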

    Greedy mode and non-Greedy Mode

    After reading about quantifiers, thoughtful readers may wonder how matching actually proceeds. Suppose we use the quantifier {3,5} and the text contains ten consecutive matching characters: does each match take three or five? After all, 3, 4, and 5 all satisfy the 3-to-5 condition. By default a quantifier matches as many characters as possible. This is the so-called greedy mode.

    '123456789'.match(/\d{3,5}/g); //['12345', '6789']

    Where there is a greedy mode there is also a non-greedy (lazy) mode, in which the regular expression matches as few characters as possible and stops trying as soon as a match succeeds. Enabling it is simple: add ? after the quantifier.

    '123456789'.match(/\d{3,5}?/g); //['123', '456', '789']
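The difference matters most when a pattern has a closing delimiter. The sketch below uses a made-up string containing two <% %> tags: the greedy version runs to the last %>, while the lazy version stops at the first one.

```javascript
const tagged = '<%a%> and <%b%>';

// Greedy: .+ takes as much as it can, so one match spans both tags.
const greedyTags = tagged.match(/<%.+%>/g);
// Lazy: .+? takes as little as it can, so each tag is matched separately.
const lazyTags = tagged.match(/<%.+?%>/g);

console.log(greedyTags);  // [ '<%a%> and <%b%>' ]
console.log(lazyTags);    // [ '<%a%>', '<%b%>' ]
```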
    Group

    Sometimes we want a quantifier to apply to several characters rather than to a single character as in the examples above. Say we want to match the string Byron repeated 20 times. If we write Byron{20}, it matches Byro followed by n repeated 20 times. How do we treat Byron as a whole? Use ( ) to achieve this, which is called grouping.

    (Byron){20}

    What if we want to match Byron or Casper 20 times? Use the character | to express "or".

    (Byron|Casper){20}

    A group does more than carry a quantifier: the regular expression also captures whatever the group matched. By default each group is assigned a number (#1, #2, and so on), and the captured content is retrieved by that number. This is useful in functions that need to operate on specific parts of a match.

    (Byron).(ok)

    If groups are nested, the outer group is numbered first.

    ((^|%>)[^\t]*)

    Sometimes we do not want a group to be captured. Adding ?: at the start of the group does this. The group's content is still part of the regular expression; it just is not assigned a number.

    (?:Byron).(ok)

    Named groups, familiar from C# and other languages, are also available: JavaScript has supported the (?<name>...) syntax since ES2018.
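The grouping rules can be seen by inspecting the array that exec() returns. This is a sketch with invented sample strings; the date pattern is only there to give the nested groups something to capture.

```javascript
// Without a group, the quantifier applies only to the last character.
const ungrouped = /^Byron{2}$/;
const grouped = /^(Byron){2}$/;

// Groups are numbered by the position of their opening parenthesis,
// so the outer group is #1 and the groups nested inside it follow.
const dateMatch = /((\d{4})-(\d{2}))-(\d{2})/.exec('2024-03-15');

// (?: ) groups without capturing, so (ok) becomes group #1.
const nonCapturing = /(?:Byron|Casper) (ok)/.exec('Casper ok');

// Named group syntax (ES2018 and later).
const named = /(?<year>\d{4})-(?<month>\d{2})/.exec('2024-03');

console.log(ungrouped.test('Byronn'));      // true
console.log(grouped.test('ByronByron'));    // true
console.log(dateMatch.slice(1));            // [ '2024-03', '2024', '03', '15' ]
console.log(nonCapturing[1]);               // ok
console.log(named.groups.year, named.groups.month);  // 2024 03
```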

    Lookahead
    Expression      Description
    exp1(?=exp2)    Matches exp1 only when it is followed by exp2
    exp1(?!exp2)    Matches exp1 only when it is not followed by exp2

    Let's look at an example: good(?=Byron)

    (/good(?=Byron)/).exec('goodByron123'); //['good']
    (/good(?=Byron)/).exec('goodCasper123'); //null
    (/bad(?=Byron)/).exec('goodCasper123'); //null

    The examples show that exp1(?=exp2) matches exp1, but only when the content that follows it is exp2; both conditions must hold. exp1(?!exp2) works the same way, except that the content that follows must not match exp2.

    good(?!Byron)

    (/good(?!Byron)/).exec('goodByron123'); //null
    (/good(?!Byron)/).exec('goodCasper123'); //['good']
    (/bad(?!Byron)/).exec('goodCasper123'); //null
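A lookahead checks what follows without making it part of the match. The sketch below shows this first with a plain replacement and then with a common use, inserting thousands separators. The function name addThousandsSeparators is hypothetical and handles only plain digit strings.

```javascript
// Only "good" is replaced; "Byron" is checked but not consumed.
const replaced = 'goodByron'.replace(/good(?=Byron)/, 'bad');
console.log(replaced);  // badByron

// Hypothetical helper: put a comma at every position inside the number
// that is followed by a multiple of three digits up to the end.
function addThousandsSeparators(digits) {
  return digits.replace(/\B(?=(\d{3})+$)/g, ',');
}

console.log(addThousandsSeparators('1234567'));  // 1,234,567
console.log(addThousandsSeparators('1000'));     // 1,000
console.log(addThousandsSeparators('123'));      // 123
```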

    With this basic knowledge, we can look at how regular expressions are used in JavaScript. Before starting, look at the properties of RegExp instances.

    RegExp instance objects have five properties

    1. global: whether to perform a global search. The default value is false.
    2. ignoreCase: whether to ignore case. The default value is false.
    3. multiline: whether to perform a multiline search. The default value is false.
    4. lastIndex: the position at which the next match starts, that is, the position just after the last character of the previous match. It is updated on each successful match when the g (or y) flag is set.
    5. source: the text of the regular expression.
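These properties can be read directly from an instance. The sketch below is only an illustration; the sample string is invented.

```javascript
const tagPattern = /<%[^%>]+%>/gi;

console.log(tagPattern.global);      // true
console.log(tagPattern.ignoreCase);  // true
console.log(tagPattern.multiline);   // false
console.log(tagPattern.source);      // <%[^%>]+%>
console.log(tagPattern.lastIndex);   // 0

// With the g flag, a successful match moves lastIndex past the match.
tagPattern.test('a <%x%> b');
const indexAfterMatch = tagPattern.lastIndex;
console.log(indexAfterMatch);        // 7
```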

      Besides compile(), which compiles a regular expression into an internal format for faster execution (it is now deprecated), RegExp objects have two commonly used methods.

      regObj.test(strObj)

      The test() method checks whether the string parameter contains a match for the regular expression pattern. It returns true if it does and false otherwise.

       
      var reg=/\d+\.\d{1,2}$/; //no g flag: with g, lastIndex would carry over between test() calls
      reg.test('123.45'); //true
      reg.test('0.2'); //true
      reg.test('a.34'); //false
      reg.test('34.5678'); //false
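One pitfall is worth a sketch: when a regular expression has the g flag, test() starts at lastIndex and updates it, so reusing the same object on a different string can give a surprising false. The variable names are only for this example.

```javascript
const withG = /\d+\.\d{1,2}$/g;

const firstCall = withG.test('123.45');   // true, and lastIndex becomes 6
const indexAfterFirst = withG.lastIndex;
const secondCall = withG.test('0.2');     // false: the search started at index 6
const indexAfterSecond = withG.lastIndex; // a failed match resets lastIndex to 0

// Without g, every call starts from the beginning of the string.
const withoutG = /\d+\.\d{1,2}$/;

console.log(firstCall, indexAfterFirst);    // true 6
console.log(secondCall, indexAfterSecond);  // false 0
console.log(withoutG.test('123.45'), withoutG.test('0.2'));  // true true
```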
       
      regObj.exec(strObj)

      The exec() method runs a search for the regular expression pattern in the string. If it finds matching text it returns a result array; otherwise it returns null. Besides the array elements and the length property, the returned array has two extra properties: index is the position of the first character of the matched text, and input holds the string that was searched.

      When exec() is called on a non-global RegExp object, element 0 of the returned array is the text that matched the whole expression, element 1 is the text that matched the first subexpression (if any), element 2 is the text that matched the second subexpression (if any), and so on.

      When exec() is called on a global RegExp object, it starts searching the string at the position given by the instance's lastIndex property. When it finds matching text, it sets lastIndex to the position just after the last character of the match. You can therefore call exec() repeatedly to walk through all matches in a string. When exec() can find no more matches, it returns null and resets lastIndex to 0.

      var reg=/\d/g;
      var r=reg.exec('a1b2c3');
      console.log(reg.lastIndex); //2
      r=reg.exec('a1b2c3');
      console.log(reg.lastIndex); //4

      The two values of r are ['1'] (index 1) and ['2'] (index 3).

      var reg=/\d/g, r;
      while(r=reg.exec('a1b2c3')){
          console.log(r.index+':'+r[0]);
      }
      You can see the result:
      1:1
      3:2
      5:3
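The loop above can be wrapped into a reusable function. This is a sketch under a few assumptions: findAll is a hypothetical name, it copies the pattern so the caller's lastIndex is not disturbed, and it steps forward manually on empty matches to avoid looping forever.

```javascript
// Hypothetical helper: collect every match with its index.
function findAll(pattern, text) {
  // Copy the pattern and make sure the copy has the g flag.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);
  const results = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    results.push({ index: m.index, text: m[0] });
    // An empty match does not advance lastIndex, so advance it by hand.
    if (m[0] === '') re.lastIndex++;
  }
  return results;
}

const digitsFound = findAll(/\d/, 'a1b2c3');
console.log(digitsFound);
// [ { index: 1, text: '1' }, { index: 3, text: '2' }, { index: 5, text: '3' } ]
```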

      Besides these two methods, some string methods accept a RegExp object as a parameter for more complex operations.

      strObj.search(regObj)

      The search() method finds a specified substring, or a substring that matches a regular expression, in a string. search() does not perform a global match: it ignores the g flag. It also ignores the lastIndex property of regObj and always searches from the start of the string, which means it always returns the position of the first match in strObj (or -1 if there is no match).

      'a1b2c3'.search(/\d/g); //1
      'a1b2c3'.search(/\d/); //1
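A short sketch confirms that search() pays no attention to lastIndex, even when it has been set by hand, and that it reports a missing match as -1. The variable names are only for this example.

```javascript
const digit = /\d/g;
digit.lastIndex = 3;  // would make exec() or test() start at index 3

const position = 'a1b2c3'.search(digit);
const lastIndexAfterSearch = digit.lastIndex;
const missing = 'abc'.search(/\d/);

console.log(position);              // 1: the search started from index 0 anyway
console.log(lastIndexAfterSearch);  // 3: search() leaves lastIndex as it found it
console.log(missing);               // -1
```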
      strObj.match(regObj)

      The match() method searches strObj for one or more pieces of text that match regObj. Whether regObj has the g flag makes a big difference to the result.

      If regObj does not have the g flag, match() performs only one match in strObj. If no match is found it returns null. Otherwise it returns an array with information about the match: element 0 holds the matched text, and the remaining elements hold the text matched by each subexpression. Besides these array elements, the returned array has two properties: index is the position of the first character of the matched text in strObj, and input is a reference to strObj.

      var r='aaa123456'.match(/\d/); //['1'], r.index is 3, r.input is 'aaa123456'

      If regObj has the g flag, match() performs a global search and finds all matching substrings in strObj. If none is found it returns null. If one or more are found it returns an array, but the content of that array is very different from the non-global case: its elements are all the matched substrings in strObj, and it has no index or input property.

      var r='aaa123456'.match(/\d/g); //['1', '2', '3', '4', '5', '6']
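The two result shapes can be compared side by side. This sketch also adds a pattern with groups (the id=42 string is made up) to show where subexpression text lands in the non-global case.

```javascript
const single = 'aaa123456'.match(/\d/);
console.log(single[0], single.index, single.input);  // 1 3 aaa123456

const every = 'aaa123456'.match(/\d/g);
console.log(every);        // [ '1', '2', '3', '4', '5', '6' ]
console.log(every.index);  // undefined: global results carry no index

// Non-global with groups: element 0 is the whole match, then one element per group.
const pair = 'id=42'.match(/(\w+)=(\d+)/);
console.log(pair[0], pair[1], pair[2]);  // id=42 id 42

// No match returns null in both modes.
const none = 'aaa'.match(/\d/g);
console.log(none);  // null
```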

      strObj.replace(regObj, replaceStr)

      With the replace method of string objects, we usually pass in two strings. That form has a limitation: it replaces only the first occurrence.

      'abcabcabc'.replace('bc','X'); //aXabcabc

      The first parameter of the replace method can also be passed into the RegExp object. When a regular expression is passed in, the replace method is more powerful and flexible.

      'abcabcabc'.replace(/bc/g,'X'); //aXaXaX
      'abcaBcabC'.replace(/bc/gi,'X'); //aXaXaX

      If the first parameter of replace is a regular expression with groups, we can use $1 ... $9 in the second parameter to refer to the content of the corresponding group. For example, to replace every <%x%> in the string 1<%2%>34<%567%>89 with @#x#@, we can do this:

      '1<%2%>34<%567%>89'.replace(/<%(\d+)%>/g,'@#$1#@'); //1@#2#@34@#567#@89

      Of course there are many ways to do this; here we are only demonstrating the use of group content. In the second parameter, @#$1#@, $1 stands for the content captured by the first group. This technique is common in JavaScript template functions that substitute strings.
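Group references also make it easy to reorder parts of a match. The sketch below uses invented sample strings. It also shows two other replacement patterns, $& for the whole match and $$ for a literal dollar sign.

```javascript
// Swap two words: $2 is the second group, $1 the first.
const swapped = 'John Smith'.replace(/(\w+)\s(\w+)/, '$2, $1');
console.log(swapped);     // Smith, John

// Reorder the parts of a date.
const reordered = '2024-03-15'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$3/$2/$1');
console.log(reordered);   // 15/03/2024

// $$ inserts one literal $, and $& inserts the matched text.
const priced = 'price: 5'.replace(/\d+/, '$$$&');
console.log(priced);      // price: $5
```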

      strObj.replace(regObj, function(){})

      Passing a function as the second parameter makes replace even more powerful. So far we could only replace every match with fixed content. What if we want to wrap every number in a string in parentheses?

      '2398rufdjg9w45hgiuerhg83ghvif'.replace(/\d+/g,function(r){
          return '('+r+')';
      }); //(2398)rufdjg(9)w(45)hgiuerhg(83)ghvif

      When the second parameter of replace is a function, that function is called for every match, and its return value is used as the replacement. Above we used only the callback's first parameter, the matched text. In fact, the callback receives four kinds of parameters.

      1. The first parameter is the matched string.
      2. The second parameter is the content of the regular expression's group (one parameter per group). If there is no group, this parameter is not provided.
      3. The third parameter is the index of the match in the string.
      4. The fourth parameter is the original string.
        '2398rufdjg9w45hgiuerhg83ghvif'.replace(/\d+/g,function(a,b,c){
            console.log(a+'\t'+b+'\t'+c);
            return '('+a+')';
        });
        //Output:
        //2398    0    2398rufdjg9w45hgiuerhg83ghvif
        //9    10    2398rufdjg9w45hgiuerhg83ghvif
        //45    12    2398rufdjg9w45hgiuerhg83ghvif
        //83    22    2398rufdjg9w45hgiuerhg83ghvif
         

        That is the case with no groups: the callback prints the matched text, the index of the match, and the original string. Now an example with a group: to strip the <% %> wrapper from a string, so that <%1%><%2%><%3%> becomes 123:

        '<%1%><%2%><%3%>'.replace(/<%([^%>]+)%>/g,function(a,b,c,d){
            console.log(a+'\t'+b+'\t'+c+'\t'+d);
            return b;
        }); //123
        //Output:
        //<%1%>    1    0    <%1%><%2%><%3%>
        //<%2%>    2    5    <%1%><%2%><%3%>
        //<%3%>    3    10    <%1%><%2%><%3%>
         

        According to this replace parameter, many powerful functions can be implemented, especially in complicated string replacement statements.
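As one example of what the callback form allows, here is a minimal sketch of a template function in the spirit of the <% %> examples above. The function name render and the choice to leave unknown keys untouched are assumptions made for this illustration, not part of any library.

```javascript
// Hypothetical helper: replace each <% key %> with data[key].
function render(template, data) {
  return template.replace(/<%\s*([^%>\s]+)\s*%>/g, function (match, key) {
    // Leave the tag as it is when the data has no such key.
    return Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match;
  });
}

const greeting = render('Hello <% name %>, you are <%age%>.', { name: 'Byron', age: 30 });
console.log(greeting);                   // Hello Byron, you are 30.
console.log(render('<%missing%>', {}));  // <%missing%>
```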

        strObj.split(regObj)

        We often use the split method to split a string into an array of substrings.

        'a,b,c,d'.split(','); //['a', 'b', 'c', 'd']

        As with replace, we can pass a regular expression to handle more complicated splitting.

        'a1b2c3d'.split(/\d/); //['a', 'b', 'c', 'd']
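A regular expression separator is handy when the delimiters are inconsistent. The sketch below uses an invented string with mixed commas, semicolons, and spaces, and also shows that a capturing group in the separator keeps the separators in the result.

```javascript
// Split on a comma or semicolon, swallowing any whitespace around it.
const fields = 'a , b,c ;d'.split(/\s*[,;]\s*/);
console.log(fields);      // [ 'a', 'b', 'c', 'd' ]

// With a capturing group, the matched separators are included in the output.
const withDigits = 'a1b2c'.split(/(\d)/);
console.log(withDigits);  // [ 'a', '1', 'b', '2', 'c' ]
```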

        In this way a string can be split on digits. Powerful, isn't it? After reading these two posts you should be able to use JavaScript regular expressions with ease. As an exercise: on the front end, replace the first letter of an English paragraph in a div with an uppercase letter. Do you know how?
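One possible approach to the exercise, reading it as "capitalize the first letter of every word", is sketched below. The function name capitalizeWords is hypothetical, and the sketch works on a plain string; in a page it would be applied to the div's text content. Note the limitation that \b also sees a boundary after an apostrophe.

```javascript
// Hypothetical helper: uppercase each lowercase letter that starts a word.
function capitalizeWords(text) {
  return text.replace(/\b[a-z]/g, function (letter) {
    return letter.toUpperCase();
  });
}

console.log(capitalizeWords('hello big world'));  // Hello Big World
console.log(capitalizeWords("don't"));            // Don'T (the apostrophe counts as a boundary)
```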


         

         
