Regular Expressions (JavaScript)

Source: Internet
Author: User
Tags alphabetic character character classes

    • Regular Expressions (JavaScript)
      • 1. Understanding Regular Expressions
        • 1.1. What is a regular expression
        • 1.2. Common regular expression matching tools
      • 2. Regular expression syntax
        • 2.1. Creating a regular expression
        • 2.2. Metacharacters
        • 2.3. Character classes, range classes, and boundaries
        • 2.4. Quantifiers
      • 3. Regular expressions step by step
        • 3.1. Grouping and OR
        • 3.2. Lookahead
      • 4. Regular expressions in JavaScript
        • 4.1. Object properties
        • 4.2. The test and exec methods
        • 4.3. String object methods
      • 5. Examples of regular expressions
1. Understanding Regular Expressions
1.1. What is a regular expression

A regular expression describes a pattern for matching strings. It can be used to check whether a string contains a certain substring, to replace a matched substring, or to extract from a string the substrings that meet a given condition.

    • When listing a directory, the *.txt in dir *.txt or ls *.txt is not a regular expression, because the * there does not mean what * means in a regular expression.
    • A regular expression is constructed the same way as a mathematical expression: metacharacters and operators combine small expressions into larger ones. A component of a regular expression can be a single character, a character set, a range of characters, a choice between characters, or any combination of these.

A regular expression is a text pattern made up of ordinary characters (such as the letters a through z) and special characters called metacharacters. The pattern describes one or more strings to match when searching text; it acts as a template that is matched against the string being searched.

1.2. Common regular expression matching tools
    • Online matching tools:
      1. www.regexper.com
      2. www.regexpal.com
      3. www.rubular.com
    • Desktop matching software:
      McTracer
2. Regular expression syntax
2.1. Creating a regular expression

There are two ways to create a regular expression:

    1. Construct a regular expression object with the RegExp constructor.
      var reg = new RegExp("abc");
    2. Enclose the pattern in slash (/) characters to write a regular expression literal.
      var reg = /abc/;
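
As a rough illustration of the two forms, the sketch below builds the same pattern both ways. The sample string and variable names are made up for this example. Note that a pattern passed to the constructor as a string needs its backslashes doubled.

```javascript
// A literal and the RegExp constructor producing the same pattern.
const literal = /\d+/g;
// Inside a string the backslash must itself be escaped, hence "\\d+".
const constructed = new RegExp("\\d+", "g");

const sample = "room 101, floor 7"; // made-up sample text

console.log(literal.source === constructed.source); // true
console.log(sample.match(literal));     // [ '101', '7' ]
console.log(sample.match(constructed)); // [ '101', '7' ]
```

The constructor form is mainly useful when the pattern is only known at run time; a literal is usually easier to read when the pattern is fixed.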
2.2. Metacharacters

Regular expressions consist of two basic character types:

  • Literal characters
  • Metacharacters

Metacharacters are characters and escape sequences that have a special meaning in regular expressions. The following table lists the metacharacters and their behavior in the context of a regular expression:

Character     Description
\             Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a line break. The sequence '\\' matches "\" and '\(' matches "(".
^             Matches the start of the input string. If the multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$             Matches the end of the input string. If the multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
*             Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+             Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?             Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.
{n}           n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the "o" in "Bob", but matches the two o's in "food".
{n,}          n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the "o" in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}         m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there can be no spaces between the comma and the numbers.
?             When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much as possible. For example, for the string "oooo", 'o+?' matches a single "o", while 'o+' matches all of the o's.
.             Matches any single character except a line break such as "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
(pattern)     Matches pattern and captures the match. The captured text can be retrieved from the resulting match collection: the SubMatches collection in VBScript, or the $1…$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.
(?:pattern)   Matches pattern but does not capture the match, that is, the match is not stored for later use. This is useful with the "or" character (|). For example, 'industr(?:y|ies)' is a more compact expression than 'industry|industries'.
(?=pattern)   Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not stored for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1".
(?!pattern)   Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not stored for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but not "Windows" in "Windows 2000".
x|y           Matches either x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]         A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the "a" in "plain".
[^xyz]        A negative character set. Matches any character that is not enclosed. For example, '[^abc]' matches the "p", "l", "i", and "n" in "plain".
[a-z]         A character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range "a" to "z".
[^a-z]        A negative character range. Matches any character that is not in the specified range. For example, '[^a-z]' matches any character that is not in the range "a" to "z".
\b            Matches a word boundary, that is, the position between a word character and a non-word character such as a space. For example, 'er\b' matches the "er" in "never", but not the "er" in "verb".
\B            Matches a non-word boundary. 'er\B' matches the "er" in "verb", but not the "er" in "never".
\cx           Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal "c" character.
\d            Matches a digit character. Equivalent to [0-9].
\D            Matches a non-digit character. Equivalent to [^0-9].
\f            Matches a form feed (page break). Equivalent to \x0c and \cL.
\n            Matches a line break. Equivalent to \x0a and \cJ.
\r            Matches a carriage return character. Equivalent to \x0d and \cM.
\s            Matches any whitespace character, including spaces, tabs, form feeds, and so on. For ASCII input, equivalent to [ \f\n\r\t\v].
\S            Matches any non-whitespace character. For ASCII input, equivalent to [^ \f\n\r\t\v].
\t            Matches a tab character. Equivalent to \x09 and \cI.
\v            Matches a vertical tab. Equivalent to \x0b and \cK.
\w            Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W            Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn           Matches n, where n is a hexadecimal escape value. A hexadecimal escape value must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num          Matches num, where num is a positive integer. A backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n            Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm           Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither condition holds and both n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml          Matches the octal escape value nml when n is an octal digit (0-3) and both m and l are octal digits (0-7).
\un           Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
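
A few rows of the table can be checked directly in JavaScript. The following is a small sketch with made-up input strings; it covers escaping, a non-capturing group with alternation, a backreference, and the hexadecimal and Unicode escapes.

```javascript
// Escaping: "\." matches a literal dot, while "." matches any character
// except a line break.
console.log(/a.c/.test("abc"));  // true
console.log(/a\.c/.test("abc")); // false
console.log(/a\.c/.test("a.c")); // true

// Alternation inside a non-capturing group.
const industry = /^industr(?:y|ies)$/;
console.log(industry.test("industry"));   // true
console.log(industry.test("industries")); // true

// Backreference: \1 repeats whatever group 1 captured.
const doubled = /(.)\1/;
console.log(doubled.exec("hello")[0]); // "ll"

// Hexadecimal and Unicode escapes.
console.log(/\x41/.test("A"));        // true
console.log(/\u00A9/.test("\u00A9")); // true
```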
2.3. Character classes, range classes, and boundaries

To match any one of a set of characters, put the characters inside brackets. For example, [abc] matches any one of the characters inside the brackets, so it matches the "a" in "plain".
Regular expressions also provide range classes: [a-z] matches any character from a to z, and [a-zA-Z] matches all uppercase and lowercase letters.

Inside brackets, a leading ^ negates the class, so [^a-z] matches any character that is not in the range a to z. Outside brackets, ^ instead matches the start of the input string.
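
The difference between the two meanings of ^ is easy to see in a short sketch. The strings below are arbitrary examples chosen for illustration.

```javascript
const word = "plain";

// [abc] matches any one of a, b, or c.
console.log(word.match(/[abc]/)[0]);     // "a"
// [^abc] matches any character that is not a, b, or c.
console.log(word.match(/[^abc]/g));      // [ 'p', 'l', 'i', 'n' ]
// [a-zA-Z] covers both lowercase and uppercase letters.
console.log("R2-D2".match(/[a-zA-Z]/g)); // [ 'R', 'D' ]
// Outside brackets, ^ anchors the match to the start of the input.
console.log(/^pl/.test(word));           // true
console.log(/^la/.test(word));           // false
```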

Regular expressions also provide predefined classes for convenience, such as ., \d, \w, and \s, as listed in the table above.
Several commonly used boundary-matching characters are also available:

Character     Meaning
^             Matches the start of the input string. If the multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$             Matches the end of the input string. If the multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
\b            Matches a word boundary, that is, the position between a word character and a non-word character such as a space. For example, 'er\b' matches the "er" in "never", but not the "er" in "verb".
\B            Matches a non-word boundary. 'er\B' matches the "er" in "verb", but not the "er" in "never".
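
The boundary characters can be tried out as follows. This is a minimal sketch; the two-line string is an assumption made up to show the effect of the m flag.

```javascript
// \b: "er" must sit at a word boundary (here, the end of the word).
console.log(/er\b/.test("never")); // true
console.log(/er\b/.test("verb"));  // false
// \B: "er" must NOT be followed by a word boundary.
console.log(/er\B/.test("verb"));  // true
console.log(/er\B/.test("never")); // false

// With the m flag, ^ and $ also match at line breaks.
const lines = "first\nsecond";
console.log(lines.match(/^\w+$/gm)); // [ 'first', 'second' ]
console.log(lines.match(/^\w+$/g));  // null
```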
2.4. Quantifiers

Three concepts first:

  • Greedy, such as "*": a greedy quantifier first tries to match as much of the string as it can. If the rest of the pattern then fails, it gives back one character and tries again; this process is called backtracking. It keeps giving back one character at a time until a match is found or there are no characters left to give back. Of the three kinds, greedy quantifiers can consume the most resources.
  • Lazy (reluctant), such as "*?": a lazy quantifier works the other way around. It starts at the beginning of the target and takes as few characters as possible, adding one character at a time until the rest of the pattern matches or the end of the string is reached.
  • Possessive, such as "*+": a possessive quantifier grabs as much of the target string as it can and then tries to match, but it tries only once and never backtracks. It is like grabbing a handful of stones and then picking the gold out of them. JavaScript regular expressions do not support possessive quantifiers.

The commonly used quantifiers are:

Character     Meaning
*             Matches zero or more times. Equivalent to {0,}.
+             Matches one or more times. Equivalent to {1,}.
{n}           n is a non-negative integer. Matches exactly n times.
{n,}          n is a non-negative integer. Matches at least n times.
{n,m}         m and n are non-negative integers, where n <= m. Matches at least n and at most m times.

Regular expressions are greedy by default and match as much as possible. To make a quantifier match as little as possible, follow it with "?".
When "?" immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much as possible. For example, for the string "oooo", 'o+?' matches a single "o", while 'o+' matches all of the o's.

Character     Meaning
*?            Repeat any number of times, but as few as possible
+?            Repeat 1 or more times, but as few as possible
??            Repeat 0 or 1 times, but as few as possible
{n,m}?        Repeat n to m times, but as few as possible
{n,}?         Repeat n or more times, but as few as possible
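
The greedy and lazy behaviors can be compared side by side. The snippet below is a sketch; the HTML fragment is a made-up sample, chosen because tag matching is where the difference usually shows up.

```javascript
const text = "oooo";
console.log(text.match(/o+/)[0]);      // "oooo" (greedy: as many as possible)
console.log(text.match(/o+?/)[0]);     // "o"    (lazy: as few as possible)
console.log(text.match(/o{2,3}?/)[0]); // "oo"

const html = "<b>bold</b> and <i>italic</i>";
// Greedy: runs to the last ">" in the string.
console.log(html.match(/<.+>/)[0] === html); // true
// Lazy: stops at the first ">".
console.log(html.match(/<.+?>/)[0]);         // "<b>"
```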
3. Regular expressions step by step
3.1. Grouping and OR

In regular expressions, a quantifier applies only to the character or character class immediately before it. To make it apply to a whole string, use grouping. A group is simply the content enclosed in parentheses, and a quantifier placed after the group applies to the whole group. For example, in /(Tristan){1,3}/ the parentheses form a group.

The concept of "or" is to match one string or another, written with "|", as in /Tristan|Henry/. Without grouping, "|" applies to the whole pattern on each side; with grouping, it applies only inside the group, as in /Trist(an|He)nry/.
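
The scope of a quantifier and of "|" can be verified with a few test strings. This is a sketch only; the ^ and $ anchors are added here (they are not in the text above) so that each test checks the whole string.

```javascript
// The quantifier applies to the whole group, not just the last letter.
const repeated = /^(Tristan){1,3}$/;
console.log(repeated.test("TristanTristan")); // true
console.log(repeated.test("Tristannn"));      // false

// Without a group, | separates the two whole alternatives.
const whole = /^Tristan$|^Henry$/;
console.log(whole.test("Henry"));             // true
console.log(whole.test("TristHenry"));        // false

// With a group, | only chooses between "an" and "He".
const grouped = /^Trist(an|He)nry$/;
console.log(grouped.test("TristHenry"));      // true
console.log(grouped.test("Tristannry"));      // true
console.log(grouped.test("Henry"));           // false
```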

Backreferences: when a replacement or another operation needs to refer to text matched earlier, use the "$" symbol followed by the group number to get the text captured by that group. For example, to turn "2017-02-08" into "02/08/2017", the replacement cannot be a fixed string, so it refers back to the captured groups:
      '2017-02-08'.replace(/(\d{4})-(\d{2})-(\d{2})/g, '$2/$3/$1');
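
The date rewrite can be sketched as a runnable snippet. The variable names and the second sample sentence are assumptions added for illustration; $1, $2, and $3 refer to the year, month, and day groups in that order.

```javascript
const isoDate = /(\d{4})-(\d{2})-(\d{2})/g;

// $2 = month, $3 = day, $1 = year
console.log("2017-02-08".replace(isoDate, "$2/$3/$1")); // "02/08/2017"

// With the g flag, every date in the text is rewritten.
const range = "from 2017-02-08 to 2017-03-01";
console.log(range.replace(isoDate, "$2/$3/$1")); // "from 02/08/2017 to 03/01/2017"
```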

Common uses for grouping are as follows:

Syntax          Meaning
(exp)           Matches exp and captures the text into an automatically numbered group
(?<name>exp)    Matches exp and captures the text into a group named name
(?:exp)         Matches exp without capturing the text or assigning a group number
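
The three group forms can be compared in a short sketch. It assumes a JavaScript engine with named capture groups, which were added in ES2018; the group names year, month, and day are made up for this example.

```javascript
// Named groups (ES2018 and later) are available on the groups property.
const dateRe = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const m = dateRe.exec("2017-02-08");
console.log(m[1]);           // "2017" (named groups are still numbered)
console.log(m.groups.month); // "02"

// In a replacement string, $<name> refers to a named group.
console.log("2017-02-08".replace(dateRe, "$<month>/$<day>/$<year>")); // "02/08/2017"

// A non-capturing group gets no number, so the month is group 1 here.
const nc = /(?:\d{4})-(\d{2})/.exec("2017-02");
console.log(nc[1]);     // "02"
console.log(nc.length); // 2
```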
3.2. Lookahead

A regular expression is matched from the start of the text toward its end. The direction toward the end of the text is called "ahead", and the direction toward the start is called "behind".

Lookahead means that when the regular expression reaches a certain point, it checks ahead to see whether an assertion holds there; lookbehind checks in the opposite direction. Lookbehind was not part of JavaScript before ES2018; engines that implement ES2018 or later support it.
An assertion that must hold is called positive, and one that must not hold is called negative.

Name                  Syntax            Meaning
Positive lookahead    exp(?=assert)     exp matches and is followed by assert
Negative lookahead    exp(?!assert)     exp matches and is not followed by assert
Positive lookbehind   (?<=assert)exp    exp matches and is preceded by assert (ES2018 and later)
Negative lookbehind   (?<!assert)exp    exp matches and is not preceded by assert (ES2018 and later)
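
The assertions can be exercised as below. This is a sketch with made-up strings; the lookbehind lines assume an engine that implements ES2018 or later and would be a syntax error on older engines.

```javascript
// Positive lookahead: "Windows " only when a listed version follows.
const positive = /Windows (?=95|98|NT|2000)/;
console.log(positive.test("Windows 2000")); // true
console.log(positive.test("Windows 3.1"));  // false
// The asserted text is checked but not consumed.
console.log("Windows 2000".match(positive)[0]); // "Windows "

// Negative lookahead: "Windows " only when no listed version follows.
const negative = /Windows (?!95|98|NT|2000)/;
console.log(negative.test("Windows 3.1")); // true
console.log(negative.test("Windows 98"));  // false

// Lookbehind (ES2018 and later).
console.log("price: $42".match(/(?<=\$)\d+/)[0]);   // "42"
console.log("$42 or 17".match(/(?<!\$)\b\d+/)[0]);  // "17"
```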
4. Regular expressions in JavaScript
4.1. Object properties
    • global: whether the search is global (g flag); defaults to false (read-only);
    • ignoreCase: whether case is ignored (i flag); defaults to false (read-only);
    • multiline: whether the search is multiline (m flag); defaults to false (read-only);
    • lastIndex: the position just after the last character of the most recent match; it is used when the regular expression has the global (g) flag and is matched with exec or test;
    • source: the text of the regular expression pattern;
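
These properties can be inspected on any RegExp object. The pattern and input below are arbitrary choices for this sketch.

```javascript
const re = /ab+c/gi;

console.log(re.global);     // true
console.log(re.ignoreCase); // true
console.log(re.multiline);  // false
console.log(re.source);     // "ab+c"
console.log(re.lastIndex);  // 0

// After a successful global match, lastIndex points just past the match.
re.exec("xxABBC abc");
console.log(re.lastIndex);  // 6

// Unlike the flag properties, lastIndex is writable.
re.lastIndex = 0;
```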
4.2. The test and exec methods
    1. RegExp.prototype.test(str):
      • Tests whether the string argument contains text that matches the regular expression pattern
      • Returns true if it does, otherwise false

Note: With a global regular expression, repeated calls can give inconsistent results. In global mode each call starts searching at the position stored in lastIndex, and lastIndex changes after every search. Once the last match has been found, the rest of the string contains no further match, so test returns false and lastIndex is reset to 0. For this reason, avoid the global flag with test when possible; without it, lastIndex has no effect.
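
The inconsistency is easy to reproduce. In this sketch (the pattern and subject string are made up), the same call on the same string alternates between true and false when the g flag is set.

```javascript
const subject = "a1";

// Global: each call resumes from lastIndex, so results alternate.
const globalRe = /\d/g;
const globalResults = [
  globalRe.test(subject), // true,  lastIndex becomes 2
  globalRe.test(subject), // false, lastIndex resets to 0
  globalRe.test(subject), // true again
];
console.log(globalResults); // [ true, false, true ]

// Non-global: lastIndex is ignored, so results are stable.
const plainRe = /\d/;
const plainResults = [plainRe.test(subject), plainRe.test(subject)];
console.log(plainResults); // [ true, true ]
```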

    2. RegExp.prototype.exec(str):
      • Searches the string with the regular expression pattern and, for a global regular expression, updates its lastIndex property to reflect the match.
      • Returns null if there is no matching text, otherwise returns a result array with two extra properties:
        - index: the position of the first character of the matched text
        - input: the string that was searched

When exec() is called on a non-global RegExp object, the returned array contains:

    • The first element: the text that matches the regular expression
    • The second element: the text (if any) matched by the first subexpression (group) of the RegExp object
    • The third element: the text (if any) matched by the second subexpression (group), and so on

When exec() is called repeatedly on a global RegExp object, it returns each match in turn, so all matches can be visited in a loop.
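
The looping pattern described above might look like this. It is a minimal sketch; the key=value input and the names kv, input, and found are assumptions made for the example.

```javascript
const input = "a=1, b=22";

// Non-global: exec always returns the first match.
const first = /(\w+)=(\d+)/.exec(input);
console.log(first[0], first[1], first[2]); // "a=1" "a" "1"
console.log(first.index);                  // 0
console.log(first.input === input);        // true

// Global: each call continues from lastIndex until exec returns null.
const kv = /(\w+)=(\d+)/g;
const found = [];
let match;
while ((match = kv.exec(input)) !== null) {
  found.push({ key: match[1], value: match[2], index: match.index });
}
console.log(found.length); // 2
console.log(found[1]);     // { key: 'b', value: '22', index: 5 }
```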

4.3. String object methods
    1. String.prototype.search(reg):

      • Searches a string for a specified substring, or for a substring that matches a regular expression
      • Returns the index of the first match, or -1 if there is no match
      • Does not perform a global match: it ignores the g flag and always searches from the beginning of the string
    2. String.prototype.match(reg):

      • Searches a string for one or more matches of reg
      • Whether reg has the g flag greatly affects the result

If reg does not have the g flag, match() performs a single match in the string. It returns null if no matching text is found; otherwise it returns an array holding information about the match:

    • The first element holds the matched text, and the remaining elements hold the text matched by the subexpressions (groups) of the regular expression
    • The array also has two properties: index and input

If reg has the g flag, match() performs a global search and finds all matching substrings in the string. It returns null if none is found; otherwise it returns an array of all the matched substrings, without the index and input properties.
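
The behavior of search() and the two modes of match() can be seen in the following sketch, which uses a made-up sample sentence.

```javascript
const sentence = "cat bat rat";

// search: index of the first match, or -1; the g flag is ignored.
console.log(sentence.search(/[br]at/));  // 4
console.log(sentence.search(/[br]at/g)); // 4
console.log(sentence.search(/dog/));     // -1

// match without g: one match, with groups, index, and input.
const single = sentence.match(/(\w)at/);
console.log(single[0], single[1], single.index); // "cat" "c" 0

// match with g: all matched substrings, no groups, no index.
const all = sentence.match(/(\w)at/g);
console.log(all);       // [ 'cat', 'bat', 'rat' ]
console.log(all.index); // undefined

// No match at all gives null in both modes.
console.log(sentence.match(/dog/g)); // null
```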

    3. String.prototype.split(reg):
      Splits a string into an array of strings, using reg as the separator.

    4. String.prototype.replace:
      Replaces part of a string with another string.

      • String.prototype.replace(str, replaceStr)
      • String.prototype.replace(reg, replaceStr)
      • String.prototype.replace(reg, function)
        Parameters passed to the function:
        1. The matched string
        2. The contents of each capture group, one parameter per group; omitted if the pattern has no groups
        3. The index of the match in the string
        4. The original string
        str.replace(reg, function (match, group1, group2, group3, index, origin) {
            // Assumes reg has three capture groups.
            return parseInt(group1, 10) + parseInt(group3, 10);
        });
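
To make split() and the function form of replace() concrete, here is a hedged sketch. The input strings and the pattern with three groups are assumptions invented for the example; they are not taken from the original snippet.

```javascript
// split: the regular expression marks where the string is cut.
console.log("2017-02-08".split(/-/)); // [ '2017', '02', '08' ]
console.log("a1b22c".split(/\d+/));   // [ 'a', 'b', 'c' ]

// replace with a function: the parameters are the match, one argument per
// capture group, the match index, and the original string.
const summed = "1+2 and 30+4".replace(
  /(\d+)(\+)(\d+)/g,
  function (match, group1, group2, group3, index, origin) {
    // Add the numbers captured by the first and third groups.
    return String(parseInt(group1, 10) + parseInt(group3, 10));
  }
);
console.log(summed); // "3 and 34"
```

The return value of the function is converted to a string, so returning a number directly would also work; the explicit String() call only makes the conversion visible.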

 
5. Examples of regular expressions
Regular expression    Description
/\b([a-z]+) \1\b/gi    Finds places where a word appears twice in a row.
/(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/    Parses a URL into protocol, domain, port, and relative path.
/^(?:Chapter|Section) [1-9][0-9]{0,1}$/    Locates chapter and section headings.
/<\s*(\S+)(\s[^>]*)?>[\s\S]*<\s*\/\1\s*>/    Matches an HTML tag.
/^(((13|15|18)[0-9])|14[57]|17[0134678])\d{8}$/    Matches a mobile phone number.
/^\w+((-\w+)|(\.\w+))*\@[A-Za-z0-9]+((\.|-)[A-Za-z0-9]+)*\.[A-Za-z0-9]+$/    Matches an email address.
/(^\d{15}$)|(^\d{18}$)|(^\d{17}(\d|X|x)$)/    Matches an ID card number.
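
A few of these patterns can be tried on sample input. The sketch below writes the patterns in what is assumed to be their intended form (without the stray spaces), and every input value is made up for illustration.

```javascript
// Repeated words.
const repeatedWord = /\b([a-z]+) \1\b/gi;
console.log("This is is a test test.".match(repeatedWord)); // [ 'is is', 'test test' ]

// URL parts: protocol, domain, port, path.
const urlRe = /(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/;
const parts = urlRe.exec("http://example.com:80/docs/page.html");
console.log(parts[1]); // "http"
console.log(parts[2]); // "example.com"
console.log(parts[3]); // ":80"
console.log(parts[4]); // "/docs/page.html"

// Mobile phone number (11 digits with an accepted prefix).
const phoneRe = /^(((13|15|18)[0-9])|14[57]|17[0134678])\d{8}$/;
console.log(phoneRe.test("13812345678")); // true
console.log(phoneRe.test("12345678901")); // false

// Email address.
const emailRe = /^\w+((-\w+)|(\.\w+))*\@[A-Za-z0-9]+((\.|-)[A-Za-z0-9]+)*\.[A-Za-z0-9]+$/;
console.log(emailRe.test("john.doe@example.com")); // true
console.log(emailRe.test("not-an-email"));         // false
```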
