Detailed analysis of JavaScript regular expressions

Last Update:2017-12-12 Source: Internet

Author: User

One: Syntax rules

1. The RegExp constructor creates a regular expression object that matches text against a pattern.

2. A RegExp is made up of two parts:

① pattern (the text of the regular expression)

② flags:

g: global match; find all matches rather than stopping after the first one.

i: ignore case.

m: multiline; the start and end characters (^ and $) work across multiple lines, that is, they match the start and end of each line (delimited by \n or \r), not just the start and end of the whole input string.

u: unicode; treat the pattern as a sequence of Unicode code points.

y: sticky; match only at the index given by this regular expression's lastIndex property in the target string (and do not attempt to match from any later index).

E.g. /^[a-zA-Z]+\.[a-zA-Z]+\.(cn|com|info|top)/gi (domain name matching).
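
As a rough illustration of the flags listed above, the following sketch applies each one to small sample strings. The strings and variable names are made up for this example and are not from the original article.

```javascript
// Sample strings are hypothetical, chosen only to show each flag.
const text = "Cat cat CAT";

// g: return every match instead of only the first one.
const allLower = text.match(/cat/g);        // ["cat"]
// g + i: every match, ignoring case.
const allAnyCase = text.match(/cat/gi);     // ["Cat", "cat", "CAT"]

// m: ^ and $ also match at the start and end of each line.
const lines = "one\ntwo\nthree";
const firstLetters = lines.match(/^\w/gm);  // ["o", "t", "t"]

// y: match only at lastIndex, never further along the string.
const sticky = /cat/y;
sticky.lastIndex = 4;
const atFour = sticky.test(text);           // true: "cat" starts at index 4
sticky.lastIndex = 5;
const atFive = sticky.test(text);           // false: no "cat" exactly at index 5

// u: treat the pattern as code points, so "." can match one astral symbol.
const emoji = "\u{1F600}";
const oneCodePoint = /^.$/u.test(emoji);    // true
const withoutU = /^.$/.test(emoji);         // false: two UTF-16 code units

console.log(allLower, allAnyCase, firstLetters, atFour, atFive, oneCodePoint, withoutU);
```

Note that a regex with the g or y flag keeps state in lastIndex between calls, which is why the sticky example sets it explicitly before each test.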

Two: Ways to create a regular expression (two forms)

1. Literal. E.g. var a = /^1\d{2}-\d{4}-\d{4}/g (phone number matching)

2. Constructor. E.g. var a = new RegExp(/^1\d{2}-\d{4}-\d{4}/g)
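
The two forms can be compared side by side. The sketch below also shows the constructor taking a string, which is the usual reason to prefer it (the pattern can be built at run time); in that case every backslash has to be doubled. The variable names and the sample phone number are made up for the example.

```javascript
// Literal form.
const fromLiteral = /^1\d{2}-\d{4}-\d{4}/g;

// Constructor form with a string pattern: backslashes must be doubled
// inside the string, and the flags are passed as a second argument.
const fromString = new RegExp("^1\\d{2}-\\d{4}-\\d{4}", "g");

// Both objects describe the same pattern and the same flags.
const sameSource = fromLiteral.source === fromString.source; // true
const sameFlags = fromLiteral.flags === fromString.flags;    // true

// Hypothetical sample input.
const matches = fromString.test("138-1234-5678");            // true

console.log(sameSource, sameFlags, matches);
```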

Three: Meaning of special characters in expressions

Character classes
character	meaning
`.`	(Dot) Matches any single character except the line terminators `\n`, `\r`, `\u2028` and `\u2029`. Inside a character set, the dot loses its special meaning and matches a literal dot. Note that the `m` (multiline) flag does not change the behavior of the dot. To match any character across multiple lines, you can use the character set `[^]` (as long as you do not need to support old versions of IE); it matches any character, including line breaks. For example, `/.y/` matches "my" and "ay", but not "yes", in "yes make my day".
`\d`	Matches any digit (Arabic numeral). Equivalent to `[0-9]`. For example, `/\d/` or `/[0-9]/` matches "2" in "B2 is the suite number."
`\D`	Matches any character that is not a digit. Equivalent to `[^0-9]`. For example, `/\D/` or `/[^0-9]/` matches "B" in "B2 is the suite number."
`\w`	Matches any alphanumeric character from the basic Latin alphabet, including the underscore. Equivalent to `[A-Za-z0-9_]`. For example, `/\w/` matches "a" in "apple", "5" in "$5.28" and "3" in "3D".
`\W`	Matches any character that is not a word character (alphanumeric or underscore) from the basic Latin alphabet. Equivalent to `[^A-Za-z0-9_]`. For example, `/\W/` or `/[^A-Za-z0-9_]/` matches "%" in "50%".
`\s`	Matches a single white space character, including space, tab, form feed, line feed and other Unicode spaces. Equivalent to `[ \f\n\r\t\v\u00a0\u1680\u180e\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000]`. For example, `/\s\w*/` matches " bar" in "foo bar".
`\S`	Matches a single character other than white space. Equivalent to `[^ \f\n\r\t\v\u00a0\u1680\u180e\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000]`. For example, `/\S\w*/` matches "foo" in "foo bar".
`\t`	Matches a horizontal tab.
`\r`	Matches a carriage return.
`\n`	Matches a line feed.
`\v`	Matches a vertical tab.
`\f`	Matches a form feed.
`[\b]`	Matches a backspace. (Not to be confused with `\b`.)
`\0`	Matches a NUL character. Do not follow it with another digit.
`\cX`	`X` is a letter from A to Z. Matches a control character in a string. For example, `/\cM/` matches control-M in a string.
`\xhh`	Matches the character with the code `hh` (two hexadecimal digits).
`\uhhhh`	Matches the character with the Unicode value `hhhh` (four hexadecimal digits).
`\`	For characters that are usually treated literally, indicates that the next character is special and is not to be interpreted literally. For example, `/b/` matches the character "b". Placing a backslash in front of b, as in `/\b/`, makes the character special: it matches a word boundary. Conversely, for characters that are usually treated specially, indicates that the next character is not special and is to be interpreted literally. For example, `*` is a special character that means 0 or more occurrences of the preceding character, so `/a*/` means 0 or more "a"s. To match `*` literally, precede it with a backslash; for example, `/a\*/` matches "a*".
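
The character classes in the table can be checked with a few one-liners. This is only a sketch: the sample strings come from the table where it gives one, and the variable names are made up.

```javascript
const digit = "B2 is the suite number.".match(/\d/)[0];     // "2"
const nonDigit = "B2 is the suite number.".match(/\D/)[0];  // "B"
const wordChar = "$5.28".match(/\w/)[0];                    // "5"
const nonWord = "50%".match(/\W/)[0];                       // "%"
const spaceThenWord = "foo bar".match(/\s\w*/)[0];          // " bar"
const nonSpace = "foo bar".match(/\S\w*/)[0];               // "foo"

// The dot does not match a line break, even with the m flag...
const dotAcrossLines = /a.b/m.test("a\nb");                 // false
// ...but the character set [^] matches any character at all.
const anyAcrossLines = /a[^]b/.test("a\nb");                // true

// Escaping: \* is a literal asterisk, while * alone is a quantifier.
const literalStar = "a*".match(/a\*/)[0];                   // "a*"

console.log(digit, nonDigit, wordChar, nonWord, nonSpace, literalStar);
```
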
Character sets
character	meaning
`[xyz]`	A character set, also called a character class. Matches any one of the enclosed characters. You can specify a range with a hyphen `-`. For example, `[abcd]` is the same as `[a-d]`; it matches "b" in "brisket" and "c" in "chop".
`[^xyz]`	A negated or complemented character set. It matches any character that is not enclosed in the brackets. You can also specify a range with a hyphen `-`. For example, `[^abc]` is the same as `[^a-c]`; it first matches "o" in "bacon" and "h" in "chop".
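
A short sketch of the two kinds of character set, using the sample words from the table (the variable names are made up):

```javascript
// A set matches any one of the listed characters; a-d is a range.
const inRange = "brisket".match(/[a-d]/)[0];     // "b"
const inList = "chop".match(/[abcd]/)[0];        // "c"

// A negated set matches the first character that is NOT listed.
const notInRange = "bacon".match(/[^a-c]/)[0];   // "o"
const notInList = "chop".match(/[^abc]/)[0];     // "h"

// Inside a set the dot is an ordinary character.
const literalDot = "a.b".match(/[.]/)[0];        // "."
const dotNeedsDot = /[.]/.test("ab");            // false

console.log(inRange, inList, notInRange, notInList, literalDot, dotNeedsDot);
```
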
Boundaries
character	meaning
`^`	Matches the beginning of input. If the multiline flag is set, it also matches immediately after a line break character. For example, `/^A/` does not match the "A" in "an A", but does match the first "A" in "An A".
`$`	Matches the end of input. If the multiline flag is set, it also matches immediately before a line break character. For example, `/t$/` does not match the "t" in "eater", but does match it in "eat".
`\b`	Matches a zero-width word boundary, such as between a letter and a space. (Not to be confused with `[\b]`.) For example, `/\bno/` matches the "no" in "at noon"; `/ly\b/` matches the "ly" in "possibly yesterday."
`\B`	Matches a zero-width non-word boundary, such as between two letters or between two spaces. For example, `/\Bon/` matches "on" in "at noon", and `/ye\B/` matches "ye" in "possibly yesterday."
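
The boundary examples can be verified as follows. This is a sketch; the two-line string used for the multiline case is made up, the other inputs come from the table.

```javascript
const startNoMatch = /^A/.test("an A");             // false
const startMatch = /^A/.test("An A");               // true
const endNoMatch = /t$/.test("eater");              // false
const endMatch = /t$/.test("eat");                  // true

// With the m flag, ^ also matches right after a line break.
const lineStarts = "an A\nA plan".match(/^A/gm);    // ["A"] (second line only)

// \b and \B are zero-width: they constrain where a match may start or end.
const wordStart = "at noon".match(/\bno/).index;              // 3
const wordEnd = "possibly yesterday".match(/ly\b/).index;     // 6
const insideWord = "at noon".match(/\Bon/).index;             // 5
const notAtEnd = "possibly yesterday".match(/ye\B/).index;    // 9

console.log(lineStarts, wordStart, wordEnd, insideWord, notAtEnd);
```
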
Grouping and back references
character	meaning
`(x)`	Matches `x` and remembers the match. These are called capturing parentheses. For example, `/(foo)/` matches and remembers "foo" in "foo bar." The matched substring can be found in the resulting array's elements `[1], ..., [n]`, or in the predefined `RegExp` object's properties `$1, ..., $9`. Capturing groups have a performance penalty. If you do not need the matched substring again, prefer non-capturing parentheses (see below).
`\n`	`n` is a positive integer. A back reference to the substring matched by the nth parenthesized group in the regular expression (counting left parentheses). For example, `/apple(,)\sorange\1/` matches "apple, orange," in "apple, orange, cherry, peach."
`(?:x)`	Matches `x` but does not remember the match. These are called non-capturing parentheses. The matched substring cannot be recalled from the resulting array's elements `[1], ..., [n]` or from the predefined `RegExp` object's properties `$1, ..., $9`.
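
A minimal sketch of capturing groups, a back reference and a non-capturing group. It reads captures from the match array and uses the `$1`/`$2` placeholders of `String.prototype.replace` rather than the legacy `RegExp.$1` properties; the two-word pattern and the variable names are made up for the example.

```javascript
// Two capturing groups: elements [1] and [2] of the result hold the captures.
const captured = "foo bar".match(/(foo) (bar)/);
// captured[0] === "foo bar", captured[1] === "foo", captured[2] === "bar"

// \1 matches again whatever group 1 matched (here, the comma).
const backRef = "apple, orange, cherry, peach".match(/apple(,)\sorange\1/)[0];
// backRef === "apple, orange,"

// (?:...) groups without capturing, so "bar" becomes group 1.
const nonCapturing = "foo bar".match(/(?:foo) (bar)/);
// nonCapturing[1] === "bar" and there is no element [2]

// In a replacement string, $1 and $2 stand for the captured substrings.
const swapped = "foo bar".replace(/(foo) (bar)/, "$2 $1");  // "bar foo"

console.log(captured[1], captured[2], backRef, nonCapturing[1], swapped);
```
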
Quantifiers
character	meaning
`x*`	Matches the preceding item x 0 or more times. For example, `/bo*/` matches "boooo" in "A ghost booooed" and "b" in "A bird warbled", but nothing in "A goat grunted".
`x+`	Matches the preceding item x 1 or more times. Equivalent to `{1,}`. For example, `/a+/` matches the "a" in "candy" and all the "a"s in "caaaaaaandy".
`x*?` `x+?`	Matches the preceding item x like `*` and `+` above, but the match is the smallest possible one. For example, `/".*?"/` matches '"foo"' in '"foo" "bar"' and does not match '"foo" "bar"' as it would without the `?` after the `*`.
`x?`	Matches the preceding item x 0 or 1 times. For example, `/e?le?/` matches "el" in "angel" and "le" in "angle". If used immediately after any of the quantifiers `*`, `+`, `?` or `{}`, the `?` makes the quantifier non-greedy (matching the minimum number of times), as opposed to the default, which is greedy (matching the maximum number of times). It is also used in lookahead assertions and non-capturing groups; see the entries for `(?=)`, `(?!)` and `(?:)`.
`x(?=y)`	Matches `x` only if `x` is followed by `y`. For example, `/Jack(?=Sprat)/` matches "Jack" only if it is followed by "Sprat". `/Jack(?=Sprat|Frost)/` matches "Jack" only if it is followed by "Sprat" or "Frost". However, neither "Sprat" nor "Frost" is part of the match result.
`x(?!y)`	Matches `x` only if `x` is not followed by `y`. For example, `/\d+(?!\.)/` matches a number only if it is not followed by a decimal point. `/\d+(?!\.)/.exec("3.141")` matches "141" but not "3.141".
`x|y`	Matches either `x` or `y`. For example, `/green|red/` matches "green" in "green apple" and "red" in "red apple."
`x{n}`	`n` is a positive integer. Matches exactly n consecutive occurrences of the preceding item x. For example, `/a{2}/` does not match the "a" in "candy", but it matches both "a"s in "caandy" and the first two "a"s in "caaandy".
`x{n,}`	`n` is a positive integer. Matches at least n consecutive occurrences of the preceding item x. For example, `/a{2,}/` does not match the "a" in "candy", but matches all of the "a"s in "caandy" and in "caaaaaaandy".
`x{n,m}`	`n` and `m` are positive integers. Matches at least n and at most m consecutive occurrences of the preceding item x. For example, `/a{1,3}/` matches nothing in "cndy", the "a" in "candy", the two "a"s in "caandy", and the first three "a"s in "caaaaaaandy". Note that when matching "caaaaaaandy", the match is "aaa", even though the original string has more "a"s in it.
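
The quantifier rows can be exercised with the sample strings from the table. This is a sketch with made-up variable names; `match()` returns `null` when nothing matches, which is what the "A goat grunted" case shows.

```javascript
const long = "caaaaaaandy";

const star = "A ghost booooed".match(/bo*/)[0];     // "boooo"
const starZero = "A bird warbled".match(/bo*/)[0];  // "b" (zero "o"s is allowed)
const starNone = "A goat grunted".match(/bo*/);     // null (no "b" at all)

const plus = long.match(/a+/)[0];                   // the whole run of "a"s
const optionalA = "angel".match(/e?le?/)[0];        // "el"
const optionalB = "angle".match(/e?le?/)[0];        // "le"

const exact = "caaandy".match(/a{2}/)[0];           // "aa"
const noPair = "candy".match(/a{2,}/);              // null
const atLeast = long.match(/a{2,}/)[0];             // the whole run of "a"s
const range = long.match(/a{1,3}/)[0];              // "aaa"

// Greedy takes as much as it can; adding ? makes it take as little as it can.
const greedy = '"foo" "bar"'.match(/".*"/)[0];      // '"foo" "bar"'
const lazy = '"foo" "bar"'.match(/".*?"/)[0];       // '"foo"'

const either = "red apple".match(/green|red/)[0];   // "red"

console.log(star, starZero, starNone, plus, exact, range, greedy, lazy, either);
```
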
Assertions
character	meaning
`x(?=y)`	Matches x only if it is followed by y. For example, `/Jack(?=Sprat)/` matches "Jack" only if it is followed by "Sprat". `/Jack(?=Sprat|Frost)/` matches "Jack" only if it is followed by "Sprat" or "Frost". However, neither "Sprat" nor "Frost" appears in the match result.
`x(?!y)`	Matches x only if it is not followed by y. For example, `/\d+(?!\.)/` matches only numbers that are not followed by a dot (.). `/\d+(?!\.)/.exec("3.141")` matches "141" but not "3.141".
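
A small sketch of both lookahead forms. The inputs "JackSprat" and "JackFrost" are made-up strings built from the names in the table; the point is that the text inside the assertion is checked but not consumed.

```javascript
// Positive lookahead: "Jack" matches only when "Sprat" follows,
// and "Sprat" itself is not part of the match.
const followed = "JackSprat".match(/Jack(?=Sprat)/)[0];             // "Jack"
const notFollowed = "JackFrost".match(/Jack(?=Sprat)/);             // null
const eitherFollows = "JackFrost".match(/Jack(?=Sprat|Frost)/)[0];  // "Jack"

// Negative lookahead: "3" is rejected because a dot follows it,
// so the first acceptable run of digits is "141".
const negative = /\d+(?!\.)/.exec("3.141")[0];                      // "141"

console.log(followed, notFollowed, eitherFollows, negative);
```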

Four: Properties and built-in methods of RegExp objects

1: Properties

Note that several properties of the RegExp object have both a long name and a short (Perl-like) name. Both names always refer to the same value. JavaScript's regular expression syntax is modeled on Perl's.

RegExp.prototype.constructor

The constructor that creates the regular expression object.

RegExp.prototype.global

Whether global matching is on, that is, whether to find all possible matches in the target string rather than only the first one.

RegExp.prototype.ignoreCase

Whether to ignore case when matching the string.

RegExp.prototype.lastIndex

The index in the string at which to start the next match.

RegExp.prototype.multiline

Whether multiline matching is on (affects the behavior of ^ and $).

RegExp.prototype.source

The text of the pattern of the regular expression object.

RegExp.prototype.sticky

Whether sticky matching is on.

RegExp.length

The value of RegExp.length is 2.
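
These properties can be inspected on any regular expression object. The sketch below uses a made-up pattern and input, and shows how lastIndex moves after a successful match on a global regex.

```javascript
const re = /ab+c/gi;

const flagsSeen = {
  global: re.global,          // true  (g was given)
  ignoreCase: re.ignoreCase,  // true  (i was given)
  multiline: re.multiline,    // false (m was not given)
  sticky: re.sticky,          // false (y was not given)
};
const source = re.source;     // "ab+c"

const before = re.lastIndex;  // 0
re.test("xxABBC abc");        // matches "ABBC" at indices 2..5
const after = re.lastIndex;   // 6, where the next search would start

const madeByRegExp = re.constructor === RegExp;  // true
const ctorLength = RegExp.length;                // 2

console.log(flagsSeen, source, before, after, madeByRegExp, ctorLength);
```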

2: Methods

RegExp.prototype.exec()

Executes a search for a match in the target string.

RegExp.prototype.test()

Tests whether the regular expression matches the target string.

RegExp.prototype.toSource()

Returns a string containing the literal form of the regular expression object. Overrides the Object.prototype.toSource method. (This method is non-standard.)

RegExp.prototype.toString()

Returns a string containing the literal form of the regular expression object. Overrides the Object.prototype.toString() method.
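
A brief sketch of exec(), test() and toString() in use. The pattern and input strings are made up; toSource() is left out because it is non-standard and not available in most engines.

```javascript
const re = /(\d+)-(\d+)/;

// exec() returns an array with the match, the captures and the index,
// or null when there is no match.
const result = re.exec("range 10-20");
// result[0] === "10-20", result[1] === "10", result[2] === "20", result.index === 6
const miss = re.exec("no numbers");        // null

// test() only reports whether a match exists.
const ok = re.test("3-4");                 // true

// toString() gives back the literal form.
const literal = re.toString();             // "/(\d+)-(\d+)/"

// With the g flag, repeated exec() calls walk through the string.
const each = /\d+/g;
const found = [];
let m;
while ((m = each.exec("1 22 333")) !== null) {
  found.push(m[0]);
}
// found is ["1", "22", "333"]

console.log(result[0], result.index, miss, ok, literal, found);
```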

Five: Regular expression examples

1: Email address: /^([a-zA-Z]|\d)*@[a-zA-Z]+\.[a-zA-Z]+$/gi

2: Phone number: /^1\d{2}-\d{4}-\d{4}$/g

3: Link: /^(http|https):\/\/[a-zA-Z]+\.([a-zA-Z]|\d)+\.(cn|com):\d*\/.*$/g

4: Date format: /^\d{4}-\d{1,2}-\d{1,2}$/g

5: Strong password (must contain a combination of uppercase letters, lowercase letters and digits, must not contain special characters, length between 8 and 10): /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z\d]{8,10}$/
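
The examples above can be tried out as follows. The patterns below are one reading of them, with the `g` flag dropped because `test()` on a global regex advances `lastIndex` between calls and can give surprising results when the same object is reused. The password pattern restricts the body to letters and digits so that the no-special-characters rule is actually enforced, which is an assumption about the intent. All sample inputs are made up.

```javascript
const email = /^([a-zA-Z]|\d)*@[a-zA-Z]+\.[a-zA-Z]+$/;
const phone = /^1\d{2}-\d{4}-\d{4}$/;
const date = /^\d{4}-\d{1,2}-\d{1,2}$/;
// Three lookaheads require a digit, a lowercase and an uppercase letter;
// the body then allows only letters and digits, 8 to 10 of them.
const strongPassword = /^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z\d]{8,10}$/;

const checks = {
  emailOk: email.test("user01@example.com"),         // true
  emailNoDomain: email.test("user01@example"),       // false: no ".xxx" part
  phoneOk: phone.test("138-1234-5678"),              // true
  phoneBadPrefix: phone.test("238-1234-5678"),       // false: must start with 1
  dateOk: date.test("2017-12-12"),                   // true
  dateShortYear: date.test("17-12-12"),              // false: year needs 4 digits
  passwordOk: strongPassword.test("Passw0rd"),       // true
  passwordNoUpper: strongPassword.test("password1"), // false
  passwordSpecial: strongPassword.test("Passw0rd!"), // false
  passwordShort: strongPassword.test("Pass0"),       // false
};

console.log(checks);
```

Note that the date pattern only checks the shape of the string; it does not reject values such as month 13.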

Reference: https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/RegExp
