JavaScript Regular Expressions

Source: Internet
Author: User

A regular expression is an object that describes a pattern of characters. In JavaScript, the RegExp class represents regular expressions, and both String and RegExp define methods that use them for powerful pattern matching, text search, and text replacement.

Regular expressions in JavaScript are represented by RegExp objects. You can create one with the RegExp() constructor, but more often they are written with a special literal syntax: the pattern text enclosed between a pair of slashes. The following two statements are equivalent; each creates a pattern that matches any string ending with the letter "s":

var pattern = /s$/;
var pattern = new RegExp("s$");
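As a quick check of the equivalence described above, the following sketch builds the same pattern both ways and tries it on two sample strings. The variable names and the test strings are illustrative assumptions, not part of the original article.

```javascript
// Two ways to build a pattern that matches strings ending in "s".
const literal = /s$/;                  // regular expression literal
const constructed = new RegExp("s$");  // RegExp() constructor

console.log(literal.source === constructed.source);   // true: both hold the text "s$"
console.log(literal.test("regular expressions"));     // true: ends with "s"
console.log(constructed.test("regular expression"));  // false: ends with "n"
```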

All letters and digits in a regular expression match themselves literally. JavaScript regular expression syntax also supports certain non-alphabetic characters through escape sequences that begin with a backslash (\), as listed below. Many punctuation characters have special meanings in a pattern (^ $ . * + ? = ! : | / \ ( ) [ ] { }) and must be escaped with a backslash to be matched literally:

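Alphanumeric character: itself
\0: the NUL character (\u0000)
\t: tab
\n: newline
\v: vertical tab
\f: form feed
\r: carriage return
\xnn: the Latin character specified by the hexadecimal number nn
\unnnn: the Unicode character specified by the hexadecimal number nnnn
\cX: the control character ^X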
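A few of these escapes can be tried directly. This is a minimal sketch with made-up sample strings; the variable names are assumptions chosen for illustration.

```javascript
// \t inside a regular expression literal matches a tab character.
const tab = /\t/;
// \x41 and \u0041 both denote the character "A" (code point 0x41).
const hexA = /\x41/;
const unicodeA = /\u0041/;
// Punctuation with a special meaning must be escaped to match literally.
const literalDot = /\./;

console.log(tab.test("a\tb"));        // true
console.log(hexA.test("A"));          // true
console.log(unicodeA.test("A"));      // true
console.log(literalDot.test("a.b"));  // true
console.log(literalDot.test("ab"));   // false: there is no "." in the string
```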

A character class is formed by placing literal characters inside square brackets, and it matches any one character that it contains. The character classes of regular expressions are listed below:

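[...]: any one character between the brackets. For example, [abc] matches any one of a, b, or c; [a-z] matches any one lowercase letter; [a-zA-Z0-9] matches any one letter or digit.
[^...]: any one character not between the brackets. For example, [^abc] matches any one character other than a, b, or c.
.: any character except newline or another Unicode line terminator
\w: any ASCII word character, equivalent to [a-zA-Z0-9_]
\W: any character that is not an ASCII word character, equivalent to [^a-zA-Z0-9_]
\s: any Unicode whitespace character
\S: any character that is not Unicode whitespace
\d: any ASCII digit, equivalent to [0-9]
\D: any character other than an ASCII digit, equivalent to [^0-9]
[\b]: a literal backspace (special case)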
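The behavior of several character classes can be observed with test(). The sample strings below are assumptions picked to exercise one class each.

```javascript
console.log(/[abc]/.test("cat"));    // true: "c" and "a" are in the class
console.log(/[^abc]/.test("abc"));   // false: every character is excluded
console.log(/\d/.test("room 42"));   // true: the string contains a digit
console.log(/\w/.test("_"));         // true: underscore counts as a word character
console.log(/\s/.test("a\u00a0b"));  // true: no-break space is Unicode whitespace
console.log(/./.test("\n"));         // false: "." does not match a line terminator
```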

Writing out a pattern element that repeats many times quickly becomes cumbersome, so regular expressions provide a syntax for repetition:

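{n,m}: match the previous item at least n times but no more than m times
{n,}: match the previous item n or more times
{n}: match the previous item exactly n times
?: match the previous item zero or one time
+: match the previous item one or more times
*: match the previous item zero or more times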

For example:

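/\d{2,4}/: matches between two and four digits
/\w{3}\d?/: matches exactly three ASCII word characters followed by an optional digit
/\s+java\s+/: matches the string "java" with one or more whitespace characters before and after it
/[^(]*/: matches zero or more characters that are not an opening parenthesis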
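The four example patterns above can be run against sample input to see what each one actually consumes. The input strings and variable names are assumptions for illustration only.

```javascript
// {2,4} takes at most four digits even when more are available.
const digits = "year 20245".match(/\d{2,4}/)[0];
console.log(digits);  // "2024"

// Three word characters followed by an optional digit.
const wordAndDigit = /\w{3}\d?/.exec("abc1")[0];
console.log(wordAndDigit);  // "abc1"

// "java" surrounded by whitespace on both sides.
console.log(/\s+java\s+/.test("I like java a lot"));  // true

// * allows zero repetitions, so the match can be the empty string.
const beforeParen = /[^(]*/.exec("(x)")[0];
console.log(beforeParen === "");  // true
```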

The repetition characters listed above match as many times as possible while still allowing the rest of the regular expression to match; this is called greedy repetition. Non-greedy repetition, which matches as few times as possible, is specified by placing a question mark after the repetition character: ??, +?, *?, or {1,5}?.

For example, when /a+/ and /a+?/ are applied to "aaa", the greedy pattern matches all three letters and the non-greedy pattern matches only the first. Non-greedy matching does not always behave as expected. Applied to "aaab", both /a+b/ and /a+?b/ match the entire string. That is no surprise for the greedy pattern, but why does the non-greedy one not match just "ab"? The reason is that pattern matching always looks for the first position in the string at which a match is possible. A match is possible starting at the first character of the string, so shorter matches that start at later characters are never considered.
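The greedy and non-greedy cases just described can be reproduced as follows. This sketch only prints the matched text; the variable names are illustrative.

```javascript
const greedy = "aaa".match(/a+/)[0];
const lazy = "aaa".match(/a+?/)[0];
console.log(greedy);  // "aaa"
console.log(lazy);    // "a"

// Both patterns start matching at index 0, so both return the whole string.
const greedyB = "aaab".match(/a+b/)[0];
const lazyB = "aaab".match(/a+?b/)[0];
console.log(greedyB);  // "aaab"
console.log(lazyB);    // "aaab"
```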

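Regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring back to an earlier subexpression:

|: alternation; matches either the subexpression to its left or the subexpression to its right
(...): grouping; combines several items into a single unit that can be modified by *, +, ?, |, and so on, and remembers the text that matched the group so that it can be referenced later
(?:...): grouping only; combines items into a single unit but does not remember the text that matched the group
\n: matches the same text that was matched by group number n. Groups are parenthesized subexpressions, which may be nested. Group numbers are assigned by counting left parentheses from left to right; groups written with (?: are not numbered.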

For example:

/ab|cd|ef/: matches the string "ab", the string "cd", or the string "ef".
/\d{3}|[a-z]{4}/: matches either three digits or four lowercase letters.
/a|ab/: applied to the string "ab", matches only "a" rather than all of "ab". Alternatives are tried from left to right until one matches; once the left alternative matches, the right one is ignored, even if it would have produced a better match.
/java(script)?/: matches "java" followed by an optional "script"; the parentheses group "script" so that ? applies to it as a unit.
/(ab|cd)+|ef/: matches either the string "ef" or one or more repetitions of "ab" or "cd".
/([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/: nested groups; ([Ss]cript) is group 2 and can be referred to as \2.
/([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/: nested groups; (?:[Ss]cript) is used only for grouping, so \2 now refers to (fun\w*).
/(['"])[^'"]*\1/: matches text delimited by a matching pair of single or double quotes, with no quote characters in between. Writing /['"][^'"]*['"]/ instead would be wrong, because it does not guarantee that the opening and closing quotes are of the same kind.
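The sketch below runs some of these patterns to show how group numbering and backreferences behave. The sentence "JavaScript is funny" and the quoted sample strings are assumptions made up for the demonstration.

```javascript
// The left alternative wins even though "ab" would be a longer match.
const alt = "ab".match(/a|ab/)[0];
console.log(alt);  // "a"

// Capturing groups are numbered by the position of their left parenthesis.
const capturing = /([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/.exec("JavaScript is funny");
console.log(capturing[1], capturing[2], capturing[3]);  // JavaScript Script funny

// With (?:...), the inner group gets no number, so group 2 is (fun\w*).
const nonCapturing = /([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/.exec("JavaScript is funny");
console.log(nonCapturing[1], nonCapturing[2]);  // JavaScript funny

// \1 requires the closing quote to be the same character as the opening one.
const paired = /(['"])[^'"]*\1/;
const unpaired = /['"][^'"]*['"]/;
console.log(paired.test("'hello'"));     // true
console.log(paired.test("'hello\""));    // false: the quotes do not match
console.log(unpaired.test("'hello\""));  // true: mismatched quotes are accepted
```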

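A regular expression can also specify the position at which a match must occur, using the following anchors:

^: matches the beginning of the string and, in multiline searches, the beginning of a line
$: matches the end of the string and, in multiline searches, the end of a line
\b: matches a word boundary
\B: matches a position that is not a word boundary
(?=p): zero-width positive lookahead assertion; requires that the following characters match p, but does not include those characters in the match
(?!p): zero-width negative lookahead assertion; requires that the following characters do not match p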

For example:

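/^JavaScript$/: matches the word "JavaScript" on its own, with nothing before or after it.
/\bJava\b/: matches "Java" only as a whole word.
/\B[Ss]cript/: matches "JavaScript" and "postscript", but not "script" or "Scripting".
/[Jj]ava([Ss]cript)?(?=\:)/: matches "Java" or "JavaScript" only when it is followed by a colon; it does not match "Java" in "Java in a Nutshell", because that "Java" is not followed by a colon.
/Java(?!Script)([A-Z]\w*)/: matches "Java" followed by a capital letter and any number of ASCII word characters, as long as "Java" is not followed by "Script".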
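The anchor examples above can be verified with a few calls. The input strings (such as "JavaScript: a guide" and "JavaBeans") and the variable names are assumptions chosen to show one matching and one non-matching case.

```javascript
console.log(/^JavaScript$/.test("JavaScript"));        // true
console.log(/^JavaScript$/.test("JavaScript rocks"));  // false

console.log(/\bJava\b/.test("I like Java"));  // true
console.log(/\bJava\b/.test("JavaScript"));   // false: no boundary after "Java"

console.log(/\B[Ss]cript/.test("postscript"));  // true
console.log(/\B[Ss]cript/.test("Scripting"));   // false: "S" sits at a word boundary

// The lookahead checks for the colon but leaves it out of the match.
const beforeColon = /[Jj]ava([Ss]cript)?(?=\:)/;
const matched = beforeColon.exec("JavaScript: a guide")[0];
console.log(matched);                                 // "JavaScript"
console.log(beforeColon.test("Java in a Nutshell"));  // false

const notScript = /Java(?!Script)([A-Z]\w*)/;
console.log(notScript.test("JavaBeans"));   // true
console.log(notScript.test("JavaScript"));  // false
```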

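A regular expression can also carry modifiers (flags), which are placed after the second slash. Three modifiers are described here:

i: perform case-insensitive matching
g: perform a global match, that is, find all matches rather than stopping after the first
m: multiline mode; ^ matches the beginning of a line as well as the beginning of the string, and $ matches the end of a line as well as the end of the string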

For example:

/java$/im: matches "java" and also matches "Java\nis fun".
/\bjava\b/gi: matches every occurrence of the word "java", ignoring case.
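The effect of each modifier can be seen by toggling it. The sample sentence in the last call is an assumption written for this sketch.

```javascript
console.log(/java$/im.test("java"));          // true
console.log(/java$/im.test("Java\nis fun"));  // true: $ matches before the newline
console.log(/java$/i.test("Java\nis fun"));   // false: without m, $ needs the end of the string

// g returns every match; i ignores case; \b excludes "javascript".
const words = "Java, java and JAVA, but not javascript".match(/\bjava\b/gi);
console.log(words);  // ["Java", "java", "JAVA"]
```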

Using regular expressions with String

With the syntax covered, how are regular expressions actually used? The String object provides four methods that accept them: search(), replace(), match(), and split(). Each behaves differently; see the String reference for full details. Only one interesting use of replace() is shown here:

var quote = /"([^"]*)"/g;
text.replace(quote, '“$1”');

The key to the replace() call above is $1, which stands for the text matched by the first parenthesized group. The call replaces each pair of straight double quotation marks with curly quotation marks and leaves the quoted text unchanged.
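To make the four String methods concrete, here is a minimal sketch that applies each of them once. The sample text, the list string, and the variable names are assumptions, not taken from the original article.

```javascript
const text = 'He said "stop" and then "go".';
const quote = /"([^"]*)"/g;

// replace(): $1 stands for the text captured by the first group.
const curly = text.replace(quote, "“$1”");
console.log(curly);  // He said “stop” and then “go”.

// search(): index of the first match, or -1 if there is none.
const position = text.search(/said/);
console.log(position);  // 3

// match() with the g flag: an array of all matched substrings.
const quoted = text.match(quote);
console.log(quoted);  // ['"stop"', '"go"']

// split(): the separator can itself be a pattern.
const parts = "1, 2,3 ,  4".split(/\s*,\s*/);
console.log(parts);  // ["1", "2", "3", "4"]
```

Note that replace() returns a new string; the original text is left unchanged.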

Using regular expressions with RegExp objects

The RegExp() constructor takes two arguments. The first is the text of the regular expression, that is, the part that appears between the two slashes of a regular expression literal; because it is passed as a string, every backslash in the pattern must itself be escaped with a backslash. The second argument is an optional string of modifiers, as in the following example:

var pattern = new RegExp("\\d{5}", "g");

One benefit of the RegExp() constructor is that a regular expression can be built dynamically at run time instead of being hard-coded as a literal, which gives more flexibility.
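One way to build a pattern at run time is sketched below. The helpers escapeRegExp and wholeWord are hypothetical names introduced for this example; they are not built-in functions and do not come from the original article.

```javascript
// Hypothetical helper: escape characters that are special in a pattern,
// so that arbitrary text can be embedded and matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a whole-word pattern for a word known only at run time.
function wholeWord(word, flags) {
  return new RegExp("\\b" + escapeRegExp(word) + "\\b", flags);
}

// Backslashes are doubled because the pattern is written as a string literal.
const fiveDigits = new RegExp("\\d{5}", "g");
console.log(fiveDigits.source);                      // \d{5}
console.log("94105 and 10001".match(fiveDigits));    // ["94105", "10001"]

const javaWord = wholeWord("java", "gi");
console.log("Java and javascript".match(javaWord));  // ["Java"]

// Without escaping, "." would match any character.
console.log(new RegExp(escapeRegExp("a.b")).test("axb"));  // false
```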

A RegExp object has five properties:

source: a read-only string containing the text of the regular expression
global: a read-only Boolean value indicating whether the regular expression has the g modifier
ignoreCase: a read-only Boolean value indicating whether the regular expression has the i modifier
multiline: a read-only Boolean value indicating whether the regular expression has the m modifier
lastIndex: a read/write integer; for patterns with the g modifier, it stores the position in the string at which the next search is to begin

A RegExp object has two methods, exec() and test(). exec() is similar to String.match(): it takes a string argument, returns null if no match is found, and otherwise returns an array that provides complete information about the match. test() is simpler: it also takes a string and returns true if the string contains a match for the regular expression.
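The properties of a RegExp object, and the way lastIndex advances for a pattern with the g modifier, can be observed as follows. The pattern and the string "Java java" are assumptions chosen for the demonstration.

```javascript
const re = /java/gi;

console.log(re.source);      // "java"
console.log(re.global);      // true
console.log(re.ignoreCase);  // true
console.log(re.multiline);   // false
console.log(re.lastIndex);   // 0

// With the g flag, each successful test() moves lastIndex past the match.
const first = re.test("Java java");
const afterFirst = re.lastIndex;   // 4: just after "Java"
const second = re.test("Java java");
const afterSecond = re.lastIndex;  // 9: just after the second "java"
// A failed search resets lastIndex to 0.
const third = re.test("Java java");
const afterThird = re.lastIndex;   // 0

console.log(first, afterFirst, second, afterSecond, third, afterThird);
```

Because of this state, reusing one global pattern across different strings can give surprising results unless lastIndex is reset to 0 first.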
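A common use of exec() is to loop over all matches of a global pattern, reading the captured groups and the index of each match. The sketch below assumes a made-up input string of number ranges; the names pattern, input, and found are illustrative.

```javascript
const pattern = /(\d+)-(\d+)/g;
const input = "ranges 10-20 and 30-45";

// exec() returns one match per call and null when there are no more.
const found = [];
let match;
while ((match = pattern.exec(input)) !== null) {
  found.push({
    whole: match[0],    // the full matched text
    from: match[1],     // first captured group
    to: match[2],       // second captured group
    index: match.index  // where the match starts in the input
  });
}
console.log(found);

// No match: exec() returns null, test() returns false.
console.log(pattern.exec("no digits here"));  // null
console.log(/\d/.test("abc1"));               // true
```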

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
