This article describes how regular expressions are defined in JavaScript and summarizes their syntax, for your reference. The details are as follows:
There are two ways to define a regular expression: call the RegExp() constructor directly, or write a regular expression literal, i.e. var re = /pattern/;
In both cases the result is a RegExp object.
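As a minimal sketch of the two definition styles, the snippet below builds the same pattern both ways. The variable names are illustrative only. Note that inside a string passed to the constructor, the backslash itself has to be escaped.

```javascript
// Literal form: the pattern is written between two slashes.
const fromLiteral = /\sjavascript/;

// Constructor form: the pattern is a string, so "\s" must be written "\\s".
const fromConstructor = new RegExp("\\sjavascript");

// Both are RegExp objects with the same pattern source.
const sameSource = fromLiteral.source === fromConstructor.source;
const bothRegExp = fromLiteral instanceof RegExp && fromConstructor instanceof RegExp;
const matches = fromConstructor.test("I like javascript");

console.log(sameSource, bothRegExp, matches); // true true true
```

The constructor form is mainly useful when the pattern is only known at run time; for fixed patterns the literal is usually easier to read.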
The same regular expression literal behaves differently in ECMAScript 3 and ECMAScript 5. Consider this function:
function Reg() {
    var re = /\sjavascript/;
    return re;
}
Call Reg() several times, first in an ECMAScript 3 environment and then in an ECMAScript 5 environment.
In ECMAScript 3, every call returns the same RegExp object, because the literal is converted to an object only once, when the code is parsed. In ECMAScript 5, every call returns a different RegExp object, because a new object is created each time the literal is evaluated.
This makes ECMAScript 3 a source of hidden bugs: as soon as the shared object is modified in one place (for example, its lastIndex property), every piece of code that uses it sees the change.
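The ECMAScript 5 behavior can be checked with a short sketch like the following, assuming it runs in an ES5-or-later engine (any current browser or Node.js). The g flag is added here only so that lastIndex is meaningful; it is not part of the original function.

```javascript
function Reg() {
  var re = /\sjavascript/g; // a new RegExp object on every evaluation (ES5+)
  return re;
}

var firstCall = Reg();
var secondCall = Reg();

// Different objects, so changing one does not affect the other.
var sameObject = firstCall === secondCall;
firstCall.lastIndex = 5;
var otherLastIndex = secondCall.lastIndex;

console.log(sameObject);     // false
console.log(otherLastIndex); // 0
```

Under the ECMAScript 3 rules described above, both calls would have returned one shared object and the second lastIndex would have been 5.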
1. Literal characters
In a regular expression, letters and digits normally match themselves, for example:
/javascript/
matches the character sequence "javascript" exactly.
Non-alphabetic characters are also supported, through escape sequences:
\0 NUL character (\u0000)
\t Tab (\u0009)
\n Line feed (\u000A)
\v Vertical tab (\u000B)
\f Form feed (\u000C)
\r Carriage return (\u000D)
\xnn The Latin character specified by the hexadecimal number nn; for example, \x0A is equivalent to \n
\uxxxx The Unicode character specified by the hexadecimal number xxxx; for example, \u0009 is equivalent to \t
\cX The control character ^X; for example, \cJ is equivalent to the line feed \n
In regular expressions, some punctuation characters have special meanings and must be escaped with a backslash (\) to be matched literally:
^ $ . * + ? = ! : | \ / ( ) [ ] { }
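The equivalences listed above, and the need to escape special punctuation, can be illustrated with a small sketch. The priceRe pattern and the sample strings are made up for the example.

```javascript
// Different escapes that denote the same character.
const hexIsNewline = /\x0A/.test("\n");    // \x0A is the line feed
const unicodeIsTab = /\u0009/.test("\t");  // \u0009 is the tab
const ctrlJIsNewline = /\cJ/.test("\n");   // \cJ is the line feed
const nulMatches = /\0/.test("\u0000");    // \0 is the NUL character

// "$" and "." are special, so they are escaped to match literally.
const priceRe = /\$1\.50/;
const literalMatch = priceRe.test("cost: $1.50");
const dotIsLiteral = priceRe.test("cost: $1x50");

console.log(hexIsNewline, unicodeIsTab, ctrlJIsNewline, nulMatches); // true true true true
console.log(literalMatch, dotIsLiteral); // true false
```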
2. Character classes
[...] Any character within the square brackets
[^...] Any character not in square brackets
. Any character except a line terminator
\w Any ASCII word character, equivalent to [a-zA-Z0-9_]
\W Any character that is not an ASCII word character, equivalent to [^a-zA-Z0-9_]
\s Any Unicode whitespace character
\S Any character that is not Unicode whitespace; note that \w and \S are not the same
\d Any ASCII digit, equivalent to [0-9]
\D Any character other than an ASCII digit, equivalent to [^0-9]
[\b] A literal backspace (special case)
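A few quick checks, written as a sketch with made-up sample strings, show how these classes behave, including why \w and \S differ and that the underscore counts as a word character.

```javascript
// The underscore belongs to \w, so it does not belong to \W.
const underscoreIsWord = /\w/.test("_");
const underscoreIsNonWord = /\W/.test("_");

// A hyphen is not whitespace (\S matches) but is not a word character either.
const hyphenIsNonSpace = /\S/.test("-");
const hyphenIsWord = /\w/.test("-");

// \d picks out runs of ASCII digits.
const digitRuns = "a1 b22".match(/\d+/g);

// Inside a character class, \b means backspace.
const backspaceMatches = /[\b]/.test("\b");

// A negated class fails when every character is excluded.
const negatedMatches = /[^abc]/.test("abc");

console.log(underscoreIsWord, underscoreIsNonWord); // true false
console.log(hyphenIsNonSpace, hyphenIsWord);        // true false
console.log(digitRuns);                             // [ '1', '22' ]
console.log(backspaceMatches, negatedMatches);      // true false
```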
3. Repetition (frequency)
? 0 or 1 times
+ 1 or more times
* 0 or more times
{n} Exactly n times
{m,n} At least m times, but no more than n times
{n,} n or more times
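By default, repetition is greedy: it matches as many characters as possible. Adding ? after a quantifier (??, +?, *?, {m,n}?) makes it non-greedy, so that it matches as few characters as possible.
For example, when /a+b+/ is applied to "aaabb", it does not match "ab" or "aab"; it matches the whole string "aaabb".
The non-greedy version /a+?b+?/ matches "aaab". Why the difference?
Because +? is non-greedy, b+? matches only a single b. Then why does a+? still match three a's? Because pattern matching always looks for the first position in the string at which a match is possible. A match is possible starting at the first character, so the shorter matches that start at later characters are never considered.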
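The greedy and non-greedy results described above can be reproduced with a short sketch. The variable names are illustrative.

```javascript
const sample = "aaabb";

// Greedy: both quantifiers take as much as they can.
const greedyMatch = sample.match(/a+b+/)[0];

// Non-greedy: b+? stops after one "b", but the match still begins at
// index 0, so a+? has to grow to "aaa" before a "b" can follow.
const lazyResult = sample.match(/a+?b+?/);
const lazyMatch = lazyResult[0];
const lazyStart = lazyResult.index;

console.log(greedyMatch); // "aaabb"
console.log(lazyMatch);   // "aaab"
console.log(lazyStart);   // 0
```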
4. Alternation | grouping | references
| Separates alternatives. For example, /ab|cd/ matches "ab" or "cd". Note: the alternatives are tried from left to right, so when /a|ab/ is applied to "ab", the first alternative succeeds and only "a" is matched; "ab" is never tried, even though it would be a better match.
() 1. Groups separate items into a single subexpression, so that operators such as |, *, + and ? can be applied to the group as a whole. For example, /java(script)?/ matches both "javascript" and "java".
2. Defines a subpattern within the complete pattern, so that the text it matched can be referred to later in the same expression. In /(['"])[a-z]\1/, \1 refers to the text matched by the first parenthesized group, that is, whichever quote character was actually matched.
3. Note on references to a previous subexpression: /['"][a-z]['"]/ means a single or double quote, then a lowercase letter, then a single or double quote; the opening and closing quotes do not have to be the same. To require that they match, write /(['"])[a-z]\1/.
\n A backslash followed by a number n refers to the text matched by the nth parenthesized group, counting left parentheses from left to right.
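The following sketch exercises alternation order, optional groups, and backreferences. The ^ and $ anchors in optionalGroup are added only to make the test check the whole string; the sample inputs are made up.

```javascript
// Alternatives are tried left to right: "a" wins before "ab" is tried.
const firstAlternative = "ab".match(/a|ab/)[0];

// A group makes the whole "script" suffix optional.
const optionalGroup = /^java(script)?$/;
const matchesJava = optionalGroup.test("java");
const matchesJavaScript = optionalGroup.test("javascript");

// Without a backreference the two quotes may differ.
const looseQuotes = /['"][a-z]['"]/;
// With \1 the closing quote must equal the opening one.
const sameQuotes = /(['"])[a-z]\1/;

const mixed = "'a\""; // single quote, letter, double quote
const looseOnMixed = looseQuotes.test(mixed);
const strictOnMixed = sameQuotes.test(mixed);
const strictOnSame = sameQuotes.test("'a'");

console.log(firstAlternative);                          // "a"
console.log(matchesJava, matchesJavaScript);            // true true
console.log(looseOnMixed, strictOnMixed, strictOnSame); // true false true
```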
5. Specifying the match position (anchors)
^ Matches the beginning of the string; in multiline mode, also matches the beginning of a line
$ Matches the end of the string; in multiline mode, also matches the end of a line
\b Matches a word boundary, that is, the position between a \w character and a \W character, or between a \w character and the beginning or end of the string
\B Matches a position that is not a word boundary
(?=p) Zero-width positive lookahead assertion: requires that the following characters match p, but does not include those characters in the match
(?!p) Zero-width negative lookahead assertion: requires that the following characters do not match p
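A brief sketch of the anchors and lookahead assertions, using made-up sample strings:

```javascript
// \b requires a word boundary on both sides of "java".
const wholeWord = /\bjava\b/;
const inSentence = wholeWord.test("I like java");
const insideWord = wholeWord.test("javascript");

// \B requires that "script" does NOT start at a word boundary.
const notAtBoundary = /\Bscript/.test("javascript");

// Positive lookahead: a colon must follow, but it is not part of the match.
const beforeColon = "JavaScript: The Guide".match(/[Jj]ava([Ss]cript)?(?=:)/)[0];

// Negative lookahead: "Java" must not be followed by "Script".
const notScript = /Java(?!Script)/;
const rejectsJavaScript = notScript.test("JavaScript");
const acceptsJavaBeans = notScript.test("JavaBeans");

console.log(inSentence, insideWord, notAtBoundary); // true false true
console.log(beforeColon);                           // "JavaScript"
console.log(rejectsJavaScript, acceptsJavaBeans);   // false true
```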
6. Modifiers (flags)
Modifiers are written after the closing / of a regular expression literal.
i Performs a case-insensitive match
g Performs a global match, that is, finds all matches instead of stopping after the first one
m Multiline mode: ^ matches the beginning of a line as well as the beginning of the string, and $ matches the end of a line as well as the end of the string. For example, /java$/m matches "java\nfunc"
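Note: when a regular expression is global, each successful call to exec() or test() sets its lastIndex property to the position immediately after the match, and the next call starts searching from lastIndex; a failed call resets lastIndex to 0. So when reusing a global regular expression on a new string, it is best to set lastIndex to 0 first.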
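The flags and the lastIndex pitfall can be seen in a sketch like this one. The name globalRe and the sample strings are illustrative.

```javascript
// i, g and m in action.
const caseInsensitive = /java/i.test("JAVA");
const allMatches = "java Java".match(/java/gi).length;
const multiline = /java$/m.test("java\nfunc");
const singleLine = /java$/.test("java\nfunc");

// A global regex remembers where the last match ended.
const globalRe = /java/g;
const firstTest = globalRe.test("java");   // match found at index 0
const indexAfterFirst = globalRe.lastIndex;
const secondTest = globalRe.test("java");  // search starts at index 4 and fails

globalRe.test("java");                     // succeeds again, lastIndex moves forward
globalRe.lastIndex = 0;                    // explicit reset before reuse
const afterReset = globalRe.test("java");

console.log(caseInsensitive, allMatches, multiline, singleLine); // true 2 true false
console.log(firstTest, indexAfterFirst, secondTest, afterReset); // true 4 false true
```

Without the explicit reset, the outcome of test() on a global regular expression depends on whatever call happened before it, which is the behavior the note above warns about.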
I hope this article will help you with JavaScript programming.