Definition
There are two ways to define a regular expression in JavaScript.
1. RegExp constructor
var pattern = new RegExp("[bc]at", "i");
It receives two parameters: one is the string pattern to match, and the other is the optional flag string.
2. Literal
var pattern = /[bc]at/i;
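As a quick check of the two forms above, the following sketch builds the same pattern both ways and compares the results. The variable names are illustrative and not part of the original text.

```javascript
// Build the same pattern with the constructor and with a literal.
const fromConstructor = new RegExp("[bc]at", "i");
const fromLiteral = /[bc]at/i;

// Both objects carry the same pattern text and the same flags.
console.log(fromConstructor.source === fromLiteral.source); // true
console.log(fromConstructor.flags === fromLiteral.flags);   // true

// Both match "Cat" because the i flag ignores case.
console.log(fromConstructor.test("Cat"), fromLiteral.test("Cat")); // true true
```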
A regular expression pattern supports three flags:
- g: global mode; the pattern is applied to the whole string instead of stopping at the first match;
- i: ignore case; the case of both the pattern and the string is ignored when determining a match;
- m: multiline mode; ^ and $ match at the start and end of every line of the text, not only at the start and end of the whole string.
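The effect of each flag can be seen on a small multi-line string. This is a minimal sketch; the sample text and variable names are made up for illustration.

```javascript
const text = "Cat\nbat\ncat";

// Without g, match() reports only the first hit; with g it returns every hit.
const first = text.match(/[bc]at/);      // first[0] is "bat"
const all = text.match(/[bc]at/g);       // ["bat", "cat"]

// i ignores case, so "Cat" is found as well.
const anyCase = text.match(/[bc]at/gi);  // ["Cat", "bat", "cat"]

// m lets ^ match at the start of every line, not only at the start of the string.
const lineStarts = text.match(/^[bc]at/gm); // ["bat", "cat"]
const stringStart = text.match(/^[bc]at/g); // null, because the string starts with "Cat"

console.log(first[0], all, anyCase, lineStarts, stringStart);
```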
The two forms also differ in when the RegExp object is created. Under ECMAScript 3, a regular expression literal always shared a single RegExp instance, whereas every call to the constructor created a new one. ECMAScript 5 changed this: each evaluation of a literal now creates a new RegExp instance, just as the constructor does.
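How instances are created can be observed directly. The sketch below assumes an engine that implements ECMAScript 5 or later (any current browser or Node.js), where each evaluation of a literal yields its own object; makeLiteral is a hypothetical helper used only for this check.

```javascript
// Hypothetical helper: evaluates the same regex literal each time it is called.
function makeLiteral() {
  return /[bc]at/g;
}

const a = makeLiteral();
const b = makeLiteral();
console.log(a === b); // false in ES5+ engines: two separate objects

// Because the objects are separate, the lastIndex state of one does not leak into the other.
a.test("cat cat");
console.log(a.lastIndex, b.lastIndex); // 3 0
```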
Metacharacters
Metacharacters are characters with special meanings. The metacharacters of regular expressions are mainly:
( [ { \ ^ $ | ) ? * + .
Metacharacters have different meanings in different combinations.
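Because these characters are special, matching one of them literally requires a backslash in front of it. A small sketch using the dot as the example (the sample strings are illustrative):

```javascript
// Unescaped, "." means "any character"; escaped, it means a literal dot.
const anyChar = /a.c/;
const dotOnly = /a\.c/;
console.log(anyChar.test("abc")); // true
console.log(dotOnly.test("abc")); // false
console.log(dotOnly.test("a.c")); // true

// Inside a constructor string the backslash itself must be escaped.
const dotFromString = new RegExp("a\\.c");
console.log(dotFromString.test("a.c")); // true
console.log(dotFromString.test("abc")); // false
```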
Pre-defined special characters
Character classes
Simple class
In general, one character in a regular expression corresponds to one character in the string, but we can use [] to build a simple class that stands for any character with a particular feature. For example:
[abc] matches a, b, or c, that is, any single one of the characters inside the square brackets.
Negated class
Since [] builds a class, it is natural to want a class for everything that is not inside the brackets. This is called a negated (reverse) class: for example, [^abc] matches any character that is not a, b, or c.
Range class
Listing characters one by one is tedious when they are all of the same kind. The "-" connector denotes a closed interval, so all lowercase letters can be matched with [a-z].
Any digit from 0 to 9 can be matched with [0-9].
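The three kinds of class can be tried on one short string. This is only a sketch; the sample string and variable names are invented for illustration.

```javascript
const sample = "a1b2c3";

// Simple class: any one of a, b, or c.
const simple = /[abc]/.test("b");            // true

// Negated class: removing everything that is NOT a, b, or c leaves the letters.
const lettersOnly = sample.replace(/[^abc]/g, ""); // "abc"

// Range classes: lowercase letters and digits.
const letters = sample.match(/[a-z]/g);      // ["a", "b", "c"]
const digits = sample.match(/[0-9]/g);       // ["1", "2", "3"]

console.log(simple, lettersOnly, letters, digits);
```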
Predefined classes
For several of the classes built above, regular expressions offer common predefined classes as shorthand:
Character | Equivalent class | Meaning
. | [^\n\r] | Any character except carriage return and line feed
\d | [0-9] | Digit characters
\D | [^0-9] | Non-digit characters
\s | [ \t\n\x0B\f\r] | Whitespace characters
\S | [^ \t\n\x0B\f\r] | Non-whitespace characters
\w | [a-zA-Z_0-9] | Word characters (letters, digits, and underscore)
\W | [^a-zA-Z_0-9] | Non-word characters
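A short sketch of the predefined classes in use follows. The sample sentence is made up, and the expected values in the comments assume plain ASCII input.

```javascript
const sample = "Room 42_b, floor 7";

const digits = sample.match(/\d/g);            // ["4", "2", "7"]
const nonWord = sample.match(/\W/g);           // [" ", ",", " ", " "]
const wordOnly = sample.replace(/\W/g, "");    // "Room42_bfloor7"
const parts = sample.split(/\s/);              // ["Room", "42_b,", "floor", "7"]

// "." does not match a line feed, so only the two letters are returned.
const dots = "a\nb".match(/./g);               // ["a", "b"]

console.log(digits, nonWord, wordOnly, parts, dots);
```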
Quantifiers
Everything so far matches characters one at a time, which becomes tedious when a character occurs many times in a row. Regular expressions therefore provide quantifiers for matching repeated occurrences directly:
Character | Meaning
? | Occurs 0 or 1 times (at most once)
+ | Occurs 1 or more times (at least once)
* | Occurs 0 or more times (any number of times)
{n} | Occurs exactly n times
{n,m} | Occurs n to m times
{n,} | Occurs at least n times
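Each quantifier in the table can be checked with a one-line test. In this sketch the sample words are illustrative, and ^ and $ are added only to pin each pattern to the whole string so that the counts are exact.

```javascript
const optionalU = /^colou?r$/;   // ? : zero or one "u"
const oneOrMoreO = /^go+gle$/;   // + : one or more "o"
const zeroOrMoreO = /^go*gle$/;  // * : zero or more "o"
const exactlyThree = /^\d{3}$/;  // {n}
const twoToThree = /^\d{2,3}$/;  // {n,m}
const atLeastTwo = /^\d{2,}$/;   // {n,}

console.log(optionalU.test("color"), optionalU.test("colour"));   // true true
console.log(oneOrMoreO.test("ggle"), oneOrMoreO.test("google"));  // false true
console.log(zeroOrMoreO.test("ggle"));                            // true
console.log(exactlyThree.test("123"), exactlyThree.test("12"));   // true false
console.log(twoToThree.test("1234"));                             // false
console.log(atLeastTwo.test("12345"));                            // true
```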
Greedy mode and non-greedy mode
With a form such as {n,m}, does the engine match n characters or m? This is a question of matching mode. By default a quantifier matches as many characters as possible, which is known as greedy mode. For example:
var num = '123456789';
num.match(/\d{2,4}/g); // ["1234", "5678"]; the trailing 9 is too short to match
To switch a quantifier to non-greedy mode, add "?" after it, as in {n,m}?, so that it matches as few characters as possible:
var num = '123456789';
num.match(/\d{2,4}?/g); // ["12", "34", "56", "78"]; the trailing 9 is again left over
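The difference between the two modes is easiest to see when the pattern has a closing delimiter. The following sketch uses a made-up HTML fragment; it illustrates the matching behaviour only and is not a recommended way to parse HTML.

```javascript
const html = "<b>bold</b>";

// Greedy: .+ runs as far as it can, so the match ends at the LAST ">".
const greedy = html.match(/<.+>/)[0];   // "<b>bold</b>"

// Non-greedy: .+? stops as early as it can, at the FIRST ">".
const lazy = html.match(/<.+?>/)[0];    // "<b>"

console.log(greedy, lazy);
```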
Grouping
A quantifier on its own repeats only a single character. What if we want to repeat a whole sequence of characters? Parentheses define a group, which lets a string be treated as a single unit.
To match the word apple repeated 4 times, use (apple){4}.
To match apple or orange 4 times, add the pipe symbol "|", for example:
(apple|orange){4}
When a regular expression contains several parenthesized groups, the match results are captured per group and numbered from left to right, for example:
(apple)\d+(orange)
If we do not want to capture a particular group, put a question mark and a colon right after its opening parenthesis, for example:
(?:apple)\d+(orange)
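The numbering of captured groups shows up in the array returned by match(). A minimal sketch, with an input string chosen only for illustration:

```javascript
const input = "apple123orange";

// Two capturing groups: index 0 is the whole match, 1 and 2 are the groups.
const captured = input.match(/(apple)\d+(orange)/);
console.log(captured[0], captured[1], captured[2]); // apple123orange apple orange

// With (?:...) the first group is not captured, so "orange" becomes group 1.
const partial = input.match(/(?:apple)\d+(orange)/);
console.log(partial.length, partial[1]); // 2 "orange"

// A quantifier after a group repeats the whole group.
const fourFruits = /^(apple|orange){4}$/;
console.log(fourFruits.test("appleorangeappleapple")); // true
console.log(fourFruits.test("appleorange"));           // false
```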
Boundary
Regular expressions also provide us with several commonly used boundary-matching characters, such as:
Character | Meaning
^ | Starts with xx
$ | Ends with xx
\b | Word boundary: a position between a word character [a-zA-Z_0-9] and a non-word character
\B | Non-word boundary
A word boundary matches a position, not a character: one side of the position is a word character, and the other side is a non-word character or the start or end of the string.
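The boundary characters can be compared on a few sample words. This is a sketch with invented strings; note that none of the boundary characters consumes any text.

```javascript
// ^ and $ anchor the pattern to the start and end of the string.
console.log(/^cat/.test("catalog"), /^cat/.test("bobcat")); // true false
console.log(/cat$/.test("bobcat"));                         // true

const phrase = "cat concat cat";

// \b requires a word boundary on each side, so the "cat" inside "concat" is skipped.
const wholeWords = phrase.match(/\bcat\b/g);        // ["cat", "cat"]

// \B requires a NON-boundary before "cat", so only the one inside "concat" qualifies.
const inner = phrase.replace(/\Bcat/g, "CAT");      // "cat conCAT cat"

console.log(wholeWords, inner);
```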
Lookahead
A lookahead tests whether the text that follows does or does not match a particular pattern.
Expression | Meaning
exp1(?=exp2) | Matches exp1 only when it is followed by exp2
exp1(?!exp2) | Matches exp1 only when it is not followed by exp2
See an example:
apple(?=orange)
(/apple(?=orange)/).test('appleorange123'); // true
(/apple(?=orange)/).test('applepear345'); // false
Let's look at another example:
apple(?!orange)
(/apple(?!orange)/).test('appleorange123'); // false
(/apple(?!orange)/).test('applepear345'); // true
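One detail that test() does not reveal is that the lookahead part is only checked, never included in the match. The sketch below makes this visible with match() and replace(); the input strings are illustrative.

```javascript
// The lookahead must succeed, but "orange" is not part of the matched text.
const positive = "appleorange123".match(/apple(?=orange)/);
console.log(positive[0]); // "apple"

// Only the "apple" that is NOT followed by "orange" is replaced.
const replaced = "appleorange applepear".replace(/apple(?!orange)/g, "APPLE");
console.log(replaced); // "appleorange APPLEpear"
```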