JavaScript Regular Expressions
A regular expression describes a pattern of characters to match within a string. Regular expressions come up often in everyday development, and a good command of them makes string handling much easier and improves development efficiency. This article walks through regular expressions in JavaScript.
### Creating a regular expression
JavaScript supports regular expressions through the built-in `RegExp` object, and there are two ways to create a `RegExp` instance.
- Literal form
  `let reg = /a/g` creates a regular expression that globally matches the letter a.
- Constructor form
  `let reg = new RegExp('a'[, flags])`, for example `new RegExp('a', 'g')`, which also globally matches the letter a.
- Object properties
  - `global`: set by the `g` flag; search the whole text instead of stopping at the first match. Defaults to false.
  - `ignoreCase`: set by the `i` flag; ignore case when searching. Defaults to false.
  - `multiline`: set by the `m` flag; multi-line search. Defaults to false.
  - `lastIndex`: the position right after the last character of the current match, which is where the next search starts.
  - `source`: the pattern text of the regular expression.
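
As a small sketch of the two creation forms and the properties listed above (the variable names `fromLiteral`, `fromConstructor`, and `digits` are made up for illustration), the following shows that both forms produce an equivalent object. It also shows one difference worth knowing: in the constructor form the pattern is a string, so backslashes have to be doubled.

```javascript
// Two equivalent ways to build the same pattern with the g and i flags.
const fromLiteral = /a/gi;
const fromConstructor = new RegExp('a', 'gi');

console.log(fromLiteral.source);         // a
console.log(fromConstructor.source);     // a
console.log(fromConstructor.global);     // true
console.log(fromConstructor.ignoreCase); // true
console.log(fromConstructor.multiline);  // false
console.log(fromConstructor.lastIndex);  // 0

// In the constructor form the pattern is a string literal,
// so a backslash must be written twice to reach the regex engine.
const digits = new RegExp('\\d+');
console.log(digits.source);         // \d+
console.log(digits.test('abc123')); // true
```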
### Metacharacters
In a regular expression, some characters stand for themselves, such as `a`, which matches the letter a. Others are non-alphabetic characters that carry a special meaning. These are called metacharacters, and the main ones are:
`( ) [ ] { } ? + * ^ $ . |`
A metacharacter can mean different things in different contexts. For example, `^` at the start of a pattern is a boundary that anchors the match to the beginning of the input, while as the first character inside `[]` it negates the class.
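
The following sketch illustrates the context-dependent meaning of `^`. It also shows a point the text above does not spell out but that follows from it: to match a metacharacter literally, it is escaped with a backslash. The constant names are illustrative only.

```javascript
// An unescaped dot is a metacharacter: it matches (almost) any character.
const dotAsMeta = /a.c/.test('abc');      // true
// Escaped with a backslash, the dot matches only a literal ".".
const dotEscapedMiss = /a\.c/.test('abc'); // false
const dotEscapedHit = /a\.c/.test('a.c');  // true

// ^ outside [] anchors the match to the start of the input.
const caretAnchor = /^a/.test('ba');       // false, the input starts with "b"
// ^ as the first character inside [] negates the class.
const caretNegate = /[^a]/.test('ba');     // true, "b" is a character other than "a"

console.log(dotAsMeta, dotEscapedMiss, dotEscapedHit, caretAnchor, caretNegate);
// true false true false true
```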
### Characters with special meanings

| Character | Meaning |
| --- | --- |
| `\t` | Horizontal tab |
| `\v` | Vertical tab |
| `\r` | Carriage return |
| `\n` | Line feed |
| `\0` | Null character |
| `\f` | Form feed |
| `\cX` | Control character corresponding to X (for example, Ctrl+X) |

### Character classes

#### Simple class

In general, each literal character in a regular expression matches one character: `/a/` matches a, and `/ab/` matches ab. We can also enclose a set of characters in `[]` to form a class that matches any one of the characters inside it. For example, `[abc]` matches a single a, b, or c.

```javascript
console.log(/[abc]/.test('a')) // true
console.log(/[abc]/.test('b')) // true
console.log(/[abc]/.test('c')) // true
```

#### Negated class

A negated class is still written in `[]`, but starts with `^`, which means the matched character must not be any of the characters in the class.
For example, `/[^abc]/` matches a character that is not a, b, or c.

```javascript
console.log(/[^abc]/.test('a'))  // false
console.log(/[^abc]/.test('b'))  // false
console.log(/[^abc]/.test('d'))  // true
console.log(/[^abc]/.test('da')) // true, because "d" is not in the class
```

#### Range class

If we want to match any one of the letters a through g, we could write `/[abcdefg]/`, but that is cumbersome and not very readable. For a consecutive run like this we can use a range class, joining the first and last characters with `-`, so the example becomes `/[a-g]/`.
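
A brief sketch of range classes in use. The names `lowerAtoG`, `alphanumeric`, and `digitOrHyphen` are made up for this example; the last case shows one common way to include a literal hyphen in a class, by placing it last.

```javascript
const lowerAtoG = /[a-g]/;
console.log(lowerAtoG.test('d')); // true
console.log(lowerAtoG.test('h')); // false

// Several ranges can be combined in one class.
const alphanumeric = /[a-zA-Z0-9]/;
console.log(alphanumeric.test('Z')); // true
console.log(alphanumeric.test('-')); // false

// A hyphen placed last in the class is a literal hyphen, not a range.
const digitOrHyphen = /[0-9-]/;
console.log(digitOrHyphen.test('-')); // true
console.log(digitOrHyphen.test('x')); // false
```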
#### Predefined classes

For some common matches, we can use special characters as shorthand. The equivalents below are approximate: in JavaScript, `.` also excludes a few other line terminators and `\s` also matches other Unicode whitespace.

| Character | Equivalent to | Meaning |
| --- | --- | --- |
| `.` | `[^\r\n]` | Any character other than line feed and carriage return |
| `\d` | `[0-9]` | Digit characters |
| `\D` | `[^0-9]` | Non-digit characters |
| `\s` | `[ \t\n\r\f\x0B]` | Whitespace characters |
| `\S` | `[^ \t\n\r\f\x0B]` | Non-whitespace characters |
| `\w` | `[a-zA-Z0-9_]` | Word characters (letters, digits, and `_`) |
| `\W` | `[^a-zA-Z0-9_]` | Non-word characters |

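
To make the predefined classes concrete, here is a small sketch that applies them to one sample string. The string and the variable names are invented for illustration.

```javascript
const sample = 'Order 42, shipped_on 2018-05-04';

// \d matches each digit; with the g flag, match() returns all of them.
const digitCount = sample.match(/\d/g).length;
console.log(digitCount); // 10

// \s matches whitespace, so this removes every space.
const noSpaces = sample.replace(/\s/g, '');
console.log(noSpaces); // Order42,shipped_on2018-05-04

// \W matches anything that is not a letter, digit, or underscore.
const wordCharsOnly = sample.replace(/\W/g, '');
console.log(wordCharsOnly); // Order42shipped_on20180504

// . does not match a line feed.
const dotMatchesNewline = /./.test('\n');
console.log(dotMatchesNewline); // false
```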
### Quantifiers
The character classes above each match exactly one character. To match a longer run, such as 50 word characters, we would have to write `\w` 50 times, which is tedious and error-prone. Fortunately, quantifiers simplify this.

#### Simple quantifiers

| Quantifier | Meaning |
| --- | --- |
| `?` | Appears 0 or 1 time (at most once) |
| `+` | Appears 1 or more times (at least once) |
| `*` | Appears 0 or more times (any number of times, including none) |
| `{n}` | Appears exactly n times |
| `{n,m}` | Appears at least n times, but no more than m times |
| `{n,}` | Appears at least n times |

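
The quantifiers in the table can be tried out with a sketch like the one below. The patterns are wrapped in `^` and `$` (covered in the section on boundaries) so that the whole input must match; the sample inputs are arbitrary.

```javascript
// ? : the "u" may appear 0 or 1 time.
console.log(/^colou?r$/.test('color'));   // true
console.log(/^colou?r$/.test('colour'));  // true
console.log(/^colou?r$/.test('colouur')); // false

// + needs at least one digit, * accepts none.
console.log(/^\d+$/.test('')); // false
console.log(/^\d*$/.test('')); // true

// {n}: exactly 50 word characters, without writing \w fifty times.
const fiftyWordChars = /^\w{50}$/;
console.log(fiftyWordChars.test('a'.repeat(50))); // true
console.log(fiftyWordChars.test('a'.repeat(49))); // false

// {n,m} and {n,}
console.log(/^\d{2,4}$/.test('12345')); // false, more than 4 digits
console.log(/^\d{2,}$/.test('12345'));  // true, at least 2 digits
```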
#### Greedy mode and non-greedy mode

Consider `'abcdefg'.replace(/\w{4,7}/g, 'x')`. The pattern `/\w{4,7}/g` could match as little as abcd or as much as abcdefg, so which part gets replaced? The answer depends on whether the quantifier is greedy or non-greedy.
- Greedy mode
  - Greedy mode, as its name implies, matches as much as possible.
  - Simple quantifiers are greedy by default.
  - The example above is therefore greedy: it matches as many characters as it can, so the whole of abcdefg is replaced and the result is `x`.
- Non-greedy mode
  - Non-greedy mode is the opposite of greedy mode: it matches as few characters as possible.
  - To make a quantifier non-greedy, add `?` after it.
  - Rewritten in non-greedy mode, the example becomes `'abcdefg'.replace(/\w{4,7}?/g, 'x')`: abcd is replaced by x, the remaining efg is too short to match again, and the result is `xefg`.
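
The difference between the two modes can be checked directly. The sketch below reruns the example from the text and adds a second, commonly seen case; the `html` string is a made-up sample.

```javascript
const text = 'abcdefg';

// Greedy: \w{4,7} takes all 7 characters.
const greedy = text.replace(/\w{4,7}/g, 'x');
console.log(greedy); // x

// Non-greedy: \w{4,7}? stops after the minimum of 4 characters.
const lazy = text.replace(/\w{4,7}?/g, 'x');
console.log(lazy); // xefg

// The same effect with * on a longer input.
const html = '<b>one</b><b>two</b>';
const greedyTag = html.match(/<b>.*<\/b>/)[0];
const lazyTag = html.match(/<b>.*?<\/b>/)[0];
console.log(greedyTag); // <b>one</b><b>two</b>
console.log(lazyTag);   // <b>one</b>
```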
### Grouping
Suppose we need to match victorvictor. How should we write the pattern? `victor{2}` does not work: the quantifier applies only to the preceding r, so it matches victorr. Instead, we can enclose victor in `()` to make it a single unit, which is called grouping. `/(victor){2}/` then matches what we want.
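
A quick check of the point above, as a sketch with anchored patterns so that the whole input has to match:

```javascript
// Without a group, {2} applies only to the final "r".
const ungrouped = /^victor{2}$/;
console.log(ungrouped.test('victorr'));      // true
console.log(ungrouped.test('victorvictor')); // false

// With a group, {2} applies to the whole word.
const grouped = /^(victor){2}$/;
console.log(grouped.test('victorvictor'));   // true
console.log(grouped.test('victorr'));        // false
```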
### Backreferences
Suppose we want to convert a date from the format yyyy-mm-dd to mm/dd/yyyy. To do that, we need to capture the parts of the matched date and rearrange them.
- Each group captures the text it matched, and we can refer to that text through a backreference numbered `$1`, `$2`, and so on.
- The example above can be written as
  `'2018-05-04'.replace(/([\d]{4})-([\d]{2})-([\d]{2})/g, '$2/$3/$1')`, which gives 05/04/2018.
- When groups are nested, how are the backreferences numbered? Groups are numbered by the position of their opening parenthesis, from left to right:
  `/(victor (and victor){2}){3}/`
  `$1`: group #1 (the outer group), `$2`: group #2 (the inner group)
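
The numbering rule can be verified with `exec()`, whose result array holds the captured text of each group at the group's index. The sketch below uses a date pattern with one extra outer group, invented for this illustration, and also shows the in-pattern form `\1`, which refers back to group 1 inside the expression itself.

```javascript
// Rearranging captured groups in the replacement string.
const reformatted = '2018-05-04'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$2/$3/$1');
console.log(reformatted); // 05/04/2018

// Nested groups are numbered by their opening parenthesis, left to right.
const nested = /((\d{4})-(\d{2}))-(\d{2})/.exec('2018-05-04');
console.log(nested[1]); // 2018-05 (outer group)
console.log(nested[2]); // 2018
console.log(nested[3]); // 05
console.log(nested[4]); // 04

// Inside a pattern, \1 matches the same text that group 1 captured.
const doubledChar = /(\w)\1/;
console.log(doubledChar.test('hello')); // true, "ll"
console.log(doubledChar.test('world')); // false
```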
### Non-capturing groups
As shown above, every group gets a backreference. What if we need a group that does not get one?
Adding `?:` at the start of a group makes it non-capturing.
`/([\d]{4})-(?:[\d]{2})-([\d]{2})/`
Here the month is grouped but not captured, so `$1` is the year and `$2` is the day.
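
As a sketch of the effect, the following compares a date pattern with three capturing groups against a variant in which the month is wrapped in `(?:...)`. The variable names are illustrative.

```javascript
const date = '2018-05-04';

// Three capturing groups: the result holds the full match plus 3 captures.
const capturing = /(\d{4})-(\d{2})-(\d{2})/.exec(date);
console.log(capturing.length); // 4

// The month is grouped with (?:...) and is not captured.
const nonCapturing = /(\d{4})-(?:\d{2})-(\d{2})/.exec(date);
console.log(nonCapturing.length); // 3
console.log(nonCapturing[1]);     // 2018
console.log(nonCapturing[2]);     // 04 (the day is now group 2)

// In a replacement, $2 therefore refers to the day.
const dayAndYear = date.replace(/(\d{4})-(?:\d{2})-(\d{2})/, '$2/$1');
console.log(dayAndYear); // 04/2018
```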
### Alternation
We can insert `|` into a regular expression to split it into multiple alternatives.
`/Kobe|James/` matches Kobe or James.
`/Ko(be|Ja)mes/` matches Kobemes or KoJames, because the alternation is limited to the group.
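
The scope of `|` is easy to get wrong, so here is a short sketch. The last case is an extra observation not in the text above: without a group, `|` splits the entire pattern, including any anchors.

```javascript
// | inside a group only chooses between the grouped alternatives.
const scoped = /^Ko(be|Ja)mes$/;
console.log(scoped.test('Kobemes')); // true
console.log(scoped.test('KoJames')); // true
console.log(scoped.test('Kobe'));    // false

// Grouping the alternatives keeps the anchors applied to both names.
const eitherName = /^(Kobe|James)$/;
console.log(eitherName.test('James'));       // true
console.log(eitherName.test('Kobe Bryant')); // false

// Without the group this reads as "starts with Kobe" OR "ends with James".
const unscoped = /^Kobe|James$/;
console.log(unscoped.test('Kobe Bryant'));   // true
```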
### Lookahead
After a pattern has matched part of a string, we can also look ahead at the text that follows to check whether it meets a further condition, without including that text in the match.

| Syntax | Name | Meaning |
| --- | --- | --- |
| `(?=exp)` | Positive lookahead | Asserts that the text that follows matches exp |
| `(?!exp)` | Negative lookahead | Asserts that the text that follows does not match exp |

`/victor(?=\d{4})/` matches victor only when it is followed by four digits.

```javascript
console.log(/victor(?=\d{4})/.test('victor12')) // false
```

`/victor(?!\d{4})/` matches victor only when it is not followed by four digits.
- A lookahead does not capture, that is, `(?=exp)` has no corresponding backreference.
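
The following sketch exercises both lookahead forms and checks the two properties described above: the looked-ahead text is not part of the match, and the lookahead creates no capture. The constant names are illustrative.

```javascript
const followedByFourDigits = /victor(?=\d{4})/;
const notFollowedByFourDigits = /victor(?!\d{4})/;

console.log(followedByFourDigits.test('victor2018'));    // true
console.log(followedByFourDigits.test('victor12'));      // false
console.log(notFollowedByFourDigits.test('victor12'));   // true
console.log(notFollowedByFourDigits.test('victor2018')); // false

// The digits are checked but not consumed by the match.
const result = followedByFourDigits.exec('victor2018');
console.log(result[0]);     // victor
// Only the full match is present: the lookahead adds no group.
console.log(result.length); // 1
```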
### Boundaries

| Syntax | Meaning |
| --- | --- |
| `^` | Outside `[]`, matches at the beginning |
| `$` | Matches at the end |
| `\b` | Word boundary |
| `\B` | Non-word boundary |

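
A few boundary checks, as a sketch with arbitrary sample words. The last two lines rely on the `m` (multiline) flag mentioned earlier, under which `^` and `$` also match at the start and end of each line.

```javascript
// ^ and $ anchor to the start and end of the input.
console.log(/^cat/.test('catalog')); // true
console.log(/cat$/.test('catalog')); // false

// \b requires a word boundary on that side.
const wholeWord = /\bcat\b/;
console.log(wholeWord.test('a cat sat')); // true
console.log(wholeWord.test('catalog'));   // false

// \B requires that there is no word boundary on that side.
const insideWord = /\Bcat\B/;
console.log(insideWord.test('concatenate')); // true
console.log(insideWord.test('a cat sat'));   // false

// With the m flag, ^ also matches right after a line break.
console.log(/^b/.test('a\nb'));  // false
console.log(/^b/m.test('a\nb')); // true
```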
### Common methods of regular expressions

#### RegExp.prototype.test(str)

Tests whether the given string matches the regular expression. Returns true on a match, otherwise false.
- Non-global call (without the `g` flag)

```javascript
let reg1 = /\w/;
let str = 'ab';
console.log(reg1.test(str), reg1.lastIndex) // true 0
console.log(reg1.test(str), reg1.lastIndex) // true 0
console.log(reg1.test(str), reg1.lastIndex) // true 0
console.log(reg1.test(str), reg1.lastIndex) // true 0
```

In non-global mode, `lastIndex` stays at 0 and has no effect.
- Global call (with the `g` flag)

```javascript
let reg1 = /\w/g;
let str = 'ab';
console.log(reg1.test(str), reg1.lastIndex) // true 1
console.log(reg1.test(str), reg1.lastIndex) // true 2
console.log(reg1.test(str), reg1.lastIndex) // false 0
console.log(reg1.test(str), reg1.lastIndex) // true 1
console.log(reg1.test(str), reg1.lastIndex) // true 2
console.log(reg1.test(str), reg1.lastIndex) // false 0
```

As shown above, in global mode each call to `test()` updates `lastIndex`, and the next call starts searching from that position.

Recommendation: when you only use `test()` to check whether a string contains a pattern, do not put the `g` flag on the regular expression.
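
The recommendation matters most when one regex object is reused across different strings. The sketch below, with a made-up `inputs` array, shows a global regex giving a wrong answer for the second string because `lastIndex` carries over from the first.

```javascript
const inputs = ['a1', 'b2', 'c3'];

// Reusing a global regex: lastIndex carries over between calls.
const hasDigitGlobal = /\d/g;
const withGlobal = inputs.map((s) => hasDigitGlobal.test(s));
console.log(withGlobal); // [ true, false, true ]

// Without the g flag every call starts from index 0.
const hasDigit = /\d/;
const withoutGlobal = inputs.map((s) => hasDigit.test(s));
console.log(withoutGlobal); // [ true, true, true ]

// If a global regex must be reused, resetting lastIndex by hand also works.
const withReset = inputs.map((s) => {
  hasDigitGlobal.lastIndex = 0;
  return hasDigitGlobal.test(s);
});
console.log(withReset); // [ true, true, true ]
```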
#### RegExp.prototype.exec(str)

- Searches the string with the regular expression and updates the properties of the global `RegExp` object to reflect the result of the match.
- Returns null if nothing matches; otherwise returns an array that carries two extra properties:
  - `index`: the position of the first character of the matched text
  - `input`: the string that was searched

Non-global call (without the `g` flag)

Calling `exec()` on a non-global `RegExp` object returns an array in which:
- The first element is the text that matches the regular expression
- The second element is the text that matches the first group (if any)
- The third element is the text that matches the second group (if any)
- ...

```javascript
let reg1 = /\w/;
let str = '$ab';
let temp = reg1.exec(str);
console.log(temp, temp.index, temp.input, reg1.lastIndex) // ['a'] 1 "$ab" 0
temp = reg1.exec(str);
console.log(temp, temp.index, temp.input, reg1.lastIndex) // ['a'] 1 "$ab" 0
temp = reg1.exec(str);
console.log(temp, temp.index, temp.input, reg1.lastIndex) // ['a'] 1 "$ab" 0
temp = reg1.exec(str);
console.log(temp, temp.index, temp.input, reg1.lastIndex) // ['a'] 1 "$ab" 0
```

In non-global mode, every call gives the same result: only the first match is found, and `lastIndex` stays at 0 and has no effect.

Global call (with the `g` flag)

```javascript
let reg1 = /\w/g;
let str = '$ab';
let temp = reg1.exec(str);
console.log(temp, temp.index, temp.input, reg1.lastIndex) // ['a'] 1 "$ab" 2
temp = reg1.exec(str);
console.log(temp, temp.index, temp.input, reg1.lastIndex) // ['b'] 2 "$ab" 3
temp = reg1.exec(str);
console.log(temp, reg1.lastIndex) // null 0
temp = reg1.exec(str);
console.log(temp, temp.index, temp.input, reg1.lastIndex) // ['a'] 1 "$ab" 2
```

As shown above, in global mode each call to `exec()` updates `lastIndex`, and the next call starts searching from that position.
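
Because a global `exec()` resumes from `lastIndex` and returns null once the input is exhausted, it is commonly called in a loop to collect every match together with its groups. The following is a minimal sketch of that idiom; the pattern, the `source` string, and the shape of the collected objects are assumptions made for the example.

```javascript
const letterDigit = /(\w)(\d)/g;
const source = 'a1 b2 c3';
const found = [];

let match;
// exec() returns null when no further match exists, which ends the loop
// and resets lastIndex to 0.
while ((match = letterDigit.exec(source)) !== null) {
  found.push({
    text: match[0],   // full match
    letter: match[1], // first group
    digit: match[2],  // second group
    index: match.index,
  });
}

console.log(found.map((m) => m.text));  // [ 'a1', 'b2', 'c3' ]
console.log(found.map((m) => m.index)); // [ 0, 3, 6 ]
console.log(letterDigit.lastIndex);     // 0
```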