Pre-defined Classes
Regular expressions provide predefined classes as shorthand for commonly used character classes:
Character | Equivalent class | Meaning |
. | [^\r\n] | All characters except carriage return and line feed |
\d | [0-9] | Numeric characters |
\D | [^0-9] | Non-numeric characters |
\s | [ \t\n\x0b\f\r] | Whitespace characters |
\S | [^ \t\n\x0b\f\r] | Non-whitespace characters |
\w | [A-Za-z_0-9] | Word characters (letters, digits, underscore) |
\W | [^A-Za-z_0-9] | Non-word characters |
Predefined classes let us match a target with much less code. For example, to match ab followed by a digit and then any character, the version written with character classes and range classes is /ab[0-9][^\r\n]/, while the version written with predefined classes is only:
/ab\d./
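As a rough check of the equivalences listed above, the following sketch compares each predefined class with its hand-written form. The sample string and the helper name `sameMatches` are made up for illustration. Note that in real JavaScript engines `\s` also matches some Unicode whitespace beyond the listed set, so the comparison is only meaningful for ASCII input like this one.

```javascript
// Hypothetical helper: true when both patterns find the same matches in str.
function sameMatches(str, a, b) {
  const ma = str.match(a) || [];
  const mb = str.match(b) || [];
  return ma.join('|') === mb.join('|');
}

// Arbitrary ASCII sample containing letters, digits, spaces and a tab
const sample = 'ab1x ab22 abc_9 ab7\tz';

// Predefined class vs. the explicit class it stands for
console.log(sameMatches(sample, /\d/g, /[0-9]/g));           // true
console.log(sameMatches(sample, /\D/g, /[^0-9]/g));          // true
console.log(sameMatches(sample, /\w/g, /[A-Za-z_0-9]/g));    // true
console.log(sameMatches(sample, /\s/g, /[ \t\n\x0b\f\r]/g)); // true

// "ab" + digit + any character, in short and long form
const shortForm = sample.match(/ab\d./g);
const longForm = sample.match(/ab[0-9][^\r\n]/g);
console.log(shortForm); // three matches: 'ab1x', 'ab22', 'ab7\t'
console.log(longForm);  // the same three matches
```

'abc_9' is skipped because the character after ab is not a digit, and the last match ends with a tab because . accepts any character other than a line terminator.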
Boundary
Regular expressions also provide several commonly used boundary-matching characters:
Character | Meaning |
^ | Starts with xxx |
$ | Ends with xxx |
\b | Word boundary |
\B | Non-word boundary |
Boundaries are useful when we want to match a whole word in the text rather than the same letters inside another word.
Word boundaries and non-word boundaries
Sometimes we want to match the word is in a sentence rather than the letters is inside another word. Word boundaries solve this easily:
let text = 'This is a boy'
let reg1 = /is/g
let reg2 = /\bis\b/g
text.replace(reg1, 'IS') // no word boundary \b used, result: ThIS IS a boy
text.replace(reg2, 'IS') // word boundaries used, result: This IS a boy
And what if we only want to match the is at the end of a longer word? Word boundaries and non-word boundaries can be mixed:
let text = 'This is a boy'
let reg3 = /\Bis\b/g
text.replace(reg3, 'IS') // ThIS is a boy
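One way to see what \b and \B actually match is to mark every position where they succeed. Both are zero-width: they match a position between characters, not a character. The sketch below uses '|' as an arbitrary marker, which is a choice made here for illustration only.

```javascript
const sentence = 'This is a boy';

// Insert a marker at every word boundary:
// each word ends up wrapped in markers.
const boundaries = sentence.replace(/\b/g, '|');
console.log(boundaries); // |This| |is| |a| |boy|

// Insert a marker at every non-word boundary:
// markers only appear between two letters of the same word.
const nonBoundaries = sentence.replace(/\B/g, '|');
console.log(nonBoundaries); // T|h|i|s i|s a b|o|y
```

This is why /\Bis\b/ only hits the is inside This: the position before a standalone is is a word boundary, so \B cannot match there.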
^ and $: start and end
Often we want to match only at the beginning or the end of the string. ^ and $ solve this problem:
let text = '@123@abc@'
let reg1 = /@/g
text.replace(reg1, 'Q') // no ^ or $ used, every @ is matched, result: Q123QabcQ
let reg2 = /^@/g
text.replace(reg2, 'Q') // ^ matches the @ at the start, result: Q123@abc@
let reg3 = /@$/g
text.replace(reg3, 'Q') // $ matches the @ at the end, result: @123@abcQ
Tip: in practice, ^ must be written before the pattern to match, and $ must be written after it.
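A small sketch of why the placement matters, using a sample string chosen for illustration. Without the m flag, ^ can only succeed at position 0 and $ only at the very end, so putting a character before ^ or after $ produces a pattern that can never match.

```javascript
const text = '@123@abc@';

// Anchors in the right place
console.log(/^@/.test(text)); // true: the string starts with @
console.log(/@$/.test(text)); // true: the string ends with @

// Anchors in the wrong place: nothing can precede the start
// of the string or follow its end, so these never match.
console.log(/@^/.test(text)); // false
console.log(/$@/.test(text)); // false

// Using both anchors constrains the whole string.
console.log(/^@.*@$/.test(text));  // true: starts and ends with @
console.log(/^@\d+@$/.test(text)); // false: 'abc@' follows the second @
```

Combining ^ and $ in one pattern is a common way to check an entire string rather than search inside it.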
Use ^ and $ in multi-line cases
When the text spans multiple lines, add the m flag to enter multiline mode, so that ^ and $ match at the beginning and end of every line:
let text = '@123\n@456\n@789'
let reg1 = /^@\d/g
text.replace(reg1, 'Q')
/*
  A line break is really just a newline character, so in normal mode
  the text is still treated as a single string.
  Result:
  Q23
  @456
  @789
*/
let reg2 = /^@\d/gm
text.replace(reg2, 'Q')
/*
  The m flag enters multiline mode.
  Result:
  Q23
  Q56
  Q89
*/
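The m flag changes $ in the same way as ^. The sketch below is an illustrative addition on the same kind of three-line sample; it shows $ with and without m, and a pattern that uses both anchors to pick out whole lines.

```javascript
const text = '@123\n@456\n@789';

// Without m, $ only matches at the end of the whole string,
// so only the last digit of the last line is replaced.
const endOnly = text.replace(/\d$/g, 'Q');
console.log(endOnly === '@123\n@456\n@78Q'); // true

// With m, $ also matches just before each newline,
// so the last digit of every line is replaced.
const everyLine = text.replace(/\d$/gm, 'Q');
console.log(everyLine === '@12Q\n@45Q\n@78Q'); // true

// Both anchors with m: each line must match the pattern as a whole.
const lines = text.match(/^@\d+$/gm);
console.log(lines); // the three lines '@123', '@456', '@789'

// Both anchors without m: the whole string would have to be
// one "@digits" run, which it is not.
const whole = text.match(/^@\d+$/g);
console.log(whole); // null
```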
JS regular expressions from getting started to being buried (4)--predefined classes and boundaries