1. To match a special character literally, precede it with a backslash (\). The special characters are \ . ^ $ ? + * { } ( ) [ ] |
2. To match any one of a set of characters, use a character class: one or more characters enclosed in square brackets. A character class is itself an expression; for example, the regex r[ea]d matches red and the rad in radar, but not read.
3. Within a character class, special characters other than \ lose their special meaning, with two exceptions:
- ^ has its special meaning (negation) only when it is the first character of the class; anywhere else it is a literal ^ symbol
- - (hyphen) denotes a range of characters, except when it is the first character of the class, where it is a literal hyphen
4. Shorthand for character classes:
- . (dot): Matches any character except a newline, or any character at all with the re.DOTALL flag; inside a character class it matches a literal .
- \d: Matches a Unicode digit, or [0-9] with the re.ASCII flag
- \D: Matches a Unicode non-digit, or [^0-9] with the re.ASCII flag
- \s: Matches a Unicode whitespace character, or [ \t\n\r\f\v] with the re.ASCII flag
- \S: Matches a Unicode non-whitespace character, or [^ \t\n\r\f\v] with the re.ASCII flag
- \w: Matches a Unicode word character, or [a-zA-Z0-9_] with the re.ASCII flag
- \W: Matches a Unicode non-word character, or [^a-zA-Z0-9_] with the re.ASCII flag
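
The rules above can be checked with a short sketch using Python's re module. The sample strings and variable names are illustrative assumptions, not part of the original notes.

```python
import re

# Rule 1: a backslash makes a special character literal.
price = re.search(r"\$5\.00", "Total: $5.00").group()
# re.escape() adds the backslashes for you.
escaped = re.escape("a.b*c")

# Rule 2: a character class matches exactly one character from the set.
red_rad = re.findall(r"r[ea]d", "red radar read")

# Rule 3: inside a class, most special characters are literal.
literal_specials = re.findall(r"[.+*]", "1.5+2*x")
negated = re.findall(r"[^0-9]", "a1b2")       # leading ^ negates the class
literal_caret = re.findall(r"[0-9^]", "2^3")  # ^ elsewhere is a literal
literal_hyphen = re.findall(r"[-a]", "a-b")   # leading - is a literal hyphen
ranged = re.findall(r"[a-c]", "a-b")          # - between characters is a range

# Rule 4: shorthand classes, with and without flags.
dot_default = re.findall(r"a.c", "abc a\nc")
dot_all = re.findall(r"a.c", "abc a\nc", re.DOTALL)

# "\u0663" is ARABIC-INDIC DIGIT THREE, a Unicode digit outside [0-9].
digits_unicode = re.findall(r"\d", "7\u0663")
digits_ascii = re.findall(r"\d", "7\u0663", re.ASCII)

# "\u00e9" is a word character in Unicode mode but not in ASCII mode.
words_unicode = re.findall(r"\w+", "caf\u00e9_1")
words_ascii = re.findall(r"\w+", "caf\u00e9_1", re.ASCII)

# Uppercase shorthands are the complements of the lowercase ones.
non_digits = re.findall(r"\D", "a1b")
non_space_runs = re.findall(r"\S+", "a\tb\nc")
non_word = re.findall(r"\W", "a-b c")
```

Note that the flags only change what the shorthand classes cover; an explicit class such as [0-9] matches the same characters regardless of re.ASCII.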