In Python, learning regular expressions means learning how to use the re module. This article shows some advanced techniques that are worth mastering.
Compiling a regular expression object
The re.compile function builds a regular expression object from a pattern string and optional flags. The object provides a set of methods for matching and substitution. Usage differs slightly from the module-level functions: without compile, you pass the pattern string directly to a function such as re.match every time you match a string.
With compile, you create the pattern object first and then call its match method on the string.
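The two styles described above can be sketched side by side. The sample text and pattern are illustrative assumptions, not taken from the original article.

```python
import re

text = "hello world"  # sample input, chosen for illustration

# Without compile: pass the pattern string to the module-level function.
m1 = re.match(r"hello", text)

# With compile: build the pattern object once, then call its methods.
regex = re.compile(r"hello")
m2 = regex.match(text)

print(m1.group(), m2.group())
```

Both calls return an equivalent match object; the only difference is where the pattern lives.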
Why use it? To speed up matching by reusing the regular expression object. Timing the two approaches makes the difference visible.
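A minimal way to make that comparison is the standard timeit module. The pattern, the text, and the repetition count below are assumptions for illustration, and the measured numbers will vary by machine and Python version.

```python
import re
import timeit

text = "hello world"          # sample input (assumption)
pattern = r"(\w+) (\w+)"      # sample pattern (assumption)
regex = re.compile(pattern)

def match_uncompiled():
    # Module-level call: the pattern string is looked up on every call.
    return re.match(pattern, text)

def match_compiled():
    # Method call on the object that was compiled once above.
    return regex.match(text)

n = 100_000
t_uncompiled = timeit.timeit(match_uncompiled, number=n)
t_compiled = timeit.timeit(match_compiled, number=n)
print(f"re.match:    {t_uncompiled:.4f}s")
print(f"regex.match: {t_compiled:.4f}s")
```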
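The second approach is faster. The module-level functions do cache compiled patterns, so the saving comes mainly from skipping that lookup on each call. In real work you will find that the more often a compiled regular expression object is reused, the greater the benefit.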
Grouping (group)
You may have seen groups used to pick out parts of the matched content.
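As a small sketch of basic grouping, the example below splits a date string into three groups. The date pattern and input are assumptions chosen for illustration.

```python
import re

# Each pair of parentheses captures one part of the date.
m = re.match(r"(\d{4})-(\d{2})-(\d{2})", "2016-08-01")

print(m.group(0))  # the whole match
print(m.group(1), m.group(2), m.group(3))  # the individual groups
year, month, day = m.groups()  # all groups as a tuple
```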
Putting parentheses around each part you want to capture lets you map every group precisely to a piece of the result. Groups can also be nested.
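A nested grouping might look like the following sketch, again using an assumed date pattern. Groups are numbered by the position of their opening parenthesis, so the outer group comes before the groups it contains.

```python
import re

# Group 1 wraps groups 2 and 3; group 4 stands alone.
m = re.match(r"((\d{4})-(\d{2}))-(\d{2})", "2016-08-01")

year_month = m.group(1)  # outer group: year and month together
year = m.group(2)        # first inner group
month = m.group(3)       # second inner group
day = m.group(4)
print(year_month, year, month, day)
```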
Numbered groups meet the need, but they are sometimes not very readable, so you can give the groups names.
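Named groups use the (?P<name>...) syntax. The group names and the date input in this sketch are illustrative assumptions.

```python
import re

pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = pattern.match("2016-08-01")

# Groups can be read by name instead of by number.
print(m.group("year"), m.group("month"), m.group("day"))

# groupdict() returns every named group as a dictionary.
parts = m.groupdict()
print(parts)
```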
With named groups the code is far more readable.
String substitution with backreferences
Readers who have used sed may have seen substitutions whose replacement text contains \1.
This \1 stands for the text captured by the first group of the match. A sed command of this kind can, for example, wrap the matched text in brackets.
The re module supports the same usage.
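A minimal sketch of a numbered backreference in re.sub follows. The input sentence is an assumption, and the replacement wraps every run of digits in square brackets.

```python
import re

text = "order 66 shipped in 3 days"  # sample input (assumption)

# \1 in the replacement refers to the text captured by group 1.
result = re.sub(r"(\d+)", r"[\1]", text)
print(result)  # order [66] shipped in [3] days
```

The replacement is written as a raw string so that \1 reaches the re module unchanged.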
Named groups can be referenced too, using the \g<name> syntax in the replacement.
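The same substitution can be sketched with a named group. The group name num and the input sentence are illustrative assumptions.

```python
import re

text = "order 66 shipped in 3 days"  # sample input (assumption)

# \g<num> in the replacement refers to the group named "num".
result = re.sub(r"(?P<num>\d+)", r"[\g<num>]", text)
print(result)  # order [66] shipped in [3] days
```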
Lookaround assertions (look around)
The re module also supports lookaround assertions, which check the text before or after a position without consuming it.
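The four kinds of lookaround can be sketched on one sample string. The string and the patterns are assumptions made up for illustration.

```python
import re

text = "price: $42, tax: $7, qty: 3"  # sample input (assumption)

# Positive lookbehind: digits that come right after a "$".
dollars = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digits not preceded by "$" or by another digit.
plain = re.findall(r"(?<![$\d])\d+", text)

# Positive lookahead: words that are followed by ":".
labels = re.findall(r"\w+(?=:)", text)

# Negative lookahead: whole words that are not followed by ":".
values = re.findall(r"\b\w+\b(?!:)", text)

print(dollars, plain, labels, values)
```

In each case the asserted text ("$" or ":") is checked but not included in the result.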
Using a function as the replacement
Most of what we have seen so far matches with a single expression, but real requirements are sometimes much more complicated, especially when it comes to substitution.
For example, chat history retrieved through the Slack API may contain a message such as "<@U1EAT8MG9>, <@U0K1MF23Z> Yes, it is."
Here <@U1EAT8MG9> and <@U0K1MF23Z> are two real users whose names Slack has encoded as IDs. The mapping from ID to name has to be fetched through another API.
The result is a mapping along these lines: U1EAT8MG9 corresponds to xiaoming and U0K1MF23Z to laolin.
After resolving the mapping, I also want the angle brackets removed, so that the result of the replacement is "@xiaoming, @laolin Yes, it is."
How can this be done with a regular expression?
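One way to sketch it is to pass a function as the replacement argument of sub. The ID_NAMES dictionary and the message are reconstructed from the description above rather than from a real API response, and the helper name id_to_name is hypothetical.

```python
import re

# Assumed ID-to-name mapping; in practice it would come from another API call.
ID_NAMES = {"U1EAT8MG9": "xiaoming", "U0K1MF23Z": "laolin"}

# Capture the user ID between "<@" and ">".
REGEX_AT = re.compile(r"<@(\w+)>")

def id_to_name(match):
    """Return "@name" for the matched ID, falling back to the raw ID."""
    user_id = match.group(1)
    return "@" + ID_NAMES.get(user_id, user_id)

message = "<@U1EAT8MG9>, <@U0K1MF23Z> Yes, it is."  # reconstructed sample
result = REGEX_AT.sub(id_to_name, message)
print(result)  # @xiaoming, @laolin Yes, it is.
```

The function is called once per match with the match object, and whatever string it returns replaces the matched text, so the angle brackets disappear along with the ID.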
So the replacement argument of re.sub can, of course, be a function.