When writing code, it is often necessary to match a particular string, or a pattern of strings, usually with a string function or regular expression.
In regular expressions, some characters have special meanings, and a backslash '\' is used either to give a character a special meaning or to escape one so that it matches literally. Because the backslash is also the escape character in ordinary Python string literals, matching a single literal backslash means writing '\\\\', which is hard to read. Python's raw strings solve this problem: prefix the string literal with r, as in r'\n', and the result is the two ordinary characters '\' and 'n' rather than a newline. The r prefix is only loosely comparable to the -r option of the sed command: -r switches sed to extended regular expression syntax, whereas r changes how Python reads the string literal, not the regular expression syntax.
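The difference between ordinary and raw string literals can be checked directly. The following is a minimal sketch; the sample path is a made-up value used only for illustration.

```python
import re

# An ordinary literal needs four backslashes to spell the two-character
# pattern \\ ; a raw literal needs only two.
plain_pattern = "\\\\"
raw_pattern = r"\\"
same_pattern = plain_pattern == raw_pattern

# r"\n" is two characters (backslash and n); "\n" is a single newline.
raw_len = len(r"\n")
plain_len = len("\n")

# Hypothetical sample path, not taken from the text above.
path = r"C:\temp\notes.txt"
backslashes = re.findall(r"\\", path)

print(same_pattern, raw_len, plain_len, len(backslashes))
```

Both spellings produce the same pattern, so the raw form is purely a readability aid.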
A regular expression is made up of two kinds of characters: ordinary characters, which stand for themselves, and special characters, which stand for a class of ordinary characters or for a number of repetitions ...
The re module provides many functions for regular expression matching.
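As a small illustration of ordinary versus special characters, the sketch below uses made-up sample strings and three patterns: one with the special character ".", one with the same character escaped, and one with the quantifier "+".

```python
import re

text = "cat cot c.t coat"  # hypothetical sample text

# "." is special: it matches any single character except a newline.
any_char = re.findall(r"c.t", text)

# Escaped, "\." is an ordinary dot and matches only a literal ".".
literal_dot = re.findall(r"c\.t", text)

# "+" is a quantifier: one or more repetitions of the preceding item.
repeated = re.findall(r"co+", "co coo cooo")

print(any_char)     # ['cat', 'cot', 'c.t']
print(literal_dot)  # ['c.t']
print(repeated)     # ['co', 'coo', 'cooo']
```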
match     Match a regular expression pattern to the beginning of a string.
search    Search a string for the presence of a pattern.
sub       Substitute occurrences of a pattern found in a string.
subn      Same as sub, but also return the number of substitutions made.
split     Split a string by the occurrences of a pattern.
findall   Find all occurrences of a pattern in a string.
finditer  Return an iterator yielding a match object for each match.
purge     Clear the regular expression cache.
escape    Backslash all non-alphanumerics in a string.
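A few of the functions listed above can be tried side by side. This is a minimal sketch on a made-up sample string, not an exhaustive tour of the module.

```python
import re

text = "one1two22three"  # hypothetical sample text

# match anchors at the beginning of the string; search scans the whole string.
at_start = re.match(r"\d+", text)    # None, because the text starts with a letter
anywhere = re.search(r"\d+", text)   # finds the first run of digits

# split: cut the string wherever a run of digits occurs.
parts = re.split(r"\d+", text)

# sub and subn: replace each run of digits; subn also reports the count.
replaced = re.sub(r"\d+", "-", text)
replaced_again, count = re.subn(r"\d+", "-", text)

# finditer: iterate over match objects instead of building a list of strings.
found = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]

# escape: backslash special characters so they match literally.
escaped = re.escape("a.b")

print(at_start, anywhere.group(), parts, replaced, count, found, escaped)
```

Note that match returns None here while search succeeds, which is the practical difference between the two.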
There is also the compile function, which is somewhat special: it compiles a pattern into a regular expression object (RegexObject, _sre.SRE_Pattern) and returns it, and that object offers the same functions as methods. This also shows that the re module can be used in two ways, without compiling and with compiling, as shown below.
1.
result = re.match(pattern, string)
2.
prog = re.compile(pattern)
result = prog.match(string)
The two forms have the same effect, but the second keeps the compiled regular expression object around, which is more efficient when the same expression is used many times in a block of code.
When match() or search() finds a match, it returns a match object (MatchObject, _sre.SRE_Match); otherwise it returns None. The match object has several methods of its own, and the more commonly used ones are listed below.
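The compiled form is most natural inside a loop. The sketch below assumes a made-up list of log lines and a hypothetical pattern name, error_code; it compiles the pattern once and reuses the resulting object.

```python
import re

# Hypothetical log lines, used only for illustration.
lines = ["error 404", "ok", "error 500", "warning 12"]

# Compile once, then reuse the pattern object on every iteration.
error_code = re.compile(r"error (\d+)")

codes = []
for line in lines:
    m = error_code.match(line)
    if m is not None:  # match() returns None when the line does not match
        codes.append(int(m.group(1)))

# The module-level function gives the same result for a single call.
uncompiled = re.match(r"error (\d+)", lines[0]).group(1)
compiled = error_code.match(lines[0]).group(1)

print(codes, uncompiled == compiled)
```

The re module also keeps an internal cache of recently used patterns (the cache that purge clears), so the gain from compiling by hand is usually modest in small scripts.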
group(...)
    group([group1, ...]) -> str or tuple.
    Return subgroup(s) of the match by indices or names.
    For 0 returns the entire match.
groups(...)
    groups([default=None]) -> tuple.
    Return a tuple containing all the subgroups of the match, from 1.
    The default argument is used for groups
    that did not participate in the match
end(...)
    end([group=0]) -> int.
    Return index of the end of the substring matched by group.
start(...)
    start([group=0]) -> int.
    Return index of the start of the substring matched by group.
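The four methods above can be seen together on one match object. This is a minimal sketch with a made-up input string; the third group is optional so that it does not participate in the match, which shows what the default argument of groups does.

```python
import re

# Hypothetical input; the optional third group will not participate.
m = re.search(r"(\d+)-(\d+)(?:-(\d+))?", "call 555-1234 now")

whole = m.group(0)              # the entire match
first_two = m.group(1, 2)       # several groups at once, returned as a tuple
all_groups = m.groups()         # unmatched groups appear as None
with_default = m.groups("")     # unmatched groups replaced by the default
whole_span = (m.start(), m.end())
first_span = (m.start(1), m.end(1))

print(whole, first_two, all_groups, with_default, whole_span, first_span)
```

The end index is exclusive, so the matched text is the slice from start to end of the original string.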
That covers the overall structure of the re module. A few examples follow to summarize the points above.
1.
In [23]: text = "He is carefully disguised but captured quickly by police."
In [24]: re.findall(r"\w+ly", text)
Out[24]: ['carefully', 'quickly']
2.
In [25]: m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
In [26]: m.group(0)
Out[26]: 'Isaac Newton'
In [27]: m.group(1)
Out[27]: 'Isaac'
In [28]: m.group(2)
Out[28]: 'Newton'
In [29]: m.group(1, 2)
Out[29]: ('Isaac', 'Newton')
3.
In [31]: account = "abcxyz_"
In [32]: replace_regex = re.compile(r'_$')
In [33]: replace_regex.sub(account[0], account)
Out[33]: 'abcxyza'
Regular expressions have many more details than can be summarized here; they are best learned gradually through practice.
Python Standard Library: re