The re module: regular expressions
Every programmer ends up using regular expressions
Web scraping (crawler) work in particular requires a strong grasp of regular expressions
Regular expression: a rule that exists independently of any particular language or tool
The re module is the tool Python provides for working with regular expressions
Regular expressions are available in virtually every language
[] (square brackets) define a character set. Ranges inside it must be written in ascending order, from small to large: [a-z] or [0-9], not [z-a]
. (dot) matches any character except a newline
\w (lowercase w) matches a letter, digit, or underscore
\s (lowercase s) matches any whitespace character
\n matches a newline
\d matches any digit
\t matches a tab
\b matches a word boundary (the start or the end of a word)
^ matches the start of the string (when it is outside a character set)
$ matches the end of the string
\W (uppercase W) matches any character that is not a letter, digit, or underscore
\D (uppercase D) matches any non-digit character
\S (uppercase S) matches any non-whitespace character
[\w\W] matches any character at all, because \w and \W are mutually exclusive and together cover everything
The same holds for any lowercase/uppercase pair in one set, such as [\d\D] or [\s\S]
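The metacharacters above can be tried out with a short sketch. The sample strings are made up for illustration, and re.findall is used only as a convenient way to list matches:

```python
import re

# Made-up sample text containing spaces, a tab, and a trailing newline.
text = "user_01 paid 42 yuan\ton 3 May\n"

words = re.findall(r"\w+", text)            # runs of letters, digits, underscores
digits = re.findall(r"\d", text)            # single digits
spaces = re.findall(r"\s", text)            # spaces, the tab, and the newline
non_digits = re.findall(r"\D+", "a1b22c")   # runs of non-digit characters
whole_word = re.findall(r"\bcat\b", "cat concat cat.")  # "cat" only as a whole word
anything = re.findall(r"[\w\W]", "a \n")    # complementary pair: newline included
dot_only = re.findall(r".", "a \n")         # the dot skips the newline

print(words)
print(anything, dot_only)
```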
Quantifiers:
* Repeat 0 or more times
+ Repeat one or more times
? Repeat 0 or 1 times
{n} Repeat exactly n times (the braces can hold one number or two)
{n,m} Repeat n to m times
Two rules for quantifiers:
1. A quantifier only controls the number of occurrences of the single item immediately before it
2. By default a quantifier is greedy: it matches as many times as it can, and settles for fewer (down to 0) only when it has to
A ? placed after a quantifier makes it lazy: the greedy pattern becomes a non-greedy one that matches as few times as possible
So a ? that follows a quantifier controls the matching mode, while a ? that directly follows a character is an ordinary quantifier on its own
^ at the start of a character set means negation: [^...] matches any character that is not in the set
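A minimal sketch of the quantifier rules, greedy versus lazy matching, and a negated character set. The input strings are invented for the example:

```python
import re

star = re.findall(r"ab*", "a ab abb")              # b repeated 0 or more times
optional = re.findall(r"colou?r", "color colour")  # u repeated 0 or 1 times
exact = re.findall(r"\d{3}", "1234567")            # exactly 3 digits per match
ranged = re.findall(r"\d{2,3}", "12345")           # 2 to 3 digits, as many as possible

tags = "<a><b>"
greedy = re.findall(r"<.+>", tags)      # .+ grabs as much as it can
lazy = re.findall(r"<.+?>", tags)       # .+? stops at the first possible >
negated = re.findall(r"<[^>]+>", tags)  # [^>] is any character except >

print(greedy, lazy, negated)
```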
In a regular expression, \ has a special meaning. To match a literal \w in a string, the backslash has to be escaped with another \, giving \\w. Inside a Python string literal, \ is also an escape character
So each backslash in the regular expression would have to be doubled again in the Python source, which is too troublesome
Instead, put an r in front of the opening quote of the string (a raw string, such as r'\\w'). The r prefix turns off Python's own escape processing for that literal
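A small sketch of the escaping problem, assuming a made-up Windows-style path as the text to search. Both patterns below are the same regular expression; only the way they are typed in Python differs:

```python
import re

path = r"C:\work\notes"   # raw string: backslashes are kept exactly as typed

# Without the r prefix, the regex \\ (one literal backslash) needs each of its
# two backslashes doubled for Python, so four are typed in total.
plain = re.findall("\\\\w", path)

# With the r prefix, the pattern is typed exactly as the regex engine sees it.
raw = re.findall(r"\\w", path)

print(plain == raw)        # both find the backslash followed by w in \work
print("\\\\w" == r"\\w")   # the two literals are the same three-character string
```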
How greedy matching works:
Backtracking: the engine first matches as far forward as it can, then backs up one step at a time until the rest of the pattern can match
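Backtracking can be made visible with two capture groups. This is only an illustrative sketch on a made-up string: the first group shows how much the .* kept after giving characters back.

```python
import re

# Greedy: .* first swallows all of "abc123", then gives back one character
# at a time until \d+ can match at least one digit.
greedy = re.search(r"(.*)(\d+)", "abc123").groups()

# Lazy: .*? starts with nothing and grows only until \d+ can match,
# so the digits all end up in the second group.
lazy = re.search(r"(.*?)(\d+)", "abc123").groups()

print(greedy)
print(lazy)
```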
import re
re.findall(): the first argument is the regular expression, the second is the string to search
It returns a list of all matches directly, or an empty list if nothing is found
re.search(): takes the same arguments but returns only the first match it finds. The result is a match object, so call .group() on it to get the matched text
If nothing matches it returns None, so check the result with if before calling .group(), otherwise an error is raised
re.match(): same as search, except that it behaves as if a ^ were added at the start of the pattern. It returns only the first result, and the match must start at the beginning of the string
Comparing the three methods:
In how they are called there is no difference: each takes two positional arguments, the regular expression string and the string to be matched
The difference is in the return value. findall returns a list containing every match, or an empty list if there is none. search returns the first match as a match object
If there is no match, search returns None. The matched text is obtained with the group method. match behaves like search, but must match from the beginning of the string
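The differences between the three functions, sketched on a made-up string. The helper first_number is hypothetical and only illustrates the None check described above:

```python
import re

text = "eva 28, egon 36"

all_numbers = re.findall(r"\d+", text)  # list of every match
first = re.search(r"\d+", text)         # match object for the first match
at_start = re.match(r"\d+", text)       # None: the string does not start with a digit
name = re.match(r"[a-z]+", text)        # matches, because letters are at the start

def first_number(s):
    """Return the first run of digits in s, or None if there is none."""
    m = re.search(r"\d+", s)
    return m.group() if m else None     # guard against None before .group()

print(all_numbers, first.group(), at_start, name.group())
print(first_number("no digits here"))
```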
re.split() splits the string at every match of the regular expression
re.sub() works much like str.replace(): it replaces the matched text
re.subn() replaces like sub and also reports how many replacements were made
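A short sketch of these three functions, with invented input strings. The count argument to re.sub is optional and limits the number of replacements:

```python
import re

parts = re.split(r"[,;]\s*", "eva, egon;yuan")         # split on , or ; plus spaces
replaced = re.sub(r"\d", "H", "eva3egon4yuan4")        # replace every digit
once = re.sub(r"\d", "H", "eva3egon4yuan4", count=1)   # replace only the first digit
result, n = re.subn(r"\d", "H", "eva3egon4yuan4")      # new string plus the count

print(parts)
print(replaced, once)
print(result, n)
```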
obj = re.compile(r'\d{3}')  # compile the regular expression into a pattern object; the rule matches 3 digits
ret = obj.search('abc123eeee')  # the pattern object calls search; the argument is the string to be matched
print(ret.group())  # result: 123
re.finditer() returns an iterator of match objects; take values from it the way you would from any iterator (a for loop or next()), and call .group() on each one
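A minimal sketch of taking values from the iterator, using a made-up string. Each item is a match object, not a string:

```python
import re

it = re.finditer(r"\d", "ds3sy4784a")

first = next(it).group()        # take one match by hand
rest = [m.group() for m in it]  # the loop continues from where next() stopped

print(first, rest)
```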
When the pattern contains a group, re.findall() gives priority to the group and returns only the text matched inside the parentheses. To cancel this, make the group non-capturing by writing ?: at the start of the group, as in (?:...)
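The group priority of findall can be seen by running the same pattern with and without ?:. The host name is a placeholder chosen for the example:

```python
import re

url = "www.example.com"

# Capturing group: findall returns only what the parentheses matched.
grouped = re.findall(r"www\.(example|test)\.com", url)

# Non-capturing group: findall returns the whole match again.
ungrouped = re.findall(r"www\.(?:example|test)\.com", url)

print(grouped)
print(ungrouped)
```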
re.split() and group priority
If the pattern used for splitting is wrapped in a group, the matched separators are kept in the result
For example
ret = re.split(r'(\d)', 'egon3ioo4fkj')
That way, the 3 and the 4 are both kept in the list
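Running the split both ways on the string from the example shows the difference. This is a sketch for comparison only:

```python
import re

without_group = re.split(r"\d", "egon3ioo4fkj")   # separators are dropped
with_group = re.split(r"(\d)", "egon3ioo4fkj")    # separators are kept in place

print(without_group)
print(with_group)
```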
When a pattern has several groups, you pick the one you want by its number, counting from the first group. You can also give a group a name by writing ?P<name> at the start of the group, then fetch its content by the name inside the angle brackets
(?P=name) reuses an existing named group: (?P<name>...) creates the group, and a later (?P=name) must match the same text that group matched
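A sketch of named groups and a named backreference, assuming a tiny made-up HTML fragment. The group names tag and body are arbitrary:

```python
import re

pattern = r"<(?P<tag>\w+)>(?P<body>.*?)</(?P=tag)>"

m = re.search(pattern, "<h1>hello</h1>")
tag = m.group("tag")      # fetch by name
body = m.group("body")
by_number = m.group(1)    # the same group fetched by number

# The closing tag must repeat whatever the tag group matched,
# so mismatched tags do not match at all.
mismatch = re.search(pattern, "<h1>hello</h2>")

print(tag, body, by_number, mismatch)
```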
Grouping: lets one quantifier apply to several characters as a whole
Within a pattern, group only the part whose content you actually need
A regular expression is a filter rule for strings
The re module is the module Python provides for working with regular expressions
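Both uses of grouping in a short sketch with invented strings: a quantifier applied to a whole group, and a group around only the part that is wanted:

```python
import re

# Without a group, + applies to the b alone.
single = re.search(r"ab+", "ababab").group()

# With a (non-capturing) group, + applies to the whole "ab" unit.
whole = re.search(r"(?:ab)+", "ababab").group()

# Group only the needed part: the digits after the decimal point.
fractions = re.findall(r"\d+\.(\d+)", "3.14 2.71")

print(single, whole, fractions)
```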
Python study notes, day 19: the re module