While learning to write a web crawler, I needed to extract the number of comments from a web page, which calls for regular expressions, so here is a look at how they work. Regular expressions are a common tool in text processing.
1 Common regular expression symbols
. Any single character
[] Character set; gives the allowed values for a single character
[^] Negated character set; matches any single character not in the set
* Previous character repeats 0 or more times
+ Previous character repeats 1 or more times
? Previous character repeats 0 or 1 times
| Or; matches either the expression on the left or the one on the right
{m} Previous character repeats exactly m times
{m,n} Previous character repeats between m and n times
^ Matches the start of the string
$ Matches the end of the string
\d Digit, equivalent to [0-9]
\w Word character, equivalent to [A-Za-z0-9_] for ASCII text
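The symbols above can be tried out with a short script. This is only an illustrative sketch: the sample patterns and strings are made up for the demonstration, and re.fullmatch is used so that each pattern has to cover its whole sample string.

```python
import re

# Hypothetical pattern/sample pairs, one per symbol from the list above.
samples = {
    r"a.c": "abc",        # .     any single character
    r"[abc]+": "abccba",  # []    character set, + one or more times
    r"[^0-9]*": "hello",  # [^]   negated set, * zero or more times
    r"colou?r": "color",  # ?     zero or one time
    r"cat|dog": "dog",    # |     or
    r"\d{3}": "123",      # {m}   exactly m times
    r"\w{2,4}": "ab_1",   # {m,n} between m and n times
}

# fullmatch succeeds only if the pattern matches the entire string.
results = {p: re.fullmatch(p, s) is not None for p, s in samples.items()}
for pattern, ok in results.items():
    print(pattern, ok)

# ^ and $ anchor the match to the start and end of the string.
anchored = re.search(r"^\d+$", "2024") is not None
not_anchored = re.search(r"^\d+$", "year 2024") is not None
print(anchored, not_anchored)
```

Every pair in the dictionary should report a match, while the anchored pattern should accept "2024" and reject "year 2024".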
2 Main functions
import re  # import the re package
m = re.search(pattern, string)  # scan the entire string until a match is found
m = re.match(pattern, string)  # match the regular expression from the beginning of the string and return the result
m = re.sub(pattern, replacement, string)  # find matches in the string and replace them
m = re.findall(pattern, string)  # search the string and return all matching substrings in a list
m = re.finditer(pattern, string)  # return an iterator over the matches; each element is a match object
m = re.split(pattern, string)  # split the string wherever the regular expression matches and return a list
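To see how these functions differ, the following sketch runs each of them on the same string. The sample text and the patterns are assumptions chosen for illustration only.

```python
import re

text = "order 12, order 345, order 6"  # hypothetical sample string
pattern = r"\d+"                        # one or more digits

first = re.search(pattern, text)       # first match anywhere in the string
at_start = re.match(pattern, text)     # None here: text does not start with a digit
replaced = re.sub(pattern, "N", text)  # replace every match with "N"
all_numbers = re.findall(pattern, text)                 # list of matched substrings
spans = [m.span() for m in re.finditer(pattern, text)]  # positions from match objects
parts = re.split(r",\s*", text)        # split on a comma plus optional whitespace

print(first.group())
print(at_start)
print(replaced)
print(all_numbers)
print(spans)
print(parts)
```

Note that re.search and re.match return None when nothing matches, so real code should check the result before calling group() on it.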
m.group(number) returns the result of a search, where m is the match object returned by re.search or re.match: group(0) is the match for the entire expression, group(1) is the first parenthesized group, and so on.
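Going back to the crawler use case, groups are a convenient way to pull a comment count out of a page. The sketch below is only an assumption about what such markup might look like: the HTML fragment, the URL layout, and the pattern are all hypothetical, and a real page will need its own pattern.

```python
import re

# Hypothetical HTML fragment; real pages will look different.
html = '<a href="/post/42#comments">Comments (17)</a>'

# Two parenthesized groups: the post id and the comment count.
pattern = r'/post/(\d+)#comments">Comments \((\d+)\)'
m = re.search(pattern, html)

if m is not None:
    whole = m.group(0)               # text matched by the entire expression
    post_id = m.group(1)             # first group
    comment_count = int(m.group(2))  # second group, converted to a number
    print(whole, post_id, comment_count)
```

The literal parentheses around the count are escaped as \( and \) so that they are not treated as group delimiters.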
Python standard library 01 regular expressions