Base Library
1. Regular Expressions: re
Symbols
() parentheses--grouping
[] brackets--character class, matches any one of the characters inside. Note: characters inside a character class are treated as ordinary characters (except - \ ^).
{} curly braces--specify the number of matches
| alternation (or): r'ac|ad' matches 'ac' or 'ad'
. matches any character (except the newline \n)
\. matches a literal .
^ caret, matches the start of the input string # r'^ac'
$ matches the end of the string # r'ac$'
\b matches a word boundary (a word is made of letters, digits, and underscores); \B is the opposite of \b and matches a non-word boundary
\d matches any digit [0-9]; \D is the opposite of \d, [^0-9]
\s matches a whitespace character "\t \n \r \f \v"; \S is the opposite of \s
\w matches letters, digits, and the underscore (Chinese characters also match) "a-z A-Z 0-9 _"; \W is the opposite of \w
* matches the preceding subexpression 0 or more times, equivalent to {0,}
+ matches the preceding subexpression 1 or more times, equivalent to {1,}
? matches the preceding subexpression 0 or 1 times, equivalent to {0,1}
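The symbols above can be tried out quickly with re.findall and re.search. This is only a small sketch; the sample string is made up for illustration and is not from the original notes.

```python
import re

sample = "cat bat 2024 a_b"  # hypothetical sample text

words = re.findall(r'\b\w+\b', sample)    # whole words between word boundaries
digits = re.findall(r'\d+', sample)       # one or more digits
pairs = re.findall(r'\d{2}', sample)      # exactly two digits at a time
either = re.findall(r'cat|bat', sample)   # alternation
char_class = re.findall(r'[cb]at', sample)  # character class: c or b, then "at"
non_space = re.findall(r'\S+', sample)    # runs of non-whitespace
dots = re.findall(r'\.', 'a.b.c')         # escaped dot matches a literal "."
starts = bool(re.search(r'^cat', sample))  # anchored at the start
ends = bool(re.search(r'a_b$', sample))    # anchored at the end
```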
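Greedy mode
Greedy mode (the default in Python regular expressions) matches as much as possible.
re.search(r'<.+>', s) # matches from the first '<' to the last '>' in s
Enable non-greedy mode
Adding ? after a quantifier enables non-greedy mode, which matches as little as possible.
re.search(r'<.+?>', s) # matches from the first '<' to the nearest following '>'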
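To make the difference concrete, here is a minimal sketch. The string s is a hypothetical sample, since the original value of s is not shown in these notes.

```python
import re

s = '<title>memo</title>'  # hypothetical sample string

# Greedy: .+ grabs as much as it can, so the match runs to the last '>'
greedy = re.search(r'<.+>', s).group()   # '<title>memo</title>'

# Non-greedy: .+? stops at the first '>' that completes the match
lazy = re.search(r'<.+?>', s).group()    # '<title>'
```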
Command
1. re.search()
s = 'bo ke yuan'
result = re.search(r'(\w+) (\w+)', s)
result.group(1) # 'bo'
result.group(2) # 'ke'
result.span() # (0, 5) (match range)
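A runnable version of this example might look like the sketch below. It also shows group(0) and the None result for a failed search, which the notes above do not mention explicitly.

```python
import re

s = 'bo ke yuan'
m = re.search(r'(\w+) (\w+)', s)

whole = m.group(0)    # the entire match: 'bo ke'
first = m.group(1)    # first subgroup: 'bo'
second = m.group(2)   # second subgroup: 'ke'
span = m.span()       # (start, end) of the match: (0, 5)

# re.search returns None when nothing matches, so check before calling .group()
missing = re.search(r'\d+', s)
```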
2. re.findall()
If the regular expression contains a subgroup, only the content matched by the group is returned.
If it contains multiple subgroups, the matched contents are combined into tuples and returned.
How do I make subgroups not capture content?
Use non-capturing groups (?:...): add ?: right after the opening parenthesis of every subgroup.
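The three findall behaviours can be compared side by side. This is a small sketch on a made-up string, not an example from the original notes.

```python
import re

s = 'a1 b2 c3'  # hypothetical sample string

no_group = re.findall(r'\w\d', s)              # full matches: ['a1', 'b2', 'c3']
one_group = re.findall(r'(\w)\d', s)           # group content only: ['a', 'b', 'c']
two_groups = re.findall(r'(\w)(\d)', s)        # tuples: [('a', '1'), ('b', '2'), ('c', '3')]
non_capturing = re.findall(r'(?:\w)(?:\d)', s)  # full matches again: ['a1', 'b2', 'c3']
```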
3. re.compile(): compile regular expressions
If you need to use a regular expression repeatedly, you can first compile it into a pattern object.
p = re.compile(r'[A-Z]')
p.search('Bo Ke Yuan') # match object for 'B'
p.findall('Bo Ke Yuan') # ['B', 'K', 'Y']
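A sketch of the compiled-pattern workflow, assuming the intended pattern is [A-Z] (uppercase letters) and the sample text is capitalized as 'Bo Ke Yuan'. The last line illustrates why [A-z] is usually a mistake: in ASCII that range also covers the characters between Z and a, such as the underscore.

```python
import re

p = re.compile(r'[A-Z]')      # compile once, reuse many times
text = 'Bo Ke Yuan'           # assumed sample text

first = p.search(text).group()   # first uppercase letter: 'B'
all_caps = p.findall(text)       # every uppercase letter: ['B', 'K', 'Y']

# [A-z] spans ASCII 65..122, so it also matches [ \ ] ^ _ ` and lowercase letters
too_wide = re.findall(r'[A-z]', 'a_B')
```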
2. Command-line arguments: argparse
Basic usage:
import argparse # Step 1. Import the module
parser = argparse.ArgumentParser() # Step 2. Create the argument parser object
# Step 3. Add arguments with parser.add_argument()
# positional argument (required):
parser.add_argument("echo", help="argument description")
# optional argument:
parser.add_argument("--verbosity", help="argument description")
args = parser.parse_args() # Step 4. Parse the arguments
# positional argument access: args.echo
# optional argument access: args.verbosity
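The four steps can be exercised without a real command line by passing a list to parse_args(). This is a sketch only; the build_parser helper and the help strings are hypothetical names chosen for illustration.

```python
import argparse

def build_parser():
    """Hypothetical helper that wires up the two arguments from the notes."""
    parser = argparse.ArgumentParser(description="echo demo")
    parser.add_argument("echo", help="text to echo back")        # positional, required
    parser.add_argument("--verbosity", help="verbosity level")   # optional
    return parser

# Passing a list stands in for sys.argv[1:], which is handy for testing
args = build_parser().parse_args(["hello", "--verbosity", "2"])
echo_value = args.echo            # 'hello'
verbosity_value = args.verbosity  # '2' (a string, because no type= was given)

# When the optional argument is omitted, its value defaults to None
default_verbosity = build_parser().parse_args(["hi"]).verbosity
```

Values arrive as strings unless add_argument is given a type, for example type=int.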
3. Mathematical function library: math
4. Random numbers: random
5. Multithreading and multiprocessing: subprocess/multiprocessing/threading
6. Gadgets (can reduce the number of lines of code): itertools/operator/collections
6.1 collections
c = collections.Counter(iterable)
# argument: can be a list, str, tuple, None, and so on
# function: counts the number of occurrences of each element of the argument
# returns: a dictionary (each element is stored as a key and its number of occurrences as the value)
# Example:
c = Counter('gallahad')
# output: Counter({'a': 3, 'l': 2, 'g': 1, 'h': 1, 'd': 1})
c.update('adc') # update the counts on the original basis (modifies c directly)
# output: Counter({'a': 4, 'l': 2, 'd': 2, 'g': 1, 'h': 1, 'c': 1})
c.most_common()
# output: [('a', 4), ('l', 2), ('d', 2), ('g', 1), ('h', 1), ('c', 1)]
# equivalent dictionary sort: sorted(c.items(), key=lambda asd: asd[1], reverse=True)
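As a runnable sketch of the Counter workflow described here (assuming Python 3.7 or later, where dictionaries keep insertion order):

```python
from collections import Counter

c = Counter('gallahad')   # count each character
before = dict(c)          # snapshot of the counts before the update

c.update('adc')           # adds to the existing counts in place
top = c.most_common(1)    # the single most frequent element

# Sorting the items by count gives the same ordering as most_common()
by_sorted = sorted(c.items(), key=lambda kv: kv[1], reverse=True)

# Unlike a plain dict, a missing key returns 0 instead of raising KeyError
absent = c['z']
```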
6.2 itertools
# format: itertools.chain(*iterables)
# function: chains multiple iterables into a single iterator
a = [[1, 2, 3], ['a', 'b', 'c']]
itertools.chain(*a) # unpack a so that each inner list is passed as a separate iterable
# result: 1, 2, 3, 'a', 'b', 'c'
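A short sketch of why the nested list has to be unpacked: chain takes each iterable as a separate argument, so passing the outer list directly yields the inner lists themselves rather than their elements.

```python
import itertools

a = [[1, 2, 3], ['a', 'b', 'c']]

unpacked = list(itertools.chain(*a))                 # inner lists passed as separate arguments
from_iter = list(itertools.chain.from_iterable(a))   # same result without unpacking
not_flattened = list(itertools.chain(a))             # only one iterable: yields the two inner lists
```

chain.from_iterable is the lazier choice when the outer iterable is large or is itself a generator.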
Third-party libraries
1. jieba
import jieba
jieba.cut(sentence, cut_all=True) # full mode word segmentation (commonly used in information retrieval)
jieba.cut(sentence) # exact mode word segmentation (default)
jieba.enable_parallel(4) # parallel word segmentation is supported
filename = 'dictionary path'
jieba.load_userdict(filename) # custom dictionaries are supported
import jieba.posseg as pseg
words = pseg.cut('I came to Tsinghua University in Beijing')
for word, flag in words:
    print("%s %s" % (word, flag))
2. Plotting: matplotlib
3. Network library: requests
Python Common library Functions-memo