Here is my collection of Python tips. It is mostly a set of practical functions, aimed at readers who already have some Python background (I will not explain the standard library functions used along the way).
I. Functional programming
Functional programming is very handy for working with data. (It would be even nicer with a pipe operator | or Java-style method chaining, but Python has neither built in; you need a third-party library for that.)
1. Grouping
A common operation in data processing is splitting the elements of a list into groups of k elements each.
def group_each(a, size: int):
    """
    Split the elements of an iterable a into groups of size elements each.
    group_each([1,2,3,4], 2) -> [(1,2), (3,4)]
    """
    iterators = [iter(a)] * size  # repeat the newly built iterator size times (shallow copy)
    return zip(*iterators)  # then zip
This function was previously covered in "Python supplements-artifice". I found it on StackOverflow through Google at some point, but its original source is probably somewhere in the official Python documentation.
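One detail worth knowing: zip stops at the shortest input, so group_each silently drops a trailing group that has fewer than size elements. Here is a minimal sketch of that behaviour, plus a padded variant built on itertools.zip_longest. The name group_each_padded and its fillvalue parameter are my own illustration, not part of the original function.

```python
from itertools import zip_longest


def group_each(a, size: int):
    """Split the elements of iterable a into tuples of `size` elements."""
    iterators = [iter(a)] * size  # the same iterator object, repeated
    return zip(*iterators)


def group_each_padded(a, size: int, fillvalue=None):
    """Hypothetical variant: pad the last group instead of dropping it."""
    iterators = [iter(a)] * size
    return zip_longest(*iterators, fillvalue=fillvalue)


full = list(group_each([1, 2, 3, 4], 2))
truncated = list(group_each([1, 2, 3, 4, 5], 2))
padded = list(group_each_padded([1, 2, 3, 4, 5], 2))

print(full)       # [(1, 2), (3, 4)]
print(truncated)  # [(1, 2), (3, 4)]  (the trailing 5 is dropped)
print(padded)     # [(1, 2), (3, 4), (5, None)]
```

Which variant is appropriate depends on whether an incomplete last group is an error, noise, or real data in your input.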
By the way, if a particular size is used often (such as 2), it can be wrapped with partial:
from functools import partial

# group every two elements
group_each_2 = partial(group_each, size=2)
# equivalent to: group_each_2 = lambda a: group_each(a, 2)
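To check that the partial and the lambda really behave the same, here is a small self-contained sketch. The sample data and the name group_each_2_lambda are made up for illustration. One practical difference is that a partial object keeps the wrapped function and the bound arguments inspectable.

```python
from functools import partial


def group_each(a, size: int):
    iterators = [iter(a)] * size
    return zip(*iterators)


group_each_2 = partial(group_each, size=2)
group_each_2_lambda = lambda a: group_each(a, 2)  # hypothetical name, for comparison only

data = ['k1', 'v1', 'k2', 'v2']
via_partial = list(group_each_2(data))
via_lambda = list(group_each_2_lambda(data))

print(via_partial)                # [('k1', 'v1'), ('k2', 'v2')]
print(via_partial == via_lambda)  # True
print(group_each_2.keywords)      # {'size': 2}
```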
2. flat_map
Anyone with a little exposure to functional programming will know flat_map, but the Python standard library does not provide it. Here is an implementation I found on StackOverflow; it is actually very simple:
from itertools import chain

def flat_map(f, items):
    return chain.from_iterable(map(f, items))
It differs from map in that its result is flattened (stating the obvious). For example:
>>> list(map(list, ['123', '456']))
[['1', '2', '3'], ['4', '5', '6']]
>>> list(flat_map(list, ['123', '456']))
['1', '2', '3', '4', '5', '6']
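A slightly more realistic use is a mapping function that itself returns a list, such as str.split. The sketch below uses invented sample data and also shows the equivalent nested comprehension, which some readers may find easier to scan.

```python
from itertools import chain


def flat_map(f, items):
    return chain.from_iterable(map(f, items))


lines = ['a b', 'c d e']

# Each line maps to a list of words; flat_map joins those lists into one stream.
words = list(flat_map(str.split, lines))
print(words)  # ['a', 'b', 'c', 'd', 'e']

# The same result written as a nested comprehension.
words_comp = [word for line in lines for word in line.split()]
print(words == words_comp)  # True
```

Note that flat_map returns a lazy iterator, so nothing is computed until it is consumed (here by list).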
3. An example using the functions above
When writing a crawler, you sometimes run into a table element.
I usually convert such an HTML element directly into a nested list, with a result like this:
table = [['label1', 'value1', 'label2', 'value2'],
         ['label3', 'value3'],
         ['label4', 'value4', 'label5', 'value5'],
         ...
         ]
To make lookups easier, I now need to convert the data above into a dict like this:
{'label1': 'value1',
 'label2': 'value2',
 'label3': 'value3',
 'label4': 'value4',
 'label5': 'value5'}
Normally you would probably write a loop. But with the functions introduced above, it becomes remarkably simple:
# 1. Group
groups = flat_map(group_each_2, table)
# 1.1 flat_map returns an iterator; after list() the contents are:
# [('label1', 'value1'),
#  ('label2', 'value2'),
#  ('label3', 'value3'),
#  ('label4', 'value4'),
#  ('label5', 'value5')]
# 2. Convert to dict
key_values = dict(groups)  # the resulting key_values is identical to the dict we wanted above
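For anyone who wants to run the whole thing, here is a self-contained sketch that puts the pieces together and compares the result with a plain loop. The three-row table is a shortened stand-in for real scraped data, and key_values_loop is a name I introduced for the comparison.

```python
from functools import partial
from itertools import chain


def group_each(a, size: int):
    iterators = [iter(a)] * size
    return zip(*iterators)


def flat_map(f, items):
    return chain.from_iterable(map(f, items))


group_each_2 = partial(group_each, size=2)

# Stand-in for the rows extracted from an HTML table.
table = [
    ['label1', 'value1', 'label2', 'value2'],
    ['label3', 'value3'],
    ['label4', 'value4', 'label5', 'value5'],
]

# Functional version: pair up each row, flatten the pairs, build the dict.
key_values = dict(flat_map(group_each_2, table))

# Loop version, for comparison.
key_values_loop = {}
for row in table:
    for i in range(0, len(row) - 1, 2):
        key_values_loop[row[i]] = row[i + 1]

print(key_values == key_values_loop)  # True
print(key_values['label3'])           # value3
```

Both versions ignore a dangling label at the end of a row that has no matching value.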
Modules worth knowing
- Iterators: itertools. Everything in this module feels very practical.
- Special data structures: collections. All of it is useful too; the one I use most is probably defaultdict.
- Functions: partial and reduce in functools are worth learning, plus the builtins map, filter, and zip. (However, the latter three can often be replaced by comprehensions.)
- Comparison-related functions: sorted, max, min, and itertools.groupby, often with operator's itemgetter (sometimes attrgetter/methodcaller) as the key argument.
- Library of common operations: operator. It provides function forms of a great many operators (subtraction, in, and so on) and is often used to supply the function argument to reduce/map/filter. There is a lot in it, so it is worth reviewing when you have time.
P.S. When you use these modules, it is best to add detailed comments (so the code is easy to understand later).
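To give a feel for how these modules combine, here is a small sketch with invented records. It is only an illustration of defaultdict, itemgetter, groupby, and reduce with an operator function, not a recommended pattern for every case.

```python
from collections import defaultdict
from functools import reduce
from itertools import groupby
from operator import itemgetter, mul

# Invented sample data: (kind, name, count)
records = [('fruit', 'apple', 3), ('veg', 'carrot', 5), ('fruit', 'pear', 2)]

# defaultdict: collect names per kind without checking whether the key exists.
by_kind = defaultdict(list)
for kind, name, _count in records:
    by_kind[kind].append(name)

# sorted + itemgetter: sort by the third field (count).
by_count = sorted(records, key=itemgetter(2))

# groupby only groups adjacent items, so sort by the same key first.
kind_of = itemgetter(0)
grouped = {
    kind: [name for _kind, name, _count in rows]
    for kind, rows in groupby(sorted(records, key=kind_of), key=kind_of)
}

# reduce + operator.mul: product of all counts.
product = reduce(mul, map(itemgetter(2), records), 1)

print(dict(by_kind))  # {'fruit': ['apple', 'pear'], 'veg': ['carrot']}
print(by_count[0])    # ('fruit', 'pear', 2)
print(grouped)        # {'fruit': ['apple', 'pear'], 'veg': ['carrot']}
print(product)        # 30
```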
II. Other
1. Deduplicating a list of dicts
Suppose we have a list of dicts, some of which may have identical contents, and we need to remove the duplicates.
The obvious approach is to use a set, but the elements of a set must be hashable, and dict is unhashable, so dicts cannot be put into a set directly.
>>> a = [{'a': 1}, {'a': 1}, {'b': 2}]
>>> set(a)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2961, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-5b4c643a6feb>", line 1, in <module>
    set(a)
TypeError: unhashable type: 'dict'
Does the deduplication have to be written by hand? Not necessarily. I came across this little trick on StackOverflow:
import json

def unique_dicts(data_list: list):
    """Deduplicate a list of dicts.

    dict is unhashable and cannot be put into a set, so convert each dict to a str first.
    unique_dicts([{'a': 1}, {'a': 1}, {'b': 2}]) -> [{'a': 1}, {'b': 2}]
    (the order of the result is not guaranteed, because it passes through a set)
    """
    data_json_set = set(json.dumps(item) for item in data_list)
    return [json.loads(item) for item in data_json_set]
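Two caveats with the trick above: the result order is arbitrary, and two dicts with the same content but a different key order serialize to different JSON strings, so they are not treated as duplicates. Below is a sketch of a variant that addresses both. The name unique_dicts_ordered is my own, and it still assumes every dict is JSON-serializable.

```python
import json


def unique_dicts_ordered(data_list):
    """Hypothetical variant: keep first occurrences, in their original order."""
    seen = set()
    result = []
    for item in data_list:
        # sort_keys=True makes the string independent of key insertion order.
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


data = [{'a': 1, 'b': 2}, {'b': 2, 'a': 1}, {'b': 2}, {'a': 1, 'b': 2}]
unique = unique_dicts_ordered(data)
print(unique)  # [{'a': 1, 'b': 2}, {'b': 2}]
```

This version returns the original dict objects rather than copies rebuilt by json.loads, so values are not altered by a JSON round trip.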
2. The str methods startswith and endswith accept a tuple as their argument
In[7]: a = "bb.gif"
In[8]: b = 'a.jpg'
In[9]: a.endswith(('.jpg', '.gif'))
Out[9]: True
In[10]: b.startswith(('bb', 'a'))
Out[10]: True
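A typical use is filtering file names by several extensions at once. The sketch below uses made-up file names. Note that the argument has to be a tuple: passing a list raises TypeError.

```python
filenames = ['bb.gif', 'a.jpg', 'notes.txt', 'photo.JPG']
image_suffixes = ('.jpg', '.gif')

# Lower-case the name first so that '.JPG' also matches.
images = [name for name in filenames if name.lower().endswith(image_suffixes)]
print(images)  # ['bb.gif', 'a.jpg', 'photo.JPG']

# A list is not accepted, only a str or a tuple of str.
try:
    'a.jpg'.endswith(['.jpg', '.gif'])
    list_rejected = False
except TypeError:
    list_rejected = True
print(list_rejected)  # True
```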
I will update this slowly, adding things as I think of them.
Reposting of this article is allowed, provided the source address is attached.