Collections usage tutorial for Python Standard Library

Source: Internet
Author: User
Introduction

Python provides four basic data structures: list, tuple, dict, and set. When processing large amounts of data, however, these four are often too limited. For example, inserting at the front of a list is relatively slow, because a list is stored as a contiguous array and every later element has to be shifted. Sometimes we also need finer control over the order of a dict. In these cases we can turn to the collections module in the Python standard library. It provides several useful container classes, and becoming familiar with them not only makes our code more Pythonic but can also improve the running efficiency of our programs.

Use of defaultdict

defaultdict(default_factory) adds a default_factory to an ordinary dict, so that when a key does not exist, a value of the corresponding type is generated automatically and stored under that key. The default_factory argument can be list, set, int, or any other callable that takes no arguments.
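The behaviour described above can be sketched as follows. This is only a minimal illustration; the variable names (plain, groups, counts) are made up for the example.

```python
from collections import defaultdict

# A plain dict raises KeyError when a missing key is accessed.
plain = {}
try:
    plain['missing'].append(1)
except KeyError:
    plain_raised = True
else:
    plain_raised = False
print(plain_raised)

# With defaultdict(list), list() is called for the missing key,
# the new empty list is stored under that key, and then we append to it.
groups = defaultdict(list)
groups['missing'].append(1)
print(dict(groups))

# Merely reading a missing key with [] also inserts the default value.
counts = defaultdict(int)
_ = counts['x']
print('x' in counts)

# .get() does not call default_factory and does not insert anything.
print(counts.get('y'), 'y' in counts)
```

Note that reading a missing key with square brackets creates the entry, so use get() or the in operator when we only want to test for a key.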

Example1

>>> from collections import defaultdict
>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]

We now have the list above. Although it holds six pairs, there are only two colors, and each color corresponds to several values. We want to convert this list into a dict in which each key is a color and each value is a list holding all the values for that color. defaultdict(list) solves this problem.

>>> # d can be treated as a dict whose values are lists
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d
defaultdict(<class 'list'>, {'red': [1, 3, 1], 'blue': [2, 4, 4]})

Example2

The result above has a flaw: the list for blue contains two 4s and the list for red contains two 1s, but we do not want repeated elements. In this case we can use defaultdict(set) instead. Unlike a list, a set cannot contain the same element twice.

>>> d = defaultdict(set)
>>> for k, v in s:
...     d[k].add(v)
...
>>> d
defaultdict(<class 'set'>, {'red': {1, 3}, 'blue': {2, 4}})

Example3

>>> s = 'hello world'

We can use defaultdict(int) to count how many times each character occurs in a string.

>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
...
>>> d
defaultdict(<class 'int'>, {'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1})

Use of OrderedDict

Since Python 3.7 the built-in dict preserves insertion order, but in some cases we need more control over the order of a dict. For this we can use OrderedDict, a subclass of dict that additionally provides operations for working with the order of its keys, such as popitem(last=False) and move_to_end(). Let's take a look at how it is used.

Example1

>>> from collections import OrderedDict
>>> # a regular dict whose keys are not sorted
>>> d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}

The keys of this dict are not sorted. We can now use OrderedDict to build dicts whose entries follow a chosen order.

>>> # sort d by key
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
>>> # sort d by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])
>>> # sort d by key length (sorted is stable, so keys of equal length keep their original order)
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('banana', 3), ('orange', 2)])
>>> # Python 3.12 and later display the same contents as OrderedDict({...})

Example2

The popitem(last=True) method removes key-value pairs in LIFO (last in, first out) order, that is, it removes and returns the most recently inserted pair. With last=False, pairs are removed in FIFO (first in, first out) order.

>>> d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
>>> # sort d by key
>>> d = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
>>> d
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
>>> # use popitem() to remove the last key-value pair
>>> d.popitem()
('pear', 1)
>>> # use popitem(last=False) to remove the first key-value pair
>>> d.popitem(last=False)
('apple', 4)

Example3

move_to_end(key, last=True) changes the order of the key-value pairs in an OrderedDict. With this method we can move any existing key-value pair to the end of the dictionary or, with last=False, to the beginning.

>>> d = OrderedDict.fromkeys('abcde')
>>> d
OrderedDict([('a', None), ('b', None), ('c', None), ('d', None), ('e', None)])
>>> # move the key-value pair whose key is b to the end of the dict
>>> d.move_to_end('b')
>>> d
OrderedDict([('a', None), ('c', None), ('d', None), ('e', None), ('b', None)])
>>> ''.join(d.keys())
'acdeb'
>>> # move the key-value pair whose key is b to the front of the dict
>>> d.move_to_end('b', last=False)
>>> ''.join(d.keys())
'bacde'
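To see why popitem(last=False) and move_to_end() are useful together, here is a minimal sketch of a least-recently-used cache built on OrderedDict. The LRUCache class and its methods are hypothetical names invented for this illustration; they are not part of the collections module.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical helper for illustration, not part of collections."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used by moving it to the end.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            # The first entry is the least recently used one: evict it.
            self._data.popitem(last=False)

    def keys(self):
        return list(self._data)


cache = LRUCache(capacity=2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')       # 'a' becomes the most recently used key
cache.put('c', 3)    # over capacity, so 'b' is evicted
print(cache.keys())  # ['a', 'c']
```

The order of the keys doubles as the usage history, so no separate bookkeeping structure is needed.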

Use of deque

The advantage of storing data in a list is that looking up elements by index is very fast. Appending and popping at the end are fast as well, but inserting or deleting elements at the front is slow, because a list is a contiguous array and all the following elements must be shifted. deque is a double-ended queue that supports efficient insertion and deletion at both ends. It is suitable for queues and stacks, and its append and pop operations are thread-safe.

list only provides append and pop for efficiently inserting and deleting elements at the end. deque adds appendleft and popleft, which let us insert and delete elements efficiently at the beginning as well. Appending or popping at either end of a deque has a complexity of about O(1), whereas list operations such as pop(0) and insert(0, v), which shift every remaining element, have a complexity of O(n). Since deque is otherwise used in much the same way as list, we will not go through it in detail here.
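The operations mentioned above can be sketched briefly as follows. The variable names are only for illustration.

```python
from collections import deque

# Use a deque as a FIFO queue: add on the right, remove from the left.
queue = deque(['a', 'b', 'c'])
queue.append('d')            # enqueue at the right end
first = queue.popleft()      # dequeue from the left end in O(1)
print(first, list(queue))    # a ['b', 'c', 'd']

# Use a deque as a stack: add and remove at the same end.
stack = deque()
stack.append(1)
stack.append(2)
top = stack.pop()
print(top, list(stack))      # 2 [1]

# appendleft is the O(1) counterpart of list.insert(0, v).
queue.appendleft('z')
print(list(queue))           # ['z', 'b', 'c', 'd']
```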

Use of ChainMap

ChainMap groups multiple dicts into a single view backed by a list of mappings. It can be understood as combining several dictionaries, but unlike update it does not copy any data, so it is usually cheaper to build. A lookup searches the underlying dicts in order and returns the first match.

>>> from collections import ChainMap
>>> a = {'a': 'A', 'c': 'C'}
>>> b = {'b': 'B', 'c': 'D'}
>>> # construct a ChainMap object
>>> m = ChainMap(a, b)
>>> m
ChainMap({'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'})
>>> m['a']
'A'
>>> m['b']
'B'
>>> # the underlying dicts are available as a list
>>> m.maps
[{'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'}]
>>> # updating a value in a also affects the ChainMap object
>>> a['c'] = 'E'
>>> m['c']
'E'
>>> # new_child() returns a new ChainMap with an empty dict in front of m's dicts;
>>> # updating the new object does not affect m
>>> m2 = m.new_child()
>>> m2['c'] = 'f'
>>> m['c']
'E'
>>> a['c']
'E'
>>> m2.parents
ChainMap({'a': 'A', 'c': 'E'}, {'b': 'B', 'c': 'D'})
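A common use of this lookup order is layered configuration. The following is a minimal sketch; the dictionary names and settings (defaults, environment, command_line) are invented for the example.

```python
from collections import ChainMap

# Three hypothetical layers of settings, from lowest to highest priority.
defaults = {'color': 'red', 'user': 'guest'}
environment = {'user': 'alice'}
command_line = {'color': 'blue'}

# Earlier mappings take priority during lookup.
settings = ChainMap(command_line, environment, defaults)
print(settings['color'])   # blue  (found in command_line)
print(settings['user'])    # alice (found in environment)

# Writes only ever go to the first mapping; the others stay untouched.
settings['debug'] = True
print(command_line)        # {'color': 'blue', 'debug': True}
print(defaults)            # {'color': 'red', 'user': 'guest'}

# If a real merged copy is needed, convert the view to a dict.
merged = dict(settings)
```

Because ChainMap only keeps references to the original dicts, building it costs no copying, while dict(settings) pays for a full copy once.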

Use of Counter

Example1

Counter is also a subclass of dict. It can be seen as a counter that maps each element to the number of times it occurs.

>>> from collections import Counter
>>> cnt = Counter()
>>> # count the number of occurrences of each element in a list
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
...
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})
>>> # count the number of occurrences of each character in a string
>>> cnt = Counter()
>>> for ch in 'hello':
...     cnt[ch] = cnt[ch] + 1
...
>>> cnt
Counter({'l': 2, 'h': 1, 'e': 1, 'o': 1})
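The explicit loops above show how counting works, but a Counter can also be built directly from any iterable. A minimal sketch, with variable names chosen only for illustration:

```python
from collections import Counter

words = ['red', 'blue', 'red', 'green', 'blue', 'blue']

# Passing the iterable to the constructor counts its elements in one step.
cnt = Counter(words)
print(cnt)               # Counter({'blue': 3, 'red': 2, 'green': 1})

# A missing element has a count of 0 instead of raising KeyError,
# and looking it up does not add it to the Counter.
print(cnt['purple'])     # 0
print('purple' in cnt)   # False

# Strings are iterables of characters, so this counts characters.
letters = Counter('hello')
print(letters['l'])      # 2
```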

Example2

The elements() method returns an iterator that repeats each element as many times as its count. Elements are returned in the order they were first encountered, and an element whose count is less than 1 is ignored.

>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> c
Counter({'a': 4, 'b': 2, 'c': 0, 'd': -2})
>>> c.elements()
<itertools.chain object at 0x...>
>>> next(c.elements())
'a'
>>> # sort
>>> sorted(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']

most_common(n) returns a list of the n most common elements in the Counter object together with their counts, ordered from most to least common.

>>> c = Counter('abracadabra')
>>> c
Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
>>> c.most_common(3)
[('a', 5), ('b', 2), ('r', 2)]

Use of namedtuple

namedtuple(typename, field_names) creates a tuple subclass whose elements can be accessed by name, which makes programs more readable.

>>> from collections import namedtuple
>>> Point = namedtuple('PointExtension', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.__class__.__name__
'PointExtension'
>>> p.x
1
>>> p.y
2
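Beyond attribute access, a named tuple still behaves like an ordinary tuple and offers a few helper methods. The following minimal sketch uses a Point type defined just for this illustration:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)

# It is still a tuple: indexing and unpacking work as usual.
x, y = p
print(p[0], p.x)        # 1 1

# _fields lists the field names in order.
print(p._fields)        # ('x', 'y')

# _asdict() converts the named tuple into a mapping of field names to values.
as_mapping = p._asdict()
print(as_mapping['y'])  # 2

# Named tuples are immutable; _replace() returns a modified copy instead.
moved = p._replace(x=10)
print(moved)            # Point(x=10, y=2)
print(p)                # Point(x=1, y=2)
```

The helper methods start with an underscore to avoid clashing with field names, not because they are private.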
