This article is a short getting-started guide to defaultdict and namedtuple in Python. Python has a set of built-in data types, such as int, str, list, tuple, and dict. The collections module provides several additional data types based on these built-in types: namedtuple, defaultdict, deque, Counter, OrderedDict, and others. Among them, defaultdict and namedtuple are two very useful extension types: defaultdict inherits from dict, and the classes created by namedtuple inherit from tuple.
I. defaultdict
1. Introduction
When you access Python's native dict with d[key] and the specified key does not exist, a KeyError exception is raised. With defaultdict, as long as you pass in a default factory, requesting a non-existent key calls the factory and stores its result as the value for that key.
To use defaultdict, you pass it a factory function (function_factory). defaultdict(function_factory) constructs a dict-like object whose default values are produced by calling the factory function with no arguments.
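As a minimal sketch of how the factory is used, the snippet below tries three different factories. The variable names and sample keys are made up for illustration; the only assumption is that the factory is a callable that takes no arguments.

```python
from collections import defaultdict

# Each factory is called with no arguments when a missing key is looked up.
counts = defaultdict(int)             # int() returns 0
groups = defaultdict(list)            # list() returns []
labels = defaultdict(lambda: "N/A")   # any zero-argument callable works

counts["apple"] += 1                  # missing key starts at 0, then becomes 1
groups["fruit"].append("apple")       # missing key starts as an empty list
unknown = labels["missing"]           # the lookup itself stores "N/A"

print(dict(counts))   # {'apple': 1}
print(dict(groups))   # {'fruit': ['apple']}
print(unknown)        # N/A
```

Note that simply reading labels["missing"] inserts the key, so lookups on a defaultdict can grow it.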
2. Example
The following is an example of defaultdict:
In [1]: from collections import defaultdict

In [2]: s = [('xiaoming', 99), ('wu', 69), ('zhangsan', 80), ('lisi', 96), ('wu', 100), ('yuan', 98), ('xiaoming', 89)]

In [3]: d = defaultdict(list)

In [4]: for k, v in s:
   ...:     d[k].append(v)
   ...:

In [5]: d
Out[5]: defaultdict(<class 'list'>, {'xiaoming': [99, 89], 'wu': [69, 100], 'zhangsan': [80], 'lisi': [96], 'yuan': [98]})

In [6]: for k, v in d.items():
   ...:     print('%s: %s' % (k, v))
   ...:
xiaoming: [99, 89]
wu: [69, 100]
zhangsan: [80]
lisi: [96]
yuan: [98]
If you are familiar with Python, you may notice that defaultdict(list) is used much like dict.setdefault(key, []). The code above can be written with setdefault as follows:
s = [('xiaoming', 99), ('wu', 69), ('zhangsan', 80), ('lisi', 96), ('wu', 100), ('yuan', 98), ('xiaoming', 89)]
d = {}
for k, v in s:
    d.setdefault(k, []).append(v)
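To check that the two approaches really build the same mapping, here is a small sketch that runs both over a shortened version of the score list and compares the results. The variable names are illustrative only.

```python
from collections import defaultdict

scores = [('xiaoming', 99), ('wu', 69), ('wu', 100), ('xiaoming', 89)]

# Grouping with defaultdict: the list is created only when a key is missing.
grouped_default = defaultdict(list)
for name, score in scores:
    grouped_default[name].append(score)

# Grouping with a plain dict: setdefault is handed a fresh [] on every call,
# even when the key already exists and the new list is thrown away.
grouped_plain = {}
for name, score in scores:
    grouped_plain.setdefault(name, []).append(score)

# A defaultdict compares equal to a dict with the same items.
print(grouped_default == grouped_plain)   # True
```

The results are the same; the practical difference is that defaultdict keeps the loop body shorter and avoids building an unused empty list on each iteration.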
3. Principles
To understand how defaultdict works, we can read its help text, for example with help(defaultdict) in the Python console. The help text shows that the default value is provided mainly by the __missing__ method: if the factory function is not None, __missing__ calls it, stores the result under the key, and returns it; otherwise it raises KeyError. The help text describes the method as follows:
def __missing__(self, key):
    # Called by __getitem__ for missing key
    if self.default_factory is None:
        raise KeyError((key,))
    self[key] = value = self.default_factory()
    return value
Two points follow from this description:
A) The __missing__ method is called by __getitem__ when the key is not found. The default value is therefore generated only when d[key] or d.__getitem__(key) is used. d.get(key) does not call the factory: it returns None (or the fallback passed to get), does not raise KeyError, and does not insert the key.
B) defaultdict is implemented mainly through the __missing__ method, so we can build our own version by implementing this method in a dict subclass. The code is as follows:
In [1]: class MyDefaultDict(dict):
   ...:     def __missing__(self, key):
   ...:         self[key] = 'default'
   ...:         return 'default'
   ...:

In [2]: my_default_dict = MyDefaultDict()

In [3]: my_default_dict
Out[3]: {}

In [4]: print(my_default_dict['test'])
default

In [5]: my_default_dict
Out[5]: {'test': 'default'}
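Point A is easy to verify directly. The sketch below, using an arbitrary key 'a', shows that get and the in operator leave the dictionary untouched, while subscript access goes through __missing__ and inserts the key.

```python
from collections import defaultdict

d = defaultdict(list)

# get() and "in" never call __missing__, so nothing is created.
print(d.get('a'))    # None
print('a' in d)      # False
print(len(d))        # 0

# Subscript access calls __getitem__, which falls back to __missing__.
value = d['a']
print(value)         # []
print('a' in d)      # True
```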
4. Version
defaultdict was added in Python 2.5 and is not available in earlier versions. For those versions, however, we can implement a defaultdict ourselves:
# http://code.activestate.com/recipes/523034/
try:
    from collections import defaultdict
except ImportError:
    # Fallback for Python versions older than 2.5
    class defaultdict(dict):
        def __init__(self, default_factory=None, *a, **kw):
            if (default_factory is not None and
                    not hasattr(default_factory, '__call__')):
                raise TypeError('first argument must be callable')
            dict.__init__(self, *a, **kw)
            self.default_factory = default_factory

        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)

        def __missing__(self, key):
            if self.default_factory is None:
                raise KeyError(key)
            self[key] = value = self.default_factory()
            return value

        def __reduce__(self):
            if self.default_factory is None:
                args = tuple()
            else:
                args = self.default_factory,
            return type(self), args, None, None, self.items()

        def copy(self):
            return self.__copy__()

        def __copy__(self):
            return type(self)(self.default_factory, self)

        def __deepcopy__(self, memo):
            import copy
            return type(self)(self.default_factory,
                              copy.deepcopy(self.items()))

        def __repr__(self):
            return 'defaultdict(%s, %s)' % (self.default_factory,
                                            dict.__repr__(self))
II. namedtuple
namedtuple is mainly used to create data objects whose elements can be accessed by name. It is usually used to improve code readability and is particularly handy for record-like tuple data. In many cases where you would use a plain tuple, a namedtuple makes the code easier to understand and more Pythonic. For example:
from collections import namedtuple

# The variable name and the first argument to namedtuple are usually the same, but they may differ.
Student = namedtuple('Student', 'id name score')
# or: Student = namedtuple('Student', ['id', 'name', 'score'])

students = [(1, 'wu', 90), (2, 'x', 89), (3, 'yuanyuan', 98), (4, 'wang', 95)]
for s in students:
    stu = Student._make(s)
    print(stu)

# Output:
# Student(id=1, name='wu', score=90)
# Student(id=2, name='x', score=89)
# Student(id=3, name='yuanyuan', score=98)
# Student(id=4, name='wang', score=95)
In the preceding example, Student is a namedtuple class. Its instances are used in the same way as tuples: elements can still be read directly by index, and they are read-only. Compared with a plain tuple, the field names make the meaning of each value clear.
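The following sketch illustrates those properties on a single record: access by name and by index, unpacking, and the read-only behavior. The _replace and _asdict helpers are standard namedtuple methods that the example above does not use; they are shown here only as a convenient way to work around immutability.

```python
from collections import namedtuple

Student = namedtuple('Student', ['id', 'name', 'score'])
stu = Student(id=1, name='wu', score=90)

print(stu.name)          # wu   (access by field name)
print(stu[2])            # 90   (access by index, like a tuple)
sid, name, score = stu   # unpacking works as with any tuple

# Fields are read-only: assigning to one raises AttributeError.
try:
    stu.score = 95
    assigned = True
except AttributeError:
    assigned = False
print(assigned)          # False

# _replace returns a new instance and leaves the original unchanged.
updated = stu._replace(score=95)
print(updated)           # Student(id=1, name='wu', score=95)
print(dict(stu._asdict()))   # {'id': 1, 'name': 'wu', 'score': 90}
```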