Python has a number of built-in data types, such as int, str, list, tuple, and dict. The collections module in the standard library provides several additional types built on top of them: namedtuple, defaultdict, deque, Counter, OrderedDict, and others. Of these, defaultdict and namedtuple are two especially useful extensions. defaultdict is a subclass of dict, and the classes produced by namedtuple are subclasses of tuple.
First, defaultdict
1. Introduction
With the built-in dict, accessing a key as d[key] raises a KeyError if the key does not exist. With a defaultdict, you supply a default factory when the dictionary is created; when a nonexistent key is requested, the factory is called and its result is stored and returned as the value for that key.
defaultdict takes a factory function (function_factory below) when it is created. defaultdict(function_factory) builds a dict-like object with default values, and each default value is produced by calling the factory function with no arguments.
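As a small illustration of the factory idea, the sketch below (written for Python 3) uses int as a factory to count words and a lambda as a custom factory. The sample sentence and the names word_counts and unknown are made up for this example.

```python
from collections import defaultdict

# int() returns 0, so every new key starts out as a zero counter.
word_counts = defaultdict(int)
for word in "the quick brown fox jumps over the lazy dog the end".split():
    word_counts[word] += 1

# Any zero-argument callable can serve as the factory, including a lambda.
unknown = defaultdict(lambda: "N/A")
missing_value = unknown["no-such-key"]  # the lambda runs and "N/A" is stored

print(word_counts["the"])  # 3
print(missing_value)       # N/A
print(dict(unknown))       # {'no-such-key': 'N/A'}
```

Note that merely reading unknown["no-such-key"] inserted the key, which is why it shows up in the final dict.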
2. Example
Here's an example of how to use a defaultdict:
In [1]: from collections import defaultdict
In [2]: s = [('xiaoming', 99), ('wu', 69), ('zhangsan', 80), ('lisi', 96), ('wu', 100), ('yuan', 98), ('xiaoming', 89)]
In [3]: d = defaultdict(list)
In [4]: for k, v in s:
   ...:     d[k].append(v)
   ...:
In [5]: d
Out[5]: defaultdict(<type 'list'>, {'lisi': [96], 'xiaoming': [99, 89], 'yuan': [98], 'zhangsan': [80], 'wu': [69, 100]})
In [6]: for k, v in d.items():
   ...:     print('%s: %s' % (k, v))
   ...:
lisi: [96]
xiaoming: [99, 89]
yuan: [98]
zhangsan: [80]
wu: [69, 100]
Readers familiar with Python will notice that defaultdict(list) behaves much like dict.setdefault(key, []). The code above can be written with setdefault as follows:
s = [('xiaoming', 99), ('wu', 69), ('zhangsan', 80), ('lisi', 96), ('wu', 100), ('yuan', 98), ('xiaoming', 89)]
d = {}
for k, v in s:
    d.setdefault(k, []).append(v)
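The two approaches produce the same grouping. One practical difference is that setdefault evaluates its default argument on every call, while defaultdict calls its factory only for keys that are missing. The sketch below (Python 3) makes this visible; pairs, calls, and counting_list are hypothetical names introduced only for this illustration.

```python
from collections import defaultdict

pairs = [("wu", 69), ("lisi", 96), ("wu", 100)]

calls = {"factory": 0}

def counting_list():
    # Hypothetical factory that records how often it is invoked.
    calls["factory"] += 1
    return []

grouped_default = defaultdict(counting_list)
for name, score in pairs:
    grouped_default[name].append(score)

grouped_plain = {}
for name, score in pairs:
    # The [] argument is built on every iteration, whether it is used or not.
    grouped_plain.setdefault(name, []).append(score)

print(dict(grouped_default) == grouped_plain)  # True
print(calls["factory"])                        # 2
```

The factory ran twice for three insertions because "wu" already existed the second time it was seen.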
3. Principle
The example above covers the basic usage of defaultdict. To see how it works, run help(defaultdict) in the Python console. The help text shows that the default value is implemented mainly through the __missing__ method: if default_factory is not None, the default value is obtained by calling the factory, as follows:
def __missing__(self, key):
    # Called by __getitem__ for missing key
    if self.default_factory is None:
        raise KeyError((key,))
    self[key] = value = self.default_factory()
    return value
From the description above, there are a few points to note:
a). __missing__ is called only by __getitem__ when it finds that the key does not exist, so a defaultdict generates a default value only for d[key] or d.__getitem__(key). d.get(key) does not trigger the factory: it returns None (or the fallback passed to get), stores nothing, and does not raise KeyError;
b). Since defaultdict is implemented mainly through __missing__, we can build our own defaultdict-like class by implementing this method:
In [1]: class MyDefaultDict(dict):
   ...:     def __missing__(self, key):
   ...:         self[key] = 'default'
   ...:         return 'default'
   ...:
In [2]: my_default_dict = MyDefaultDict()
In [3]: my_default_dict
Out[3]: {}
In [4]: print(my_default_dict['test'])
default
In [5]: my_default_dict
Out[5]: {'test': 'default'}
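The lookup paths described in point a) can be checked directly. The following sketch (Python 3) compares get(), the in operator, and subscript access on an empty defaultdict, and also shows what happens when no factory is given. All variable names are chosen for this illustration.

```python
from collections import defaultdict

d = defaultdict(list)

# get() and "in" do not go through __getitem__, so __missing__ never runs.
got = d.get("a")
present = "a" in d
size_before = len(d)

# Subscript access calls __getitem__, which falls back to __missing__.
created = d["a"]
size_after = len(d)

# Without a factory, a defaultdict raises KeyError like a plain dict.
no_factory = defaultdict()
try:
    no_factory["a"]
    raised = False
except KeyError:
    raised = True

print(got, present, size_before)  # None False 0
print(created, size_after)        # [] 1
print(raised)                     # True
```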
4. Version
defaultdict was added in Python 2.5 and is not available in older versions. Knowing how it works, however, we can implement a defaultdict ourselves for those versions:
# http://code.activestate.com/recipes/523034/
try:
    from collections import defaultdict
except ImportError:
    class defaultdict(dict):
        def __init__(self, default_factory=None, *a, **kw):
            if (default_factory is not None and
                    not hasattr(default_factory, '__call__')):
                raise TypeError('first argument must be callable')
            dict.__init__(self, *a, **kw)
            self.default_factory = default_factory
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)
        def __missing__(self, key):
            if self.default_factory is None:
                raise KeyError(key)
            self[key] = value = self.default_factory()
            return value
        def __reduce__(self):
            if self.default_factory is None:
                args = tuple()
            else:
                args = self.default_factory,
            return type(self), args, None, None, self.items()
        def copy(self):
            return self.__copy__()
        def __copy__(self):
            return type(self)(self.default_factory, self)
        def __deepcopy__(self, memo):
            import copy
            return type(self)(self.default_factory,
                              copy.deepcopy(self.items()))
        def __repr__(self):
            return 'defaultdict(%s, %s)' % (self.default_factory,
                                            dict.__repr__(self))
Second, namedtuple
namedtuple is mainly used to create data objects whose elements can be accessed by name. It is typically used to make code more readable, especially code that works with tuple data. In fact, in most cases where the positions of a tuple carry distinct meanings, using namedtuple instead of a plain tuple makes the code easier to read and more Pythonic. For example:
from collections import namedtuple
# The variable name usually matches the first argument to namedtuple, but it does not have to
Student = namedtuple('Student', 'id name score')
# or: Student = namedtuple('Student', ['id', 'name', 'score'])
students = [(1, 'Wu', 90), (2, 'Xing', 89), (3, 'Yuan', 98), (4, 'Wang', 95)]
for s in students:
    stu = Student._make(s)
    print(stu)
# Output:
# Student(id=1, name='Wu', score=90)
# Student(id=2, name='Xing', score=89)
# Student(id=3, name='Yuan', score=98)
# Student(id=4, name='Wang', score=95)
In the example above, Student is a namedtuple class. Its instances behave like tuples: values can be fetched directly by index, and they are read-only. Values can also be fetched by field name (for example stu.name or stu.score), which is much easier to understand than a bare tuple because the meaning of each value is explicit.
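The tuple-like and read-only behaviour can be tried out with a short sketch (Python 3). It reuses the Student definition from the example; the other variable names are introduced only here. _replace and _asdict are standard namedtuple methods.

```python
from collections import namedtuple

Student = namedtuple("Student", "id name score")
stu = Student(1, "Wu", 90)

# Index access and attribute access return the same field.
same = stu[1] == stu.name

# Fields are read-only: assignment raises AttributeError.
try:
    stu.score = 100
    read_only = False
except AttributeError:
    read_only = True

# _replace returns a new instance; the original is left untouched.
updated = stu._replace(score=100)

# Unpacking works exactly as it does for a plain tuple.
sid, name, score = stu

print(same, read_only)         # True True
print(updated)                 # Student(id=1, name='Wu', score=100)
print(stu._asdict()["score"])  # 90
```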