Introduction to defaultdict and __missing__() in Python
Preface
Today's main character is defaultdict. We will also introduce the special method __missing__().
This article is adapted from the author's blog and shared for anyone who may find it useful. Without further ado, let's look at the details.
Default values are convenient
As we all know, accessing a key that does not exist in a Python dictionary raises a KeyError exception. It is often very convenient, however, for every key in a dictionary to have a default value. For example:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] += 1
This example counts how many times each word appears in strings and records the result in the counts dictionary. Each time a word appears, the value stored under that key in counts is incremented by 1. In practice, however, running this code raises a KeyError the first time any word is counted, because a Python dict has no default value for missing keys. You can verify this in the Python interactive shell:
>>> counts = dict()
>>> counts
{}
>>> counts['puppy'] += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'puppy'
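The same failure can be observed in a script rather than at the prompt. The following is a minimal sketch in Python 3 syntax; the variable name missing_key is only illustrative and does not come from the original article.

```python
counts = {}

try:
    # "+=" first reads counts['puppy'], which fails because the key is absent.
    counts['puppy'] += 1
except KeyError as exc:
    # The missing key is carried as the first argument of the exception.
    missing_key = exc.args[0]
    print('KeyError for key:', missing_key)
```

Because the read fails before any assignment happens, the dictionary is left unchanged.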
Use a conditional check
In this case, the first approach that comes to mind is to store the key in counts with an initial value of 1 the first time a word is counted. This requires a conditional check in the loop:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    if kw not in counts:
        counts[kw] = 1
    else:
        counts[kw] += 1

# counts:
# {'puppy': 5, 'weasel': 1, 'kitten': 2}
Use the dict.setdefault() method
You can also use the dict.setdefault() method to set the default value:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts.setdefault(kw, 0)
    counts[kw] += 1
The dict.setdefault() method takes two parameters: the first is the key and the second is the default value. If the given key is not in the dictionary, the method stores the default value under that key and returns it; otherwise, it returns the value already saved in the dictionary. Using the return value of dict.setdefault(), you can rewrite the body of the for loop more concisely:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] = counts.setdefault(kw, 0) + 1
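To make the two cases of dict.setdefault() concrete, here is a small sketch in Python 3 syntax. The variable names first and second are illustrative only.

```python
counts = {}

# Key absent: 0 is stored under 'puppy' and returned.
first = counts.setdefault('puppy', 0)

counts['puppy'] += 5

# Key present: the stored value is returned and the default is ignored.
second = counts.setdefault('puppy', 0)

print(first, second, counts)
```

The second call leaves the dictionary untouched, which is why the one-line loop body above is safe to run on every iteration.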
Use the collections.defaultdict class
Although the approaches above work around the lack of default values in dict to some extent, one may wonder whether there is a dictionary that provides default values itself. There is: collections.defaultdict.
The defaultdict class behaves like a dict, but it is initialized with a type:
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> dd
defaultdict(<type 'list'>, {})
The initializer of the defaultdict class accepts a type as its argument. When the accessed key does not exist, the type is instantiated to produce the default value:
>>> dd['foo']
[]
>>> dd
defaultdict(<type 'list'>, {'foo': []})
>>> dd['bar'].append('quux')
>>> dd
defaultdict(<type 'list'>, {'foo': [], 'bar': ['quux']})
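A common use of a list default is grouping items without checking whether a key exists first. The sketch below is in Python 3 syntax; the word list and the name by_initial are made up for illustration.

```python
from collections import defaultdict

words = ('puppy', 'kitten', 'weasel', 'panda', 'koala')

# Each new first letter gets a fresh empty list on first access.
by_initial = defaultdict(list)
for word in words:
    by_initial[word[0]].append(word)

print(dict(by_initial))
```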
Note that this kind of default value only takes effect when the key is accessed through dict[key] or dict.__getitem__(key). The reason is described in a later section.
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> 'something' in dd
False
>>> dd.pop('something')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'pop(): dictionary is empty'
>>> dd.get('something')
>>> dd['something']
[]
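The session above can also be expressed as a script. This is a sketch in Python 3 syntax, where the wording of the KeyError message may differ from the transcript; the variable names are illustrative.

```python
from collections import defaultdict

dd = defaultdict(list)

in_before = 'something' in dd   # membership test: no default is created
got = dd.get('something')       # get(): returns None, no default is created

try:
    dd.pop('something')         # pop(): raises KeyError for a missing key
    pop_raised = False
except KeyError:
    pop_raised = True

len_before = len(dd)            # still empty after the three calls above
value = dd['something']         # subscript access: the factory runs here
len_after = len(dd)

print(in_before, got, pop_raised, len_before, value, len_after)
```

Only the subscript access inserts the key, so the dictionary grows from zero to one entry on the last access.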
In addition to a type name, the initializer of the defaultdict class accepts any callable that takes no arguments. The return value of the callable is used as the default value, which makes defaults more flexible. The following example uses a custom zero-argument function, zero(), as the argument to the defaultdict initializer:
>>> from collections import defaultdict
>>> def zero():
...     return 0
...
>>> dd = defaultdict(zero)
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {})
>>> dd['foo']
0
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {'foo': 0})
Using collections.defaultdict to solve the original word-counting problem, the code is as follows:
from collections import defaultdict

strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = defaultdict(lambda: 0)  # Use lambda to define a simple zero-argument function
for s in strings:
    counts[s] += 1
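Since calling int() with no arguments returns 0, the built-in int type can serve as the factory in place of a lambda. The following sketch, in Python 3 syntax, is an alternative spelling of the same count and is not taken from the original article.

```python
from collections import defaultdict

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')

# int() returns 0, so every new word starts counting from zero.
counts = defaultdict(int)
for s in strings:
    counts[s] += 1

print(dict(counts))
```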
How is the defaultdict class implemented?
By now the usage of the defaultdict class should be clear. But how does defaultdict implement its default values? The key is the __missing__() method:
>>> from collections import defaultdict
>>> print defaultdict.__missing__.__doc__
__missing__(key) # Called by __getitem__ for missing key; pseudo-code:
  if self.default_factory is None: raise KeyError(key)
  self[key] = value = self.default_factory()
  return value
The docstring of __missing__() shows that when __getitem__() is called for a key that does not exist (dict[key] is simply shorthand for calling __getitem__()), it calls __missing__() to obtain the default value and adds the key to the dictionary.
For details about the __missing__() method, see the "Mapping Types - dict" section of the official Python documentation.
The documentation explains that, starting with version 2.5, if a subclass derived from dict defines a __missing__() method, then dict[key] calls __missing__() to obtain the default value when the accessed key does not exist.
So although dict supports the __missing__() method, dict itself does not define it; a derived subclass has to implement it. This is easy to verify:
>>> print dict.__missing__.__doc__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'dict' has no attribute '__missing__'
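The same check can be written without triggering an exception by using hasattr(). This is a small sketch in Python 3 syntax; the two variable names are illustrative.

```python
from collections import defaultdict

# dict supports the hook but does not define it itself.
dict_has_missing = hasattr(dict, '__missing__')

# defaultdict is a dict subclass that does define it.
defaultdict_has_missing = hasattr(defaultdict, '__missing__')

print(dict_has_missing, defaultdict_has_missing)
```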
We can experiment further by defining a subclass of dict, Missing, that implements the __missing__() method:
>>> class Missing(dict):
...     def __missing__(self, key):
...         return 'missing'
...
>>> d = Missing()
>>> d
{}
>>> d['foo']
'missing'
>>> d
{}
The result shows that the __missing__() method does take effect. Building on this, we modify __missing__() slightly so that, like the defaultdict class, the subclass stores a default value for a key that does not exist:
>>> class Defaulting(dict):
...     def __missing__(self, key):
...         self[key] = 'default'
...         return 'default'
...
>>> d = Defaulting()
>>> d
{}
>>> d['foo']
'default'
>>> d
{'foo': 'default'}
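This also explains why get() and the in operator never produce a default: only subscript access goes through __missing__(). The sketch below, in Python 3 syntax, uses a hypothetical subclass named Recording that logs every call to __missing__(); neither the class nor its missed attribute appears in the original article.

```python
class Recording(dict):
    """Hypothetical dict subclass that records each __missing__ call."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.missed = []  # keys for which __missing__ was invoked

    def __missing__(self, key):
        self.missed.append(key)
        self[key] = 'default'
        return 'default'


d = Recording()
d.get('a')       # does not call __missing__
'b' in d         # does not call __missing__
value = d['c']   # calls __missing__ and stores the default

print(d.missed, d)
```

Only the key reached through d['c'] shows up in the log and in the dictionary.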
Implement the defaultdict class in older versions of Python
The defaultdict class was added in version 2.5 and is not available in older versions, so it is useful to implement a compatible defaultdict class for them. This is actually quite simple. Although its performance may not match the built-in defaultdict class of version 2.5, it is functionally the same.
First, the __getitem__() method must call the __missing__() method when the key lookup fails:
class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)
Second, we need to implement the __missing__() method to set the default value:
class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value
Then, the initializer __init__() of the defaultdict class needs to accept a type or a callable as its argument:
class defaultdict(dict):
    def __init__(self, default_factory=None, *a, **kw):
        dict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        # Without a factory, behave like a plain dict (see the docstring above).
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value
Finally, putting it all together, the following code works on both new and old Python versions:
try:
    from collections import defaultdict
except ImportError:
    # Fallback for Python versions older than 2.5.
    class defaultdict(dict):
        def __init__(self, default_factory=None, *a, **kw):
            dict.__init__(self, *a, **kw)
            self.default_factory = default_factory

        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)

        def __missing__(self, key):
            # Without a factory, behave like a plain dict.
            if self.default_factory is None:
                raise KeyError(key)
            self[key] = value = self.default_factory()
            return value
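On any modern interpreter the import succeeds, so the fallback branch never runs. To exercise the fallback logic directly, the sketch below defines it under a hypothetical name, CompatDefaultDict, in Python 3 syntax. It includes the None check described in the __missing__() docstring. On current versions dict.__getitem__() already calls __missing__() for subclasses, so the try/except is redundant there but harmless.

```python
class CompatDefaultDict(dict):
    """Hypothetical stand-in for the fallback class, for testing only."""

    def __init__(self, default_factory=None, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        # Without a factory, behave like a plain dict.
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value


strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = CompatDefaultDict(int)
for s in strings:
    counts[s] += 1

# With no factory, a missing key still raises KeyError.
plain = CompatDefaultDict()
try:
    plain['absent']
    raised = False
except KeyError:
    raised = True

print(dict(counts), raised)
```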
References
https://docs.python.org/2/library/collections.html#collections.defaultdict
Summary
That is all for this article. I hope it is a useful reference for your study or work. If you have any questions, please leave a comment.