The dictionary (dict) is one of the most commonly used data structures in Python. Together with numbers, strings, lists, and tuples, it is one of the five basic built-in types. Elements in a dictionary are accessed by key rather than by offset, as they are in a list. This article summarizes some common Pythonic usages of dictionaries.
0. Use in / not in to check whether a key exists in the dictionary
When checking whether a key exists in a dictionary, beginners often fetch all of the dictionary's keys as a list and then search that list for the key:
dictionary = {}
keys = dictionary.keys()
# key is the key being looked up
for k in keys:
    if key == k:
        print(True)
        break
A more Pythonic approach is to use the in keyword to test whether a key exists in the dictionary:
if key in dictionary:
    print(True)
else:
    print(False)
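As a small illustration, the sketch below (assuming Python 3; the scores dictionary and its contents are made up for the example) shows in and not in testing keys directly, without building a key list first.

```python
# Hypothetical example data, not from the original article.
scores = {"alice": 90, "bob": 75}

# `in` tests the keys of the dictionary with a hash lookup.
has_alice = "alice" in scores
has_carol = "carol" in scores

# `not in` is the negated form.
missing_carol = "carol" not in scores

# `in` looks at keys only, never at values.
has_value_90 = 90 in scores

print(has_alice, has_carol, missing_carol, has_value_90)
```

Because the test is a hash lookup rather than a scan over a list, its cost does not grow with the number of keys in the typical case.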
1. Initialize dictionary values with setdefault()
When using a dictionary, you often need to update it dynamically. In the following code, if the key is not in the dictionary, it is added with an empty list [] as its value, and then an element is appended to that list:
dictionary = {}
if "key" not in dictionary:
    dictionary["key"] = []
dictionary["key"].append("list_item")
Although this code has no logic errors, setdefault gives a more Pythonic way to write it:
dictionary = {}
dictionary.setdefault("key", []).append("list_item")
When setdefault is called, it first checks whether the key exists. If it does, the method changes nothing and returns the existing value. If it does not, the key is created with the second argument, [], as its value, and that value is returned.
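A typical use of this pattern is grouping items. The following is a minimal sketch, assuming Python 3; the word list and variable names are invented for the example.

```python
# Hypothetical word list used only for illustration.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = {}
for word in words:
    # Create the list the first time a letter is seen, then append to it.
    groups.setdefault(word[0], []).append(word)

# When the key already exists, setdefault ignores the default
# and returns the stored value.
existing = groups.setdefault("a", ["unused"])

print(groups)
print(existing)
```

Note that the default argument is evaluated on every call, even when the key already exists, so an expensive default is built and thrown away each time.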
2. Initialize the dictionary with defaultdict()
Sometimes you want every key to start from the same default value. For example, suppose a batch of users each starts with a credit score of 100, and you now want to add 10 points to user a:
d = {}
if 'a' not in d:
    d['a'] = 100
d['a'] += 10
This code also works without problems, but a more Pythonic way to write it is:
from collections import defaultdict
d = defaultdict(lambda: 100)
d['a'] += 10
defaultdict lives in the collections module and is a subclass of dict. Its signature is:
class collections.defaultdict([default_factory[, ...]])
The first parameter, default_factory, is a factory function. It is called with no arguments whenever a missing key is accessed, and its return value becomes the value for that key. The remaining arguments are the same as those accepted by dict().
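To make the role of default_factory and of the remaining arguments concrete, here is a short sketch, assuming Python 3. The variable names and sample data are illustrative only.

```python
from collections import defaultdict

# default_factory is called with no arguments when a missing key is accessed.
credit = defaultdict(lambda: 100)
credit["a"] += 10          # "a" is missing, so it starts from 100

# Built-in types also work as factories: int() returns 0, list() returns [].
counts = defaultdict(int)
for ch in "banana":
    counts[ch] += 1

# Remaining arguments are passed on to dict(), here as keyword arguments.
preset = defaultdict(list, x=[1], y=[2])
preset["z"].append(3)      # "z" is created on first access

print(credit["a"], dict(counts), dict(preset))
```

A membership test such as "b" in credit does not call the factory and does not create the key; only indexing with a missing key does.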
3. Iterate over large dictionaries using iteritems()
When iterating over a very large dictionary in Python 2, the items() method first builds a list of all key/value pairs in memory before the loop starts, which is both slow and wasteful of memory. The following code takes up about 1.6G of memory (why 1.6G? See http://stackoverflow.com/questions/4279358/pythons-underlying-hash-data-structure-for-dictionaries):
d = {i: i * 2 for i in xrange(10000000)}
for key, value in d.items():
    print("{0} = {1}".format(key, value))
Replacing items() with iteritems() gives the same result, but memory consumption is about 50% lower. Why is the gap so large? Because items() returns a list, all of whose elements are loaded into memory before iteration begins, whereas iteritems() returns an iterator that produces elements one at a time. (In Python 3, iteritems() no longer exists, and items() itself returns a lazy view.)
d = {i: i * 2 for i in xrange(10000000)}
for key, value in d.iteritems():
    print("{0} = {1}".format(key, value))
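If the same code has to run under both Python 2 and Python 3, one possible approach is a small wrapper. The helper name iter_items below is hypothetical, and this is only a sketch of the idea, not something the original article prescribes.

```python
def iter_items(d):
    """Return a lazy iterator over (key, value) pairs on Python 2 and 3."""
    if hasattr(d, "iteritems"):
        return d.iteritems()      # Python 2: lazy iterator
    return iter(d.items())        # Python 3: items() is a lazy view

# Small illustrative dictionary; the real benefit shows up with large ones.
squares = {i: i * i for i in range(5)}

total = 0
for key, value in iter_items(squares):
    total += value

print(total)
```

Either branch avoids building a full list of pairs up front, which is the point of the technique described above.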
4. Efficient merging of dictionaries
Common methods
Merging multiple dictionaries can be accomplished with one line of code:
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = dict(x.items() + y.items())
This looks very Pythonic, but on closer analysis it is not very efficient. In Python 2.7, items() returns a list, and adding two lists creates a third, so three list objects are in memory at once. If each of the two lists takes 1G, building the dictionary this way needs about 4G of space. In addition, this code raises an error in Python 3, because items() there returns a dict_items object, not a list:
>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'
In Python 3, you need to convert the results to list objects explicitly:
z = dict(list(x.items()) + list(y.items()))
Pythonic method
Python 3.5 provides a new, more Pythonic approach:
z = {**x, **y}
However, given that most systems are still based on Python 2, a more compatible Pythonic approach is:
z = x.copy()
z.update(y)
Of course, you can encapsulate it as a function:
def merge_dicts(*dict_args):
    """
    Accepts one or more dictionaries.
    Later dictionaries override earlier ones on duplicate keys.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

z = merge_dicts(a, b, c, d, e, f, g)
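The following sketch, assuming Python 3.5 or later, puts the three approaches side by side on the sample dictionaries used earlier and checks that they agree. The variable names via_unpacking, via_update, and via_function are invented for the example.

```python
def merge_dicts(*dict_args):
    """Merge any number of dicts; later ones take precedence."""
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

via_unpacking = {**x, **y}        # Python 3.5+

via_update = x.copy()             # works on Python 2 and 3
via_update.update(y)

via_function = merge_dicts(x, y)

print(via_unpacking)
print(via_unpacking == via_update == via_function)
```

In all three cases the value for the shared key 'b' comes from y, the right-most dictionary, and neither x nor y is modified.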
Other methods
There are other ways to merge dictionaries, although their performance is not necessarily the best. For example, Python 2.7 supports dictionary comprehensions:
{k: v for d in dicts for k, v in d.items()}
Python 2.6 and earlier versions, which lack dictionary comprehensions, can pass a generator expression to dict():
dict((k, v) for d in dicts for k, v in d.items())
Performance comparison
>>> import timeit
>>> min(timeit.repeat(lambda: {**x, **y}))  # Python 3.5
0.4094954460160807
>>> # merge_two_dicts is the copy()/update() approach applied to two dictionaries
>>> min(timeit.repeat(lambda: merge_two_dicts(x, y)))
0.5726828575134277
>>> min(timeit.repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
1.163769006729126
>>> min(timeit.repeat(lambda: dict((k, v) for d in (x, y) for k, v in d.items())))
2.2345519065856934
Using {**x, **y} directly in Python 3.5 is the fastest, followed by the copy()/update() approach and then the dictionary comprehension; passing a generator expression to dict() is the slowest.
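To repeat the comparison on your own machine, a script along the following lines could be used. It is a sketch assuming Python 3.5 or later; the definition of merge_two_dicts is an assumption based on the copy()/update() approach, and the number and repeat values are chosen only to keep the run short. Absolute timings will differ from those quoted above.

```python
import timeit

x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

def merge_two_dicts(a, b):
    # Assumed definition: the copy()/update() approach.
    z = a.copy()
    z.update(b)
    return z

candidates = {
    "unpacking": lambda: {**x, **y},
    "copy_update": lambda: merge_two_dicts(x, y),
    "dict_comprehension": lambda: {k: v for d in (x, y) for k, v in d.items()},
    "dict_generator": lambda: dict((k, v) for d in (x, y) for k, v in d.items()),
}

# Take the best of a few short runs for each candidate.
timings = {
    name: min(timeit.repeat(func, number=100000, repeat=3))
    for name, func in candidates.items()
}

# Print the candidates from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("{0:20s} {1:.4f}s".format(name, seconds))
```

Taking the minimum of several repeats, as the original measurements do, reduces the influence of other activity on the machine.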