Fluent_python_section2 data types, 03-dict-set, dictionaries, and collections

Source: Internet
Author: User
Tags hashable

Dictionaries and collections

Both Dict and set are based on hash table implementations

1. Outline:
  1. Common dictionary methods
  2. How to handle a key that cannot be found
  3. variants of dict types in the standard library
  4. Set and fronzenset types
  5. How Hash Table Works
  6. The potential impact of Hash table
Dictionary dict2. Generic Mapping type

In COLLECTIONS.ABC, there are mapping and mutablemapping, two abstract base classes, the main function of which is to define the basic API of the mapping type as a formal document.

= {}#判定数据是不是广义上的映射类型#不用type的原因是这个参数可能不是dictisinstance(my_dict, abc.Mapping)

The mapping types in the standard library are all based on dict extensions, so they have a common limitation, and only the hashable data type can be used as key (the hold key is unique), the value does not have this limit.
Https://docs.python.org/3/glossary.html#term-hashable
Hashable objects to implement the __hash__ () and __eq__ () methods

In general, immutable types are hashtable, but there are special cases where dict is immutable, but the elements inside it can be mutable types.

= (12, (3040))hash(tt)#Error, dict里面的list不是可散列的类型= (12, [3040])hash= (12frozenset([3040]))hash(tf)

Different ways to create dictionaries

>>>A= Dict(One=1, the other=2, three=3)>>>B={' One ':1,' both ':2,' three ':3}>>>C= Dict(Zip([' One ',' both ',' three '], [1,2,3]))>>>D= Dict([(' both ',2), (' One ',1), (' three ',3)])>>>E= Dict({' three ':3,' One ':1,' both ':2})>>>A==B==C==D==ETrue
3. Dictionary derivation formula (dict comprehension, Dictcomp)

List derivation using [], tuple derivation (), dictionary derivation with {}

Dictcomp can be constructed from any iterative element with Key-value as an element in the dictionary

Dial_codes=[    ( the,' China '),    ( the,' India '),    (1,' states '),    ( +,' Indonesia '),    ( -,' Brazil '),    ( the,' Pakistan '),    (880,' Bangladesh '),    (234,' Nigeria '),    (7,' Russia '),    (Bayi,' Japan '),]#dictcomp, {}Country_code={Country:code forCode, countryinchDial_codes}Print(Country_code)Country names for #以大写打印code < 65Temp={Code:country.upper () forCountry, codeinchCountry_code.items ()}Print(temp)
4. Common mapping Type methods

Common methods of P137,dict, Collections.defaultdict and Collections.ordereddict in Chinese electronic books

5. Use SetDefault to process keys that cannot be found (key)

D[key] Keyerror exception will be thrown when not found, can be replaced with D.get (key, default), to find the key to a default return value. However, to update the value of a key, using __getitem__ and get is both unnatural and inefficient. So D.get is not the best way to deal with keys that can't be found.

Chinese e-book P139, treated with SetDefault

= {‘name‘:‘Allen‘‘age‘:18}#setdefault方法如果有key-value就不动,没有就添加。这个方法有返回值print(dict1)dict1.setdefault(‘age‘30)dict1.setdefault(‘xxx‘22)print(dict1)
6. Use Defaultdict to handle the keys you can't find. 7 not understood. Use __missing__ to process keys that cannot be found

Not only dictionaries, but all mapping types are involved in the __missing__ method when dealing with keys that are not found.
That is, if a class inherits Dict, and the inheriting class provides the __missing__ method, then Python automatically calls it when __getitem__ encounters a key that cannot be found, rather than throwing a keyerror exception.
The Note:__missing__ method is only called by __getitem__, such as D[key]. There is also no effect on Get or contains(the in operator is used with this method).

Like K in My_dict.keys () is highly efficient in Python3 because Dict.keys () returns dictionary-view-objects. In Python2, Dict.keys () returns a list, and the K in my_list operation needs to scan the entire list.

8. Variants of the Dictionary

Different mapping types:

  1. Collections. Ordereddict: The order is maintained when the key is added.
  2. Collections. Chainmap: can accommodate several different mapping objects, which are searched as a whole when key lookup operations are performed. This feature is useful when you are interpreting a language with nested scopes, and you can use a mapping object to represent the context of a scope.
  3. Collections. Counter this mapping type can be used to count hashable objects, or as multiple sets. Counter implements the + and-operators to merge Records, and Most_common ([n]) returns the most common n keys and their counts in a map in order.
fromimport=‘aabbccdd‘= Counter(str1)print(type(counter_str))print(counter_str)counter_str.update(‘aabbccdd‘)print(counter_str)print(counter_str.most_common(3))

4.collections. UserDict, this class implements the standard dict in pure python. Let the user inherit to write the subclass.

9. Custom Mapping Types

All mapping types are based on Dict. Custom mapping type, with UserDict as the base class, is more convenient than the common Dict base class.
Because if inherited from Dict, Dict sometimes take some shortcuts on some methods, causing subclasses to override these methods, but userdict will not have these problems. See the Chinese e-book in detail in section 12.1.
Note:userdict is not a subclass of Dict, it is a subclass of Mutablemapping, but one of the properties called data is an instance of Dict, which is where userdict stores data. The benefit is that the userdict subclasses implement __setitem__ and __contains__ when the code is more concise.

Example 1. Add, update, or query operations, STRKEYDICT will convert non-string keys to strings

import  collectionsclass  strkeydict (collections. userdict): def  __missing__ (self , key): if isinstance  (Key, str ): raise  keyerror  (key) return  self  [str
      (key)] def  __contains (self , key): return
       str  (key) in  self . Data def  __setitem__  (self , Key, value): self . Data[str  (key)] =  value  

The remaining mapping types in Strkeydict are inherited from UserDict, mutablemapping, and mapping superclass classes. In particular, mapping, the ABC (abstract base class), provides a useful API in the form of documentation. The following two methods are worth noting:

  1. Mutablemapping.update. Can be used directly, can use in __init__. The principle is self[key] = value to add a new value, so it is using the __setitem__ method.
  2. Mapping.get.
10. Types of immutable mappings

There are immutable sequence types, but there are no immutable dictionary types in the standard library, but you can replace them with aliases.

Dictionaries can be dynamically modified, but do not want to be modified by the user, so a read-only mapping view is required. Types. Mappingproxytype returns a read-only map view.

fromimport= {‘1‘‘A‘= MappingProxyType(d)print(‘d_proxy:‘, d_proxy)#MappingProxyType只读,没有赋值操作#d_proxy[2] = ‘x‘d[2=‘x‘#MappingProxyType是动态的,也就是对d所做的任何改动都会反馈到它上面。print(‘d_proxy:‘, d_proxy)
Set set11. Collection

The essence of set is the aggregation of many unique objects. Therefore, the collection can go to the heavy

= [‘spam‘‘spam‘‘eggs‘‘eggs‘=set(l1)print(s1)print(list(s1))

The elements in the set must be hashable (guaranteed to be unique), but the set itself is non-hashed, but Frozenset is hashable. Therefore, you can create a set that contains different frozenset. The elements inside the set are hashable, so the search speed is very fast.

Set also has a number of basic infix operators, a | B-Set, A & B intersection, A-a difference set. Unnecessary loops and logical operations can be omitted.

Example 1. The number of occurrences of the needles element in the haystack, two variables are set types

=len& haystack)

Example 2. Needles and Haystack are any two iterations of an object

=0forin needles:    ifin haystack:        +=1

Example 3. Convert to set re-operation

=len(set&set(haystack))#另一种写法=len(set(needles).intersection(haystack))
12. Set literal

{1}, {1, 2} is the literal of the collection, set () is the empty set, {} is a null dictionary.

The literal {-*-a} is faster than the construction method (set ([+])). The collection literal, Python uses the Build_set bytecode to create the collection.

fromimport disdis(‘{1}‘)dis(‘set([1])‘)

Creating Frozenset can only be constructed using the constructor method

frozenset(range(10))
13. Set deduction formula (set comprehension, Setcomps)

List derivation by [], tuple derivation with (), dictionary and set derivation with {}

Example 1. Create a Latin-1 character set with Setcomps

#获取字符的名字namefromimport name#把编码32~255之间的字符的名字里有"SIGN"单词挑出来,放到一个集合里面= {chrforinrange(32256if‘SIGN‘in name(chr(i),‘‘)}print(s1)print(name(‘+‘,‘‘))
14. Operation of the collection

COLLECTIONS.ABC (abstract base class), which provides API information.

Set of mathematical operations: Chinese e-book P161
Comparison operations for collections: Chinese e-book P162
Other methods of collection: Chinese e-book P163

principles of Dict and set

Both the dict and the set are implemented using hash table. For example in operation, so fast.
Note: There is no hash table behind the list to support the in operator, and each search is sequential traversal.

16. Hash table in the dictionary

If object A = = object B, then hash (A) = = Hash (b). Call hash (), which is actually running __hash__.

Dict value principle using hash list algorithm, Chinese electronic book P169

17. Using a hash table to achieve the benefits and limitations of dict

Chinese e-book P171

  1. Key must be hashable.
  2. The dictionary has a huge overhead in memory because hash table is a sparse array
  3. Key query is fast because hash table is the space change time of classic
  4. The order of keys depends on the order of additions
  5. Adding a new key to a dictionary may change the order of existing keys, and do not iterate and modify the dictionary at the same time. Because the Python interpreter might make a decision to expand the dictionary, a hash conflict may occur when the old table is copied to a new, larger hash list.
    6. The keys (),. Items (),. VALUES () return the dictionary view, and the dynamic feedback dictionary changes.
18. Using a hash table to achieve the benefits and limitations of Set

The implementations of set and Frozenset also depend on the hash table, but only the element's references (as in the dictionary, not the corresponding values) are stored in their hashes.
Note: Before set joins Python, we use the dictionary with meaningless value as a collection.

The advantages and disadvantages of implementing dict with a hash table are similar to the following:

  1. The elements in the collection must be hashed.
  2. The collection consumes memory very much.
  3. It is very efficient to judge whether an element exists in a collection.
  4. The order of the elements depends on the order in which they are added to the collection.
  5. Adding elements to a collection may change the order of existing elements in the collection.

Fluent_python_section2 data types, 03-dict-set, dictionaries, and collections

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.