In Python, the properties of data and the methods of handling data are collectively referred to as attributes. In fact, the method is just a callable property. In addition to these two, we can also create attributes, without changing the class interface, using the access method (that is, read the value and set the value method) to modify the property
Python provides a rich API for controlling access to properties and implementing dynamic properties. When we access the Data property of obj, a similar Obj.data,python interpreter calls a special method, such as __getattr__ or __setattr__, to compute the property. A user-defined class can implement a virtual property through the __getattr__ method, and the value of the property is calculated instantly when the accessed property does not exist
We first download a complex JSON file from the remote and save it locally
From urllib.request import urlopenimport warningsimport osimport jsonurl = ' http://www.oreilly.com/pub/sc/osconfeed ' JSON = ' Data/osconfeed.json ' def load (): if not os.path.exists (JSON): msg = ' downloading {} to {} '. Format (URL, JSO N) Warnings.warn (msg) # <1> with Urlopen (URL) as remote, open (JSON, ' WB ') as Local: # <2>< C6/>local.write (Remote.read ()) with open (JSON, encoding= "Utf-8") as FP: return json.load (FP) # <3>
Here, we show only part of the JSON file's contents
{"Schedule": {"conferences": [{"Serial": "{"}], "events": [{"Serial": 34505, "name": "Why schools don′t use Open Source to T Each programming "," Event_type ":" 40-minute Conference Session "," Time_start ":" 2014-07-23 11:30:00 "," Time_stop ":" 2014-07-23 12:10:00 "," venue_serial ": 1462," description ":" Aside from the fact "," school programming ... "," Website_ URL ":" http://oscon.com/oscon2014/public/schedule/detail/34505 "," Speakers ": [157509]," categories ": [" Education "]} ], "Speakers": [{"Serial": 157509, "name": "Robert Lefkowitz", "photo": null, "url": "http://sharewave.com/", "position": "CTO", "Affiliation": "Sharewave", "Twitter": "Sharewaveteam", "Bio": "Robert′r0ml′lefkowitz is the CTO at Sharewave, a STA Rtup ... "}]," venues ": [{" Serial ": 1462," name ":" F151 "," category ":" Conference Venues "}]}}
We can see the file's JSON file inside the first key is schedule, the value of this key is also a dictionary, the dictionary also has four keys, namely "conferences", "events", "speakers" and "venues", The four keys correspond to a single list, with hundreds or thousands of records in the complete set of data. However, there is only one record in the list for the "conferences" key, each of the 4 lists has a field named "Serial", which is the unique identifier of the element in each list.
Below, let's print the length of the list for "conferences", "events", "speakers" and "venues"
feed = Load () print (sorted (feed["Schedule"].keys ())) print (feed["Schedule" ["Speakers"][-1]["name"]) print (feed[" Schedule "[" Speakers "][40][" name "]) for key, value in sorted (feed[" Schedule "].items ()): print (" {: 3} {} ". Format ( Len (value), key))
Operation Result:
[' Conferences ', ' events ', ' speakers ', ' venues '] Carina C. Zonatim Bray 1 conferences494 events357 speakers Venues
From the above example, we can see that if you want to access a key, you must use the lengthy notation of feed[' Schedule ' [' Events '][40][' name '], we can try to implement a Frozenjson class to wrap our original JSON data , and then to feed. Access to the dictionary in the form of schedule.events
From collections import Abcclass Frozenjson: def __init__ (self, mapping): self.__data = dict (mapping) # <1> def __getattr__ (self, name): # <2> if Hasattr (self.__data, name): return GetAttr ( Self.__data, name) else: return Frozenjson.build (Self.__data[name]) @classmethod def build (CLS, obj): # <3> if isinstance (obj, ABC. Mapping): return cls (obj) elif isinstance (obj, ABC. mutablesequence): return [Cls.build (item) for item in obj] else: return obj
- The Frozenjson class receives a dictionary and copies a copy of the dictionary as its own property
- When we call a property of an instance of Frozenjson, such as frozenjson.attr, if attr is not a property of the instance itself, the __getattr__ method is called, in which we implement the first check that attr is not a property of the dictionary. If it is, returns the property, if it is not a key to the dictionary as the property to be accessed, takes its value out of the incoming Frozenjson.build () method and returns
- Frozenjson.build () is a class method that receives an object that, if it is a Dictionary object, invokes its own initialization method, iterates over all elements of the object, and then re-passes the element into its own build () method, and finally, If the object is neither a dictionary nor an iterative object, the original object is returned without modification
Now, let's try the Frozenjson class, we get the original JSON file, we pass in the Frozenjson class to initialize a Frozenjson instance, we access the keys in the original dictionary by Obj.attr Way, we can also call key (), items () and other methods
Raw_feed = Load () feed = Frozenjson (raw_feed) print (len (feed). schedule.speakers)) print (sorted (feed). Schedule.keys ()))) for key, value in sorted (feed. Schedule.items ()): print ( ' {: 3} {} '. Format (len (value), key)) print (feed. Schedule.speakers[-1].name) talk = feed. Schedule.events[40]print (talk) print (Talk.name)
Operation Result:
357[' conferences ', ' events ', ' speakers ', ' venues '] 1 conferences494 events357 speakers, Venuescarina C. zona< Class ' __main__. Frozenjson ' >there *will* be Bugs
The key to the Frozenjson class is the __getattr__ method, and the interpreter calls a special __getattr__ method only if the property cannot be obtained in the usual way (that is, the specified property cannot be found in an instance, class, or superclass). Reading a nonexistent property throws a Keyerror exception instead of a Attributeerror exception that is usually thrown.
The Frozenjson class has only two methods (__init__ and __getattr__) and one instance property __data. Therefore, trying to get additional properties triggers the interpreter to call the __getattr__ method. This method first looks at whether the Self.__data dictionary has a property (not a key) that specifies the name, so that the Frozenjson instance can handle all the methods of the dictionary, such as delegating the items method to the Self.__data.items () method. If Self.__data does not specify the name of the genus
, then the __getattr__ method takes that name as the key, gets an element from the Self.__data, and passes it to the Frozenjson.build method. This allows you to drill down into the nested structure of the JSON data and use the class method build to transform each layer of nesting into a Frozenjson instance.
When we pass in the Dictionary object contains the key is a keyword, such as the following example, if we want to access Grad.class, is bound to error, this time we can only through the GetAttr () method to access
Grad = Frozenjson ({"Name": "Jim Bo", "Class": 1982}) Print (GetAttr (Grad, "class"))
Alternatively, we can transform the __init__ method, when checking out a key is a keyword, automatically add an underscore _, so that when we want to access the class attribute in Grad, directly with the Grad.class_ on the line
def __init__ (self, mapping): Import Keywordself.__data = {}for key, value in Mapping.items (): If Keyword.iskeyword (key): Key + = "_" Self.__data[key] = value
However, if sometimes the key we want to access is not a valid identifier, such as 2_a is also a key passed in the dictionary, but it is not a valid identifier, this if the call grad.2_a will throw Syntaxerror:invalid token error
Grad = Frozenjson ({"Name": "Jim Bo", "Class": 1982, "2_a": "Hello"}) print (GetAttr (grad, "2_a"))
So, we can only get the value through the GetAttr () method, or we, as before, by changing the key value to legal, then to access, we re-frozenjson the initialization of the __init__ method
def __init__ (self, mapping): Import Keywordself.__data = {}for key, value in Mapping.items (): If Keyword.iskeyword (key): Key + = "_" If not Str.isidentifier (key): Key = "_{0}". Format (key) Self.__data[key] = value
This method iterates through all the keys of the mapping, when the key is the keyword, underline the end of the key, when the key is not the keyword, underline the key's head, and then we access the class and 2_a properties.
Grad = Frozenjson ({"Name": "Jim Bo", "Class": 1982, "2_a": "Hello"}) print (Grad.class_) print (grad._2_a)
Operation Result:
1982hello
The situation can be slightly more complicated when the key is used as an identifier, because the key can contain a multiplication sign or a plus sign, and if it contains some special symbols, it can only be obtained by GetAttr ().
We usually say that __init__ is called the construction method, in fact, is used to build the instance is a special method __new__ method, this is a class method, in a special way, so do not add @classmethod adorner, this method will return an instance, The instance is passed in as the first argument (that is, self) into the __init__ method. Because the __init__ method is called to pass in the instance, and it is forbidden to return any values, so __init__ is actually the initialization method, the real construction method __new__, we almost do not need to write the __new__ method, because the object class inherits the implementation is sufficient.
Class Foo (object): def __init__ (self, bar): Self.bar = Bardef object_maker (the_class, Some_arg): new_ Object = the_class.__new__ (the_class) if Isinstance (New_object, The_class): the_class.__init__ (New_object, Some_arg) return new_objectdemo1 = foo ("bar1") Demo2 = Object_maker (Foo, "Bar2") print (demo1) print (demo1.bar) print (DEMO2) print (Demo2.bar)
Operation Result:
<__main__. Foo object at 0x0000005184eb1f98>bar1<__main__. Foo Object at 0x0000005184eb11d0>bar2
As you can see, we can either initialize an object with the Foo class, or we can pass Foo and the required parameters into the Object_maker method to construct an object we need
Now, let's replace the build () method of the Frozenjson class with the __new__ method
From collections import Abcclass Frozenjson: def __new__ (CLS, arg): if Isinstance (ARG, ABC. Mapping): return Super (). __new__ (CLS) elif isinstance (ARG, ABC. mutablesequence): return [CLS (item) for item in ARG] else: return arg def __init__ (self, mapping): self.__data = dict (mapping) def __getattr__ (self, name): if Hasattr (self.__data, name): return GetAttr (self.__data, name) else: return Frozenjson (Self.__data[name])
In __getattr__, if the access key is not a property of the __data itself, we no longer call the Frozenjson.build () method to pass in, but instead pass the argument into the Frozenjson constructor method, This method may return a Frozenjson instance, or it may not be, as we all know, when Python constructs an instance, it first calls the __new__ method to return an instance, and then initializes the properties of the instance with the __init__ method, in Frozenjson, When ARG is only a mapping type, the Frozenjson instance is returned, and when Arg is a list or other type, the returned object is not Frozenjson, and when the Python interpreter gets the object, it will compare __new__ The returned instance is not the same type as the instance it is creating, only the same type, the Python interpreter then calls __init__ for initialization, or directly returns the instance from the __new__ method
Shelve is similar to a persistent dictionary, and he has an open () function that takes a parameter that is the file name and then returns a shelve. Shelf instance, we can use him to store some key value pairs, when the storage is complete, call the close function to close, shelve has the following features:
- Shelve. Shelf is ABC. Mutablemapping, thus providing an important way to deal with the type of mapping
- In addition, shelve. The Shelf class also provides several ways to manage I/O, such as sync and close; It is also a context manager
- As soon as the new value is assigned to the key, the key and value are saved
- Key must be a string
- The value must be an object that can be processed by the Pickle module (the module of the serializable object)
Back to our previous download from the Web JSON file, before we parse this file, the first key inside the file is schedule, and this key corresponding to the dictionary has four keys conferences, events, speakers, venues, and these four keys corresponding to the value, It's a list of many dictionary objects, and we don't have to know exactly what this file is about, now we just need to know a little bit about every Dictionary object in the list of four keys for conferences, events, speakers, venues, Contains a key called serial, the key corresponding to the value is a number, now, let us traverse the schedule under the four keys, and the list of each Dictionary object under the serial key corresponding to the value of the combined knot
Class Record: def __init__ (self, **kwargs): # <6> self.__dict__.update (Kwargs) def load_db (db): raw_data = Load () # <3> for collection, rec_list in raw_data[' Schedule '].items (): Record_type = Collection[:-1] # <4> for record in rec_list: key = ' {}.{} '. Format (Record_type, record[' serial ') # <5> record[' serial '] = key Db[key] = record (**record) Import shelvedb_name = ' data/schedule1_db ' CONFERENCE = ' conference.115 ' db = Shelve.open (db_name) # <1>if CONFERENCE not in DB: # <2> load_db (db)
- Request a Shelve.shelf instance based on db_name first
- Get the db (i.e. shelve. Shelf instance), check if the conference.115 key is in db, and if not, call the load_db () method to start loading the data
- Get the JSON file again using the load () method
- We traverse the JSON file under the schedule corresponding four keys, respectively, is conferences, events, speakers, venues, and collection[:-1] represents the removal of the last letter of the four keys, that is, s, and then stored in the key
- Rec_list represents a list of the 4 keys that contain the Dictionary object, we traverse the list, take out the serial key of each Dictionary object, and combine with key, for example: "conferences": [{"Serial": 115},{"Serial": 116} ] will form two keys, respectively, conference.115,conference.116, of course, in our JSON file, conferences this key has only one {"Serial": 115} object, and does not have {"Serial": 116} Object, this is just an example, after which we replace the original serial value with a key and initialize a Record object
- We pass a dictionary to the initialization method of the record, Self.__dict__.update (Kwargs), which initializes all keys in the dictionary Kwargs to the property of the Record object, that is, self.__dict__ This dictionary, which has all the properties of this object
Then we try to access this JSON file through shelve
Speaker = db[' speaker.3471 ']print (Type (speaker)) print (Speaker.name, speaker.twitter) db.close ()
Operation Result:
<class ' Schedule1. Record ' >anna Martelli ravenscroft Annaraven
As you can see here, speakers, the name of the dictionary where serial is 3471 and whether the value of Twitter corresponds to the value one by one that we print out, here's another thing to remember is that after you open the DB, remember to close the DB
There are two keys in every dictionary under events, one is venue_serial, the other is speakers, let's extend the previous record class so that the venue or speakers we visit under event is no longer a cold ID, Instead, an entity dictionary associated to venues or speakers
As shown, we have expanded two classes on the basis of the original record class, namely Dbrecord and Event,dbrecord inherit from the record, and event inherits from Dbrecord
Class Record: def __init__ (self, **kwargs): self.__dict__.update (Kwargs) def __eq__ (self, Other): If Isinstance (Other, Record): return self.__dict__ = = other.__dict__ else: return notimplemented
The first is the record class, and we see that there is no change in the __init__ method, just a __eq__ method that compares the __dict__ contained in the two record (that is, the properties of the class itself) for equality, and if not equal, returns a notimplemented, Here's more about notimplemented, the built-in constant.
Class A: def __init__ (self, num): self.num = num def __eq__ (self, Other): print ("Call A __eq__") Return Notimplementedclass B: def __init__ (self, num): self.num = num def __eq__ (self, Other): print (" Call B __eq__ ") return self.num = = Other.numa = A (1) b = B (1) print (A = = b)
Operation Result:
Call A __eq__call B __eq__true
As you can see, the __eq__ method of Class A, regardless of what is passed in, will eventually return a notimplemented, and when the Python interpreter receives a notimplemented constant, it calls b.__eq__ (a) to compare
Class Missingdatabaseerror (RuntimeError): "" "" "" raised when a database is required and was not set. "" Class Dbrecord (Record): __db = None @staticmethod def set_db (db): # <1> dbrecord.__db = db @staticmethod def get_db (): # <2> return dbrecord.__db @classmethod def fetch (CLS , ident): # <3> db = cls.get_db () try: return db[ident] except TypeError: if DB is None: # <4> msg = "Database not set; Call ' {}.set_db (my_db) ' " Raise Missingdatabaseerror (Msg.format (cls.__name__)) else: raise def __repr__ (self): # <5> if hasattr (self, ' serial '): cls_name = self.__class__.__name__ Return ' <{} serial={!r}> '. Format (cls_name, self.serial) else: return Super (). __REPR__ ()
- Sets the data source, which is shelve. Shelf instances
- Get Data source
- Pass in a key to get the corresponding value from the data source
- Throw a missingdatabaseerror error when the data source is None
- Redefine printing information for the current record object
Class Event (Dbrecord): @property def venue (self): key = ' venue.{} '. Format (self.venue_serial) h = self.__class__.fetch (key) return Self.__class__.fetch (key) @property def speakers (self): if isn't hasattr (self, ' _speaker_objs '): spkr_serials = self.__dict__[' Speakers '] fetch = Self.__class__.fetch SELF._SPEAKER_OBJS = [Fetch (' speaker.{} '. Format (key)) for key in Spkr_serials] return SELF._SPEAKER_OBJS def __repr__ (self): if Hasattr ( Self, ' name '): cls_name = self.__class__.__name__ return ' <{} {!r}> '. Format (Cls_name, Self.name) Else: return Super (). __REPR__ ()
The event class is primarily used to traverse the dictionary under events, and when we visit the Venue property, he combines the values of his own venue_serial to form venue. {venue_serial}, return the data source again, similarly speakers, when you want to access speakers from the event, first check whether _SPEAKER_OBJS exists, if not, then take out its own speakers object, this is a list, It contains the IDs of multiple speakers, and then traverses the list to fetch the corresponding speakers from the data source and cache the _speaker_objs property
Here, we need to revamp the previous load_db method
def load_db (db): raw_data = Load () for collection, rec_list in raw_data[' Schedule '].items (): Record_type = Collection[:-1] # <1> cls_name = Record_type.capitalize () # <2> cls = Globals (). Get (Cls_ Name, Dbrecord) # <3> if Inspect.isclass (CLS) and Issubclass (CLS, Dbrecord): # <4> Factory = CLS else: factory = Dbrecord for record in rec_list: # <5> key = ' {}.{} '. Format (Record_type, record[' serial ')) record[' serial '] = key Db[key] = Factory (**record) # <6> Import shelvedb = Shelve.open (db_name) if CONFERENCE not in db: load_db (db) dbrecord.set_db (db) # <9>
- Collection is still the schedule under the four keys, respectively conferences, events, speakers, Venues,collection[:-1] can be the last of the four keys to remove the S character
- The four keys of the trailing s will be removed to capitalize the first letter
- After the above two steps, four keys are conference, Event, Seaker, Venue, check whether there is a global domain with the four key names of the same type, if there is to take out, no return dbrecord, because we previously defined the Event class, So when the events key is traversed, the previously defined event class is removed, while the other three keys return by default Dbrecord
- Checks if the type removed from the global domain is a type object and is a subclass of Dbrecord, and assigns a value to factory
- Iterate through all the dictionaries of the four keys, and when traversing to events this key, the event type when the object is stored in the data source, while the other three keys are the Dbrecord type
Then we try to get venue and speakers under the event
event = Dbrecord.fetch (' event.33950 ') print (event) print (Type (event)) print (event.venue) print (event.venue.name) Print (event.speakers) for SPKR in event.speakers: print (' {0.serial}: {0.name} '. Format (SPKR))
Operation Result:
<event ' There *will* be Bugs ' ><class ' schedule2. Event ' ><dbrecord serial= ' venue.1449 ' >portland 251[<dbrecord serial= ' speaker.3471 ';, <DbRecord Serial= ' speaker.5199 ' >]speaker.3471:anna Martelli ravenscroftspeaker.5199:alex Martelli
Python Dynamic properties and features (i)