Dynamically adding attributes and generating objects in Python classes
This article works through the problem step by step, covering the following:
1. Main functions of the program
2. Implementation process
3. Class definition
4. Using a generator to dynamically update each object and return it
5. Removing unnecessary characters with strip
6. Matching strings with re.match
7. Using time.strptime to extract strings and convert them into time objects
8. Complete code
Main functions of the program
Suppose a document stores user information in a table-like layout: the first row lists the attributes, separated by commas (,), and every row from the second onward holds the corresponding values, with each row representing one user. How can we read this document and output one user object per row?
There are also four small requirements:
Each document is very large. If the objects generated from all rows were saved in a list at once, memory would be exhausted, so the program may hold only the object generated from one row at a time.
Each comma-separated string may be wrapped in double quotation marks (") or single quotation marks ('), for example "Zhang San"; the quotation marks must be removed. If the value is a number written like +000000001.24, both the + and the leading 0s must be removed to extract 1.24.
The document contains times, which may be written as a date alone or as a date followed by a time such as 2:23:56 (the formats handled later are YYYY-MM-DD and YYYY/MM/DD H:M:S). These strings must be converted to a time type.
There are many such documents with different attributes; for example, one holds user information and another holds call records. The attributes of the class must therefore be generated dynamically from the first line of each document.
Implementation Process
1. Class Definition
Because the attributes are added dynamically, the attribute-value pairs are added dynamically as well, so the class needs two member functions, updateAttributes() and updatePairs(). The list attributes stores the attribute names, and the dictionary attrilist stores the mapping from each attribute to its value. __init__() is the constructor. In __attributes, the leading double underscore marks the variable as private: Python mangles its name, so it is not meant to be accessed directly from outside the class. Creating an instance only takes a = UserInfo(); no parameters are required.
class UserInfo(object):
    'Class to store user information'

    def __init__(self):
        self.attrilist = {}
        self.__attributes = []

    def updateAttributes(self, attributes):
        # Remember the attribute names taken from the first row.
        self.__attributes = attributes

    def updatePairs(self, values):
        # Pair each value with the attribute at the same position.
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]
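As a quick check of how the class behaves, the following sketch (written for Python 3) feeds it one header row and one data row. The sample attribute names and values are made up for illustration. It also suggests that the double underscore does not make the list truly private: Python only stores it under the mangled name _UserInfo__attributes.

```python
class UserInfo(object):
    """Class to store user information."""

    def __init__(self):
        self.attrilist = {}
        self.__attributes = []

    def updateAttributes(self, attributes):
        self.__attributes = attributes

    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]


# Hypothetical header row and data row, for illustration only.
a = UserInfo()
a.updateAttributes(['name', 'age'])
a.updatePairs(['Zhang San', '25'])
print(a.attrilist)

# Name mangling: outside the class, the plain name does not exist,
# but the prefixed name does.
print(hasattr(a, '__attributes'))
print(a._UserInfo__attributes)
```

Because updatePairs() overwrites the same dictionary on every call, only one row's worth of data is held at a time, which is what the memory requirement asks for.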
2. Use the generator to dynamically update each object and return the object
A generator is like a function that is set up once and can then be resumed many times, producing one result each time it is resumed. An ordinary function uses return to hand back its result, while a generator uses yield. Each time execution reaches yield, the generator pauses and hands back a value; the next run resumes right after that yield. For example, we implement the Fibonacci series first as a function and then as a generator:
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        print(b)
        a, b = b, a + b
        n = n + 1
    return 'done'
We calculate the first six numbers of the series:
>>> fib(6)
1
1
2
3
5
8
'done'
To turn this into a generator, simply change print(b) to yield b, as follows:
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
Usage:
>>> f = fib(6)
>>> f
<generator object fib at 0x104feaaa0>
>>> for i in f:
...     print(i)
...
1
1
2
3
5
8
>>>
As you can see, the generator returned by fib is itself an object. Each time execution reaches yield, it pauses and returns one result, and the next run continues from the line of code after yield. A generator can also be advanced one step at a time with next(generator).
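The pause-and-resume behaviour can be seen by stepping through the generator by hand. This is a minimal sketch for Python 3, using the built-in next() function; the parameter name max_count is chosen here only to avoid shadowing the built-in max.

```python
def fib(max_count):
    """Yield the first max_count Fibonacci numbers, one at a time."""
    n, a, b = 0, 0, 1
    while n < max_count:
        yield b
        a, b = b, a + b
        n = n + 1


f = fib(3)
first = next(f)    # runs until the first yield
second = next(f)   # resumes right after that yield
third = next(f)
print(first, second, third)

exhausted = False
try:
    next(f)        # the loop has ended, so nothing is left to yield
except StopIteration:
    exhausted = True
    print('generator is exhausted')
```

A for loop does the same thing internally and stops quietly when StopIteration is raised.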
In my program, part of the generator code is as follows:
def ObjectGenerator(maxlinenum):
    filename = '/home/thinkit/Documents/usr_info/USER.csv'
    attributes = []
    linenum = 1
    a = UserInfo()
    # The document is gb2312-encoded, so decode it while reading.
    with open(filename, encoding='gb2312') as file:
        while linenum < maxlinenum:
            values = []
            line = file.readline()
            if line == '':
                print('Reading failed! Please check the filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()        # surrounding whitespace
                item = item.strip('"')     # double quotation marks
                item = item.strip("'")     # single quotation marks
                item = item.lstrip('+0')   # leading + and leading zeros
                item = catchTime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to `a` to yield the object itself
            linenum = linenum + 1
Here, a = UserInfo() creates an instance of the class UserInfo. Because the document is encoded in gb2312, it is decoded accordingly when it is read. Since the first row holds the attributes, updateAttributes() stores the attribute list in the UserInfo instance; every following row is read into a dictionary as attribute-value pairs. (P.S. A dictionary in Python is equivalent to a map in other languages.)
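For comparison, the standard library's csv module can do the same header-to-dictionary mapping lazily. This is a different technique from the hand-written generator above, not the author's method. The function name iter_rows and the sample data are made up for illustration, and an in-memory io.StringIO stands in for the real file.

```python
import csv
import io


def iter_rows(fileobj):
    """Yield one dict per data row; the first row supplies the keys."""
    reader = csv.DictReader(fileobj, skipinitialspace=True)
    for row in reader:
        # Only the current row is held in memory at this point.
        yield dict(row)


# Hypothetical sample document, for illustration only.
sample = io.StringIO('name,age\n"Zhang San", 25\nLi Si,30\n')
rows = list(iter_rows(sample))   # list() is used here only to show the result
print(rows)
```

With a real gb2312 file, the file object would presumably come from open(filename, encoding='gb2312', newline=''). Note that csv only removes double quotation marks by default, so single quotes and leading zeros would still need the cleanup described in the next section.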
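3. Remove unnecessary characters using strip
The cleanup step uses the strip family of string methods. str.strip(chars) removes the given characters from both ends of a string, and str.lstrip(chars) removes them from the left end only. chars is a set of individual characters, not a regular expression, so '+0*' does not mean "a plus sign followed by any number of zeros"; strip('+0*') removes every +, 0, and * from both ends, which would turn 100 into 1. Called with no argument, strip() removes whitespace. The calls are:
item = item.strip()        # remove leading and trailing whitespace such as \t and \n
item = item.strip('"')     # remove double quotation marks
item = item.strip("'")     # remove single quotation marks
item = item.lstrip('+0')   # remove a leading + and leading zeros, e.g. +000000001.24 -> 1.24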
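The character-set behaviour of strip is easy to check directly. The sketch below (Python 3) prints what strip('+0*') does to a few strings and contrasts it with a regular-expression helper that only touches the front of the string. The helper name strip_leading_zeros is hypothetical and is offered as one possible way to handle edge cases such as 0.5, not as the author's method.

```python
import re


def strip_leading_zeros(text):
    """Remove an optional leading '+' and leading zeros before a digit."""
    return re.sub(r'^\+?0*(?=\d)', '', text)


# strip() treats its argument as a set of characters, not a pattern.
plain = '+000000001.24'.strip('+0*')    # leading '+' and zeros removed
trailing = '100'.strip('+0*')           # trailing zeros are removed too
mixed = '*0+1.5+0*'.strip('+0*')        # any mix of the three characters
print(plain, trailing, mixed)

# The regex helper leaves trailing zeros and a lone leading zero alone.
print(strip_leading_zeros('+000000001.24'))
print(strip_leading_zeros('100'))
print(strip_leading_zeros('0.5'))
```

The lookahead (?=\d) keeps at least one digit in place, so a value such as 0.5 or 000 is not reduced to an empty or malformed string.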
4. Match strings with re.match
Function syntax:
re.match(pattern, string, flags=0)
Function parameters:
Parameter    Description
pattern      The regular expression to match
string       The string to be matched
flags        Flags that control how the regular expression is matched, such as case-insensitive or multi-line matching
If re.match succeeds, it returns a match object; otherwise it returns None. Note that re.match only matches at the beginning of the string.
>>> import re
>>> s = '2017-09-18'
>>> matchObj = re.match(r'\d{4}-\d{2}-\d{2}', s, flags=0)
>>> print(matchObj)
<re.Match object; span=(0, 10), match='2017-09-18'>
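The difference between a successful and a failed match can be sketched as follows (Python 3). The second input string is made up to show that re.match anchors at the start of the string, while re.search scans the whole string.

```python
import re

pattern = r'\d{4}-\d{2}-\d{2}'

hit = re.match(pattern, '2017-09-18')
miss = re.match(pattern, 'date: 2017-09-18')    # pattern is not at the start
found = re.search(pattern, 'date: 2017-09-18')  # search scans the whole string

print(hit.group())
print(miss)
print(found.group())
```

Since a failed match returns None, the result should be checked before calling group() on it.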
5. Use time.strptime to extract the string and convert it into a time object
In the time module, time.strptime(str, format) converts the string str into a time object according to format. Common format codes include:
%y two-digit year (00-99)
%Y four-digit year, including the century
%m month (01-12)
%d day of the month (01-31)
%H hour in 24-hour format (00-23)
%I hour in 12-hour format (01-12)
%M minute (00-59)
%S second (00-59)
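As a small illustration of these format codes (Python 3), the sketch below parses one sample string, made up for this example, and reads the fields back from the resulting time.struct_time.

```python
import time

# Hypothetical sample value in the YYYY/MM/DD H:M:S form.
t = time.strptime('2017/09/18 2:23:56', '%Y/%m/%d %H:%M:%S')

print(t.tm_year, t.tm_mon, t.tm_mday)
print(t.tm_hour, t.tm_min, t.tm_sec)

# strftime goes the other way, from a time object back to a string.
formatted = time.strftime('%Y-%m-%d %H:%M:%S', t)
print(formatted)
```

If the string does not fit the format, time.strptime raises ValueError, which is why the article first checks the string with a regular expression.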
In addition, the re module is used to match the string against a regular expression to check whether it is in a common time format, such as YYYY/MM/DD H:M:S or YYYY-MM-DD.
In the code above, the catchTime function determines whether an item is a time string. If it is, the item is converted to a time object; otherwise it is returned unchanged.
The code is as follows:
import time
import re

def catchTime(item):
    # Check whether the item is a time string.
    matchObj = re.match(r'\d{4}-\d{2}-\d{2}', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y-%m-%d')
        # print("returned time: %s" % item)
        return item
    else:
        matchObj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+', item, flags=0)
        if matchObj is not None:
            item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
            # print("returned time: %s" % item)
        # Items that are not times are returned unchanged.
        return item
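An alternative way to get the same effect, sketched here for Python 3, is to skip the regular expression and simply try each known format in turn, relying on the ValueError that time.strptime raises on a mismatch. The names TIME_FORMATS and parse_time_or_keep are hypothetical, and this is a variation for comparison rather than the article's method.

```python
import time

# Hypothetical list of accepted formats; extend it as needed.
TIME_FORMATS = ('%Y-%m-%d', '%Y/%m/%d %H:%M:%S')


def parse_time_or_keep(item):
    """Return a struct_time if item fits a known format, else item itself."""
    for fmt in TIME_FORMATS:
        try:
            return time.strptime(item, fmt)
        except ValueError:
            continue   # not this format, try the next one
    return item


parsed = parse_time_or_keep('2017-09-18')
kept = parse_time_or_keep('Zhang San')
print(parsed.tm_year)
print(kept)
```

Supporting another document layout then only means adding one more entry to the format list.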
Complete code:
import collections
import time
import re


class UserInfo(object):
    'Class to store user information'

    def __init__(self):
        self.attrilist = collections.OrderedDict()  # ordered
        self.__attributes = []

    def updateAttributes(self, attributes):
        self.__attributes = attributes

    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]


def catchTime(item):
    # Check whether the item is a time string.
    matchObj = re.match(r'\d{4}-\d{2}-\d{2}', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y-%m-%d')
        # print("returned time: %s" % item)
        return item
    else:
        matchObj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+', item, flags=0)
        if matchObj is not None:
            item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
            # print("returned time: %s" % item)
        # Items that are not times are returned unchanged.
        return item


def ObjectGenerator(maxlinenum):
    filename = '/home/thinkit/Documents/usr_info/USER.csv'
    attributes = []
    linenum = 1
    a = UserInfo()
    # The document is gb2312-encoded, so decode it while reading.
    with open(filename, encoding='gb2312') as file:
        while linenum < maxlinenum:
            values = []
            line = file.readline()
            if line == '':
                print('Reading failed! Please check the filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()        # surrounding whitespace
                item = item.strip('"')     # double quotation marks
                item = item.strip("'")     # single quotation marks
                item = item.lstrip('+0')   # leading + and leading zeros
                item = catchTime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to `a` to yield the object itself
            linenum = linenum + 1


if __name__ == '__main__':
    for n in ObjectGenerator(10):
        print(n)  # output the dictionary to check that it is correct
Summary
That is all for this article. I hope it helps you in your study or work. If you have any questions, please leave a message.