Dynamically adding properties and generating objects in a Python class

Source: Internet
Author: User
Tags: generator
This article covers the following topics:

1. Main functions of the program

2. Implementation process

3. Definition of the class

4. Dynamically updating each object with a generator and returning the object

5. Using strip to remove unnecessary characters

6. Matching strings with re.match

7. Using time.strptime to convert strings into time objects

8. Complete code

Main functions of the program

Suppose there is a table-like document that stores user information. The first line lists the properties, separated by commas (,). From the second line on, each line holds the values for those properties, and each line represents one user. How can we read this document and output one user object per line?
There are also 4 small requirements:

Each document is large, so building a list of objects for all rows at once would exhaust memory. The program may hold only one row's object at a time.

Each comma-separated string may be wrapped in double quotation marks (") or single quotation marks ('), such as "Zhang San"; the quotation marks must be removed. Numbers may appear as +000000001.24; the leading + and zeros must be removed, leaving 1.24.

The document contains times, in forms such as 2013-10-29 or 2013/10/29 2:23:56. Such strings must be converted to a time type.

There are many documents of this kind, each with different properties: one holds user information, another holds call records. So the attributes of the class must be generated dynamically from the first line of the document.

Implementation process

1. Definition of class

Because both the properties and the property-value pairs are added dynamically, the class contains two member functions, updateAttributes() and updatePairs(). The property names are stored in the list __attributes, and the dictionary attrilist stores the property-value mappings. The __init__() function is the constructor. The double leading underscore in __attributes marks it as a private variable that cannot be accessed directly from outside the class (Python mangles the name to _UserInfo__attributes). The class is instantiated with no arguments: a = UserInfo().

    class UserInfo(object):
        'Class to restore UserInformation'

        def __init__(self):
            self.attrilist = {}        # property -> value mapping
            self.__attributes = []     # property names (private)

        def updateAttributes(self, attributes):
            self.__attributes = attributes

        def updatePairs(self, values):
            for i in range(len(values)):
                self.attrilist[self.__attributes[i]] = values[i]
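
A short usage sketch of the class, written for Python 3. The field names and values (name, age, Zhang San, 25) are made up for illustration. It also shows what the double underscore does in practice: the list is only name-mangled, so it is reachable as _UserInfo__attributes but not as __attributes.

```python
class UserInfo(object):
    'Class to restore UserInformation'

    def __init__(self):
        self.attrilist = {}
        self.__attributes = []

    def updateAttributes(self, attributes):
        self.__attributes = attributes

    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]


a = UserInfo()
a.updateAttributes(['name', 'age'])   # hypothetical header row
a.updatePairs(['Zhang San', '25'])    # hypothetical data row
print(a.attrilist)

# The "private" list is name-mangled rather than truly hidden.
try:
    a.__attributes
except AttributeError:
    print('a.__attributes is not accessible from outside')
print(a._UserInfo__attributes)
```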

2. Dynamically update each object with a generator and return the object

A generator is like a function that can be resumed repeatedly, producing one result on each iteration. An ordinary function hands back its result with return, whereas a generator hands back its results with yield. Each run stops at yield and returns a value, and the next run resumes right after that yield. For example, here is the Fibonacci sequence implemented first as a function and then as a generator:

    def fib(max):
        n, a, b = 0, 0, 1
        while n < max:
            print(b)
            a, b = b, a + b
            n = n + 1
        return 'done'

Calling it prints the first 6 numbers of the sequence:

    >>> fib(6)
    1
    1
    2
    3
    5
    8
    'done'

To turn this into a generator, just replace print with yield:

    def fib(max):
        n, a, b = 0, 0, 1
        while n < max:
            yield b
            a, b = b, a + b
            n = n + 1

How to use:

    >>> f = fib(6)
    >>> f
    <generator object fib at 0x104feaaa0>
    >>> for i in f:
    ...     print(i)
    ...
    1
    1
    2
    3
    5
    8
    >>>

As you can see, calling fib returns a generator object. Each time execution reaches yield it pauses and hands back one result, and the next iteration resumes from the line after yield. A generator can also be advanced manually with generator.next() (next(generator) in Python 3).
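
Stepping through a generator by hand makes the pause-and-resume behaviour visible. This is a minimal sketch using the built-in next() (Python 3 syntax) on the same fib generator; once the while loop ends, a further next() raises StopIteration.

```python
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1


f = fib(3)
first = next(f)    # runs until the first yield
second = next(f)   # resumes right after that yield
third = next(f)
print(first, second, third)

try:
    next(f)        # the loop has finished, nothing is left to yield
except StopIteration:
    print('generator exhausted')
```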

In my program, the generator code is as follows:

    def objectgenerator(maxlinenum):
        filename = '/home/thinkit/documents/usr_info/user.csv'
        attributes = []
        linenum = 1
        a = UserInfo()
        file = open(filename, 'rb')  # read bytes, decode each line below
        while linenum < maxlinenum:
            values = []
            line = file.readline().decode('gb2312')  # linecache.getline(filename, linenum, 'gb2312')
            if line == '':
                print('reading fail! Please check filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()
                item = item.strip('\"')
                item = item.strip('\'')
                item = item.lstrip('+0')
                item = catchtime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to 'a' to use
            linenum = linenum + 1

Here a = UserInfo() creates an instance of the UserInfo class. Because the document is GB2312 encoded, each line is decoded accordingly. Because the first row holds the property names, updateAttributes() saves them in the UserInfo object; each following row is read into a dictionary as property-value pairs. P.S. A Python dictionary is the equivalent of a map in other languages.
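
One consequence of keeping a single object alive is worth seeing in isolation: every yield hands back the same dictionary, updated in place. The sketch below is a simplified stand-in for the generator above, not the program itself; the function name rows and the sample data are made up, and it reads from an in-memory io.StringIO instead of a file. If a row has to outlive the next iteration, copy it.

```python
import io


def rows(lines):
    """Yield one dict per data row, reusing a single dict object."""
    header = None
    record = {}
    for line in lines:
        fields = [field.strip() for field in line.split(',')]
        if header is None:
            header = fields          # first line: property names
            continue
        for key, value in zip(header, fields):
            record[key] = value
        yield record                 # the same object every time


sample = io.StringIO(u'name,age\nZhang San,25\nLi Si,30\n')  # made-up data

kept = []
copies = []
for row in rows(sample):
    kept.append(row)           # stores a reference to the shared dict
    copies.append(dict(row))   # stores an independent snapshot

print(kept[0] is kept[1])      # both entries are the same object
print(copies)
```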

3. Use strip to remove unnecessary characters

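As the code above shows, str.strip(chars) removes the characters in chars from the beginning and end of str; with no argument it removes whitespace. chars is a set of individual characters, not a regular expression: every character listed is stripped wherever it appears at either end. lstrip() and rstrip() work the same way on the left or right end only. As above:

    item = item.strip()         # remove leading and trailing whitespace such as \t and \n
    item = item.strip('\"')     # remove surrounding "
    item = item.strip('\'')     # remove surrounding '
    item = item.lstrip('+0')    # remove the leading + and zeros, e.g. +00...00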
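
The difference between a character set and a pattern is easy to check. The sketch below compares strip and lstrip on a few values, then shows one possible regular-expression alternative for the zero-padded numbers. The helper name strip_plus_zeros is hypothetical; it is included because lstrip('+0') would turn the value 0 into an empty string, which the lookahead in the pattern avoids.

```python
import re

print('+000000001.24'.strip('+0*'))   # 1.24
print('100'.strip('+0*'))             # 1   (trailing zeros are lost too)
print('100'.lstrip('+0'))             # 100 (only the left end is touched)
print('+000000001.24'.lstrip('+0'))   # 1.24


def strip_plus_zeros(text):
    """Hypothetical helper: drop a leading '+' and zero padding,
    but always leave at least one digit in place."""
    return re.sub(r'^\+?0*(?=\d)', '', text)


print(strip_plus_zeros('+000000001.24'))  # 1.24
print(strip_plus_zeros('+0.5'))           # 0.5
print(strip_plus_zeros('0'))              # 0
```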

4. Matching strings with re.match

Function syntax:

    re.match(pattern, string, flags=0)

Function parameters:

    Parameter   Description
    pattern     The regular expression to match
    string      The string to match against
    flags       Flags that govern how the regular expression is matched, such as case sensitivity, multiline matching, and so on

If the match succeeds, re.match returns a match object; otherwise it returns None.

    >>> import re
    >>> s = '2015-09-18'
    >>> matchobj = re.match(r'\d{4}-\d{2}-\d{2}', s, flags=0)
    >>> print(matchobj)
    <_sre.SRE_Match object at 0x7f3525480f38>
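
A few more calls illustrate what re.match does and does not check. This is a small Python 3 sketch; the input strings are made up. Note that re.match only anchors at the start of the string, so trailing text is still accepted, while re.fullmatch (Python 3.4 and later) requires the whole string to match.

```python
import re

pattern = r'\d{4}-\d{2}-\d{2}'

m = re.match(pattern, '2015-09-18')
print(m.group())                             # the matched text

# re.match only tries at the start of the string.
print(re.match(pattern, 'on 2015-09-18'))    # None
print(re.search(pattern, 'on 2015-09-18').group())

# Trailing text does not prevent a match; re.fullmatch is stricter.
print(re.match(pattern, '2015-09-18 extra') is not None)
print(re.fullmatch(pattern, '2015-09-18 extra'))
```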

5. Using time.strptime to convert a string into a time object

In the time module, time.strptime(str, format) parses the string str according to format and returns a time object (a struct_time). Commonly used format codes are:

%y Two-digit year (00-99)

%Y Four-digit year (0000-9999)

%m Month (01-12)

%d Day of the month (01-31)

%H Hour on the 24-hour clock (00-23)

%I Hour on the 12-hour clock (01-12)

%M Minutes (00-59)

%S Seconds (00-59)
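
As a quick check of these codes, the sketch below parses the two sample strings from the requirements with time.strptime and reads fields back from the resulting struct_time. The third string is a made-up example of input that does not fit the format, which raises ValueError.

```python
import time

t1 = time.strptime('2013-10-29', '%Y-%m-%d')
t2 = time.strptime('2013/10/29 2:23:56', '%Y/%m/%d %H:%M:%S')

print(t1.tm_year, t1.tm_mon, t1.tm_mday)
print(t2.tm_hour, t2.tm_min, t2.tm_sec)

# strftime is the reverse operation: struct_time back to text.
formatted = time.strftime('%Y-%m-%d %H:%M:%S', t2)
print(formatted)

try:
    time.strptime('29/10/2013', '%Y-%m-%d')
    mismatch = False
except ValueError:
    mismatch = True
    print('format mismatch')
```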

In addition, the re module is used to check with regular expressions whether a string looks like one of the common time formats, such as YYYY/MM/DD H:M:S or YYYY-MM-DD.

In the code above, the function catchtime checks whether item is a time string and, if so, converts it to a time object; otherwise it returns item unchanged.

The code is as follows:

    import time
    import re

    def catchtime(item):
        # check if it's time
        matchobj = re.match(r'\d{4}-\d{2}-\d{2}', item, flags=0)
        if matchobj is not None:
            item = time.strptime(item, '%Y-%m-%d')
            #print "returned time: %s" % item
            return item
        else:
            matchobj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+', item, flags=0)
            if matchobj is not None:
                item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
                #print "returned time: %s" % item
            return item
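
A different way to reach the same result, shown here only as an alternative sketch and not as the method used in this article, is to skip the regular expressions and let time.strptime do the checking: try each known format and fall back to the original string when none fits. The names FORMATS and parse_time are hypothetical.

```python
import time

# The two layouts named in the requirements.
FORMATS = ('%Y-%m-%d', '%Y/%m/%d %H:%M:%S')


def parse_time(item):
    """Hypothetical helper: return a struct_time if item fits a known
    format, otherwise return item unchanged."""
    for fmt in FORMATS:
        try:
            return time.strptime(item, fmt)
        except ValueError:
            continue
    return item


print(parse_time('2013-10-29').tm_year)
print(parse_time('2013/10/29 2:23:56').tm_hour)
print(parse_time('Zhang San'))
```

Because strptime rejects leftover text, this variant also returns a string such as 2013-10-29 extra unchanged instead of raising an error.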

Full code:

    import collections
    import time
    import re

    class UserInfo(object):
        'Class to restore UserInformation'

        def __init__(self):
            self.attrilist = collections.OrderedDict()  # ordered
            self.__attributes = []

        def updateAttributes(self, attributes):
            self.__attributes = attributes

        def updatePairs(self, values):
            for i in range(len(values)):
                self.attrilist[self.__attributes[i]] = values[i]

    def catchtime(item):
        # check if it's time
        matchobj = re.match(r'\d{4}-\d{2}-\d{2}', item, flags=0)
        if matchobj is not None:
            item = time.strptime(item, '%Y-%m-%d')
            #print "returned time: %s" % item
            return item
        else:
            matchobj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+', item, flags=0)
            if matchobj is not None:
                item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
                #print "returned time: %s" % item
            return item

    def objectgenerator(maxlinenum):
        filename = '/home/thinkit/documents/usr_info/user.csv'
        attributes = []
        linenum = 1
        a = UserInfo()
        file = open(filename, 'rb')  # read bytes, decode each line below
        while linenum < maxlinenum:
            values = []
            line = file.readline().decode('gb2312')  # linecache.getline(filename, linenum, 'gb2312')
            if line == '':
                print('reading fail! Please check filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()
                item = item.strip('\"')
                item = item.strip('\'')
                item = item.lstrip('+0')
                item = catchtime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to 'a' to use
            linenum = linenum + 1

    if __name__ == '__main__':
        maxlinenum = 10  # example value; raise it to read more lines
        for n in objectgenerator(maxlinenum):
            print(n)  # print the dictionary to check that it is correct
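
For comparison, Python's standard csv module can do the splitting and double-quote removal on its own. The sketch below is an alternative to the hand-written parsing above, not part of the original program: csv.DictReader takes the property names from the first line and yields one mapping per row, lazily. The sample data is made up and read from an in-memory io.StringIO. Number and time clean-up would still have to be applied to each value afterwards.

```python
import csv
import io

# Made-up sample: a header line and one data row.
sample = io.StringIO(
    u'name,balance,joined\n'
    u'"Zhang San",+000000001.24,2013-10-29\n'
)

last = None
for row in csv.DictReader(sample):
    last = dict(row)    # one row at a time; keys come from the header line
    print(last)
```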

Summary

That is the whole content of this article. I hope it is of some help in your study or work. If you have questions, feel free to leave a message.
