Add attributes and generate objects dynamically in Python classes

Source: Internet
Author: User

This article works through the problem step by step, covering the following topics:

1. Main functions of the program

2. Implementation process

3. Class definition

4. Use a generator to dynamically update each object and return it

5. Remove unnecessary characters using strip

6. Match strings with re.match

7. Use time.strptime to parse strings into time objects

8. Complete code

Main functions of the program

Suppose a document stores user information as a table. The first row lists the attributes, separated by commas (,). From the second row on, each row holds the values for those attributes and represents one user. How can we read this document and produce one user object per row?
There are also four requirements:

Each document is very large. If the objects generated from all rows were kept in a list at once, the program would run out of memory. The program may hold only the object generated from one row at a time.

Each comma-separated string may be wrapped in double quotation marks (") or single quotation marks ('), for example "Zhang San". The quotation marks must be removed. If a value is a number padded with a plus sign and leading zeros, such as +000000001.24, remove the + and the leading zeros to extract 1.24.

The document contains times, in a format such as YYYY-MM-DD or YYYY/MM/DD H:M:S (for example, a time of day of 2:23:56). These strings must be converted to a time type.

There are many such documents with different attributes: one may hold user information, another call records. So the attributes of the class must be generated dynamically from the first line of the document.

Implementation Process

1. Class Definition

Because attributes are added dynamically, the attribute-value pairs are added dynamically too, so the class needs two member functions: updateAttributes() and updatePairs(). The list __attributes stores the attribute names, and the dictionary attrilist stores the mapping from attribute to value. __init__() is the constructor. The double leading underscore in __attributes marks it as a private variable: Python mangles the name, so it cannot be accessed directly from outside the class. Creating an instance requires no parameters: a = UserInfo().

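class UserInfo(object):
    'Class to store user information'

    def __init__(self):
        self.attrilist = {}       # attribute name -> value for the current row
        self.__attributes = []    # attribute names from the header row

    def updateAttributes(self, attributes):
        self.__attributes = attributes

    def updatePairs(self, values):
        # Pair each value with the attribute name at the same position.
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]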
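To make the behavior of this class concrete, here is a minimal usage sketch for Python 3. The header and row values (name, age, and so on) are made up for illustration, and the class body repeats the definition above only so that the snippet runs on its own.

```python
class UserInfo(object):
    """Store one row of user information as attribute-value pairs."""

    def __init__(self):
        self.attrilist = {}      # attribute name -> value for the current row
        self.__attributes = []   # attribute names taken from the header row

    def updateAttributes(self, attributes):
        self.__attributes = attributes

    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]


a = UserInfo()                        # no constructor arguments
a.updateAttributes(['name', 'age'])   # hypothetical header row
a.updatePairs(['Zhang San', '25'])    # hypothetical data row
print(a.attrilist)                    # {'name': 'Zhang San', 'age': '25'}

# The double underscore triggers name mangling, so the plain name is
# not reachable from outside the class.
print(hasattr(a, '__attributes'))            # False
print(hasattr(a, '_UserInfo__attributes'))   # True
```

The mangled name shows that "private" here is a naming convention enforced by the compiler rather than real access control.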

2. Use the generator to dynamically update each object and return the object

A generator is like a function that is initialized once and then produces one result per iteration. An ordinary function hands back its result with return, while a generator hands back results with yield. Each time execution reaches yield, the generator pauses and returns a value; the next iteration resumes right after that yield. For example, here is the Fibonacci sequence implemented first as a function and then as a generator:

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        print(b)
        a, b = b, a + b
        n = n + 1
    return 'done'

We calculate the first six numbers of the series:

>>> fib(6)
1
1
2
3
5
8
'done'

To turn this into a generator, you only need to change print to yield, as follows:

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1

Usage:

>>> f = fib(6)
>>> f
<generator object fib at 0x104feaaa0>
>>> for i in f:
...     print(i)
...
1
1
2
3
5
8
>>>

As you can see, calling fib returns a generator object. Each time execution reaches yield, the generator pauses and hands back a result, and the next iteration continues from the line after yield. A generator can also be advanced manually with generator.next() or with the built-in next(generator), which is the only form available in Python 3.
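The pause-and-resume behavior can be observed by advancing the generator by hand. This is a small Python 3 sketch that reuses the fib generator from above and uses the built-in next(); the variable names are only for illustration.

```python
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1


f = fib(3)
first = next(f)    # runs until the first yield
second = next(f)   # resumes after the yield, loops, yields again
third = next(f)
print(first, second, third)   # 1 1 2

try:
    next(f)        # the loop condition is now false, so the generator ends
except StopIteration:
    print('generator exhausted')
```

A for loop does the same thing internally: it calls next() repeatedly and stops quietly when StopIteration is raised.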

In my program, the main part of the generator looks like this:

import io
import re

def ObjectGenerator(maxlinenum):
    filename = '/home/thinkit/Documents/usr_info/USER.csv'
    attributes = []
    linenum = 1
    a = UserInfo()
    # io.open decodes the gb2312-encoded file while reading.
    with io.open(filename, encoding='gb2312') as file:
        while linenum < maxlinenum:
            values = []
            line = file.readline()
            if line == '':
                # readline() returns '' at the end of the file
                print('reading fail! Please check filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()
                item = item.strip('\"')
                item = item.strip('\'')
                # strip('+0*') would also eat trailing zeros, so use a regex
                item = re.sub(r'^\+?0*(\d+(?:\.\d+)?)$', r'\1', item)
                item = catchTime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to 'a' to yield the object itself
            linenum = linenum + 1

Here, a = UserInfo() creates an instance of the class UserInfo. Because the document is gb2312-encoded, each line is decoded as gb2312 when it is read. Since the first row holds the attributes, updateAttributes() stores the attribute list in the UserInfo instance; each following row is read into a dictionary of attribute-value pairs. (P.S. A Python dictionary is equivalent to a map.)
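The core of the row-by-row pattern can be tried without a file on disk. The following is a simplified Python 3 sketch, not the program above: the function name row_dicts and the sample data are hypothetical, io.StringIO stands in for the real CSV file, and the cleaning steps are left out. Unlike the code above, which updates and yields the same dictionary each time, this sketch yields a fresh dictionary per row.

```python
import io


def row_dicts(lines):
    """Yield one dict per data row, keyed by the header row."""
    header = None
    for line in lines:                       # reads one line at a time
        fields = [f.strip() for f in line.split(',')]
        if header is None:
            header = fields                  # first row: attribute names
        else:
            yield dict(zip(header, fields))  # fresh dict for every row


# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO("name,age\nZhang San,25\nLi Si,31\n")
rows = row_dicts(sample)
print(next(rows))   # {'name': 'Zhang San', 'age': '25'}
print(next(rows))   # {'name': 'Li Si', 'age': '31'}
```

Yielding the same dictionary every time keeps memory use lowest, but a caller who stores the yielded values would end up with many references to one object. Yielding a fresh dictionary avoids that at the cost of one small allocation per row.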

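3. Remove unnecessary characters using strip

From the code above, we can see that str.strip(chars) removes leading and trailing characters. Called without arguments, it removes whitespace such as \t and \n. Note that chars is a set of characters, not a regular expression: strip('+0*') removes any +, 0, or * from both ends of the string, so it would also turn 1.20 into 1.2. To remove only a leading plus sign and leading zeros from a number, a regular-expression substitution is safer:

item = item.strip()      # remove leading and trailing whitespace such as \t and \n
item = item.strip('\"')  # remove "
item = item.strip('\'')  # remove '
# remove a leading + and leading zeros from a number: +000000001.24 -> 1.24
item = re.sub(r'^\+?0*(\d+(?:\.\d+)?)$', r'\1', item)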
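The difference between a character set and a pattern is easy to check in a Python 3 session. The sketch below compares strip('+0*') with a regular-expression substitution; the helper name strip_number_padding is hypothetical and only one possible way to write the cleanup.

```python
import re

# strip() treats its argument as a set of characters, not as a pattern.
print('+000000001.24'.strip('+0*'))   # 1.24
print('1.20'.strip('+0*'))            # 1.2  (trailing zero lost)
print('100'.strip('+0*'))             # 1    (value changed)


def strip_number_padding(text):
    """Drop a leading '+' and leading zeros if text is a plain number."""
    return re.sub(r'^\+?0*(\d+(?:\.\d+)?)$', r'\1', text)


print(strip_number_padding('+000000001.24'))   # 1.24
print(strip_number_padding('1.20'))            # 1.20
print(strip_number_padding('100'))             # 100
print(strip_number_padding('2017-09-18'))      # 2017-09-18 (not a number)
```

Because the pattern must match the whole string, values that are not plain numbers, such as dates or names, pass through unchanged.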

4. Match strings with re.match

Function syntax:

re.match(pattern, string, flags=0)

Function parameters:

Parameter   Description
pattern     The regular expression to match
string      The string to be matched
flags       Flags that control the matching mode, such as whether matching is case-sensitive or multi-line

If re.match succeeds, it returns a match object; otherwise it returns None. Note that re.match only tries to match at the beginning of the string.

>>> import re
>>> s = '2017-09-18'
>>> matchObj = re.match(r'\d{4}-\d{2}-\d{2}', s, flags=0)
>>> print(matchObj)
<_sre.SRE_Match object at 0x7f3525480f38>
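Since re.match anchors only at the start of the string, it helps to see what it accepts and rejects. This is a short Python 3 sketch with made-up input strings; re.search is shown only for contrast and is not used elsewhere in this article.

```python
import re

pattern = r'\d{4}-\d{2}-\d{2}'

print(re.match(pattern, '2017-09-18') is not None)          # True
print(re.match(pattern, 'date: 2017-09-18') is not None)    # False: not at the start
print(re.search(pattern, 'date: 2017-09-18') is not None)   # True: search scans the string
print(re.match(pattern, '2017-09-18abc') is not None)       # True: trailing text is allowed

m = re.match(pattern, '2017-09-18abc')
print(m.group(0))   # 2017-09-18
```

The fourth case matters for the time conversion later: a match at the start does not guarantee that the whole string is a date, so ending the pattern with $ is a reasonable safeguard.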

5. Use time.strptime to parse the string and convert it into a time object

In the time module, time.strptime(str, format) parses str according to format and returns a time object. Common format codes include:

%y  two-digit year (00-99)
%Y  four-digit year (for example, 2017)
%m  month (01-12)
%d  day of the month (01-31)
%H  hour in 24-hour format (00-23)
%I  hour in 12-hour format (01-12)
%M  minute (00-59)
%S  second (00-59)
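A few calls make the format codes concrete. This is a minimal Python 3 sketch with made-up time strings; it also shows time.strftime, the inverse operation, which is not otherwise used in this article.

```python
import time

t = time.strptime('2017/09/18 2:23:56', '%Y/%m/%d %H:%M:%S')
print(t.tm_year, t.tm_mon, t.tm_mday)   # 2017 9 18
print(t.tm_hour, t.tm_min, t.tm_sec)    # 2 23 56

# strftime goes the other way: time object -> string.
print(time.strftime('%Y-%m-%d', t))     # 2017-09-18

try:
    time.strptime('2017-09-18', '%Y/%m/%d')   # separators do not match
except ValueError as err:
    print('cannot parse:', err)
```

Because strptime raises ValueError when the string does not fit the format, it is useful to check the shape of the string first or to catch the exception.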

In addition, the re module is used to check with a regular expression whether a string is in a common time format, such as YYYY/MM/DD H:M:S or YYYY-MM-DD.

In the generator code, the catchTime function determines whether an item is a time string. If it is, the item is converted to a time object.

The code is as follows:

import time
import re

def catchTime(item):
    # Check whether item is a time string; if so, convert it.
    # The trailing $ rejects strings with extra text after the time.
    matchObj = re.match(r'\d{4}-\d{2}-\d{2}$', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y-%m-%d')
        return item
    matchObj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+$', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
    return item
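If more time formats need to be supported, the same check-then-parse logic can be driven by a table. The following Python 3 sketch is a variant, not the article's code: the names TIME_FORMATS and catch_time are hypothetical, the patterns are assumed to end with $, and strings that look like a time but hold invalid values are returned unchanged.

```python
import re
import time

# (regular expression, strptime format) pairs; extend as needed.
TIME_FORMATS = [
    (r'\d{4}-\d{2}-\d{2}$', '%Y-%m-%d'),
    (r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+$', '%Y/%m/%d %H:%M:%S'),
]


def catch_time(item):
    """Return a time.struct_time if item looks like a time, else item."""
    for pattern, fmt in TIME_FORMATS:
        if re.match(pattern, item):
            try:
                return time.strptime(item, fmt)
            except ValueError:
                # Shape matched but the value is invalid, e.g. month 13.
                return item
    return item


print(catch_time('Zhang San'))                    # Zhang San
print(catch_time('2017-09-18').tm_mday)           # 18
print(catch_time('2017/09/18 2:23:56').tm_hour)   # 2
print(catch_time('2017-13-40'))                   # 2017-13-40
```

Adding a format then means adding one pair to the list rather than another nested if branch.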

Complete code:

import collections
import io
import re
import time

class UserInfo(object):
    'Class to store user information'

    def __init__(self):
        self.attrilist = collections.OrderedDict()  # ordered
        self.__attributes = []

    def updateAttributes(self, attributes):
        self.__attributes = attributes

    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]

def catchTime(item):
    # Check whether item is a time string; if so, convert it.
    matchObj = re.match(r'\d{4}-\d{2}-\d{2}$', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y-%m-%d')
        return item
    matchObj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+$', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
    return item

def ObjectGenerator(maxlinenum):
    filename = '/home/thinkit/Documents/usr_info/USER.csv'
    attributes = []
    linenum = 1
    a = UserInfo()
    # io.open decodes the gb2312-encoded file while reading.
    with io.open(filename, encoding='gb2312') as file:
        while linenum < maxlinenum:
            values = []
            line = file.readline()
            if line == '':
                # readline() returns '' at the end of the file
                print('reading fail! Please check filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()
                item = item.strip('\"')
                item = item.strip('\'')
                # strip('+0*') would also eat trailing zeros, so use a regex
                item = re.sub(r'^\+?0*(\d+(?:\.\d+)?)$', r'\1', item)
                item = catchTime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to 'a' to yield the object itself
            linenum = linenum + 1

if __name__ == '__main__':
    for n in ObjectGenerator(10):
        print(n)  # output the dictionary to check that it is correct
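For comparison, Python's standard library offers csv.DictReader, which also takes its keys from the first row and produces one dictionary per row as you iterate. The sketch below is an alternative technique for Python 3, not the approach used in this article; the sample data is hypothetical, and the cleaning of numbers and times would still have to be applied to each value.

```python
import csv
import io

# Hypothetical sample; for a real file use
# open(path, encoding='gb2312', newline='') instead of io.StringIO.
sample = io.StringIO('name,balance\n"Zhang San",+000000001.24\n')

reader = csv.DictReader(sample)   # header row becomes the dict keys
for row in reader:                # rows are read lazily, one at a time
    print(row['name'], row['balance'])   # Zhang San +000000001.24
```

The csv module removes the surrounding double quotation marks itself and also copes with commas inside quoted fields, which a plain line.split(',') does not.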

Summary

That is all for this article. I hope it helps you in your study or work.
