This article walks through a news aggregation project written in Python and points out what to watch for when building one. The following is a working example; let's take a look.
First comes the code, followed by an analysis of each part:
from nntplib import NNTP
from time import strftime, time, localtime
from email import message_from_string
from urllib import urlopen
import textwrap
import re

day = 24 * 60 * 60  # number of seconds in one day

def wrap(string, max=70):
    '''Wrap a string to a maximum line width.'''
    return '\n'.join(textwrap.wrap(string, max)) + '\n'

class NewsAgent:
    '''Distributes news items from news sources to news destinations.'''
    def __init__(self):
        self.sources = []
        self.destinations = []

    def addSource(self, source):
        self.sources.append(source)

    def addDestination(self, dest):
        self.destinations.append(dest)

    def distribute(self):
        # Collect the items from every source, then pass them to every destination.
        items = []
        for source in self.sources:
            items.extend(source.getItems())
        for dest in self.destinations:
            dest.receiveItems(items)

class NewsItem:
    def __init__(self, title, body):
        self.title = title
        self.body = body

class NNTPSource:
    def __init__(self, servername, group, window):
        self.servername = servername
        self.group = group
        self.window = window  # how many days back to look

    def getItems(self):
        # Start of the time window, formatted as the date and hour strings newnews expects.
        start = localtime(time() - self.window * day)
        date = strftime('%y%m%d', start)
        hour = strftime('%H%M%S', start)
        server = NNTP(self.servername)
        ids = server.newnews(self.group, date, hour)[1]
        for id in ids:
            lines = server.article(id)[3]
            message = message_from_string('\n'.join(lines))
            title = message['subject']
            body = message.get_payload()
            if message.is_multipart():
                body = body[0]
            yield NewsItem(title, body)
        server.quit()

class SimpleWebSource:
    def __init__(self, url, titlePattern, bodyPattern):
        self.url = url
        self.titlePattern = re.compile(titlePattern)
        self.bodyPattern = re.compile(bodyPattern)

    def getItems(self):
        text = urlopen(self.url).read()
        titles = self.titlePattern.findall(text)
        bodies = self.bodyPattern.findall(text)
        for title, body in zip(titles, bodies):
            yield NewsItem(title, wrap(body))

class PlainDestination:
    def receiveItems(self, items):
        for item in items:
            print item.title
            print '-' * len(item.title)
            print item.body

class HTMLDestination:
    def __init__(self, filename):
        self.filename = filename

    def receiveItems(self, items):
        out = open(self.filename, 'w')
        print >> out, '
Looking at the program as a whole, the central piece is NewsAgent. Its job is to store the news sources and the destinations and then, in distribute(), to call each source (NNTPSource and SimpleWebSource) to fetch items and each class that writes the news out (PlainDestination and HTMLDestination). NNTPSource is dedicated to retrieving messages from a news (NNTP) server, while SimpleWebSource extracts data from the page at a given URL. The roles of PlainDestination and HTMLDestination are straightforward: the former prints the retrieved content to the terminal, and the latter writes it to an HTML file.
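As a rough illustration of what SimpleWebSource does once the page text is in hand, the sketch below applies two regular expressions to an inline HTML string instead of a downloaded page. It is written in Python 3 rather than the Python 2 of the listing; the sample markup, the two patterns, and the helper name extract_items are assumptions made for the example and are not part of the original program.

```python
import re
import textwrap

# Hypothetical page content standing in for the text fetched from a URL.
SAMPLE_HTML = """
<h2>First headline</h2>
<p>Body of the first story.</p>
<h2>Second headline</h2>
<p>Body of the second story.</p>
"""

def wrap(string, width=70):
    """Wrap a string to a maximum line width, ending with a newline."""
    return '\n'.join(textwrap.wrap(string, width)) + '\n'

def extract_items(text, title_pattern, body_pattern):
    """Pair each title match with the body match at the same position."""
    titles = re.findall(title_pattern, text)
    bodies = re.findall(body_pattern, text)
    return [(title, wrap(body)) for title, body in zip(titles, bodies)]

# The two patterns are illustrative; a real page needs patterns written for its markup.
items = extract_items(SAMPLE_HTML, r'<h2>(.*?)</h2>', r'<p>(.*?)</p>')
for title, body in items:
    print(title)
    print('-' * len(title))
    print(body)
```

Pairing by position only works when the title pattern and the body pattern match the same number of times; zip silently drops any unmatched extras, so patterns like these have to be tuned to the page being read.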
With this analysis in mind, the main program is easy to follow: it simply adds news sources and output destinations to a NewsAgent.
It is a simple program, but it is clearly layered: the sources, the destinations, and the agent that connects them are kept separate.
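A minimal sketch of that wiring, written in Python 3, might look as follows. NewsItem and NewsAgent mirror the classes in the listing, while StaticSource and ListDestination are hypothetical stand-ins (not in the original) that replace the network sources and the file destination so the example runs offline.

```python
class NewsItem:
    """A simple news item consisting of a title and a body text."""
    def __init__(self, title, body):
        self.title = title
        self.body = body

class NewsAgent:
    """Collects items from every source and hands them to every destination."""
    def __init__(self):
        self.sources = []
        self.destinations = []

    def addSource(self, source):
        self.sources.append(source)

    def addDestination(self, dest):
        self.destinations.append(dest)

    def distribute(self):
        items = []
        for source in self.sources:
            items.extend(source.getItems())
        for dest in self.destinations:
            dest.receiveItems(items)

class StaticSource:
    """Hypothetical source that yields a fixed list of (title, body) pairs."""
    def __init__(self, entries):
        self.entries = entries

    def getItems(self):
        for title, body in self.entries:
            yield NewsItem(title, body)

class ListDestination:
    """Hypothetical destination that stores formatted items in a list."""
    def __init__(self):
        self.received = []

    def receiveItems(self, items):
        for item in items:
            self.received.append('%s: %s' % (item.title, item.body))

# The "main program": register sources and destinations, then distribute.
agent = NewsAgent()
agent.addSource(StaticSource([('Title A', 'Body A')]))
agent.addSource(StaticSource([('Title B', 'Body B')]))
destination = ListDestination()
agent.addDestination(destination)
agent.distribute()
print(destination.received)
```

Because the agent relies only on getItems and receiveItems, any object that provides those methods can be registered, which is what lets the original program mix an NNTP source with a web source and a terminal destination with an HTML one.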