In ETL scenarios, if an interface file has not been provided yet, the task loops and waits until the peer platform delivers it, which consumes a lot of system resources. To avoid this, I came up with a method that fetches a platform's files in one pass. The idea is as follows:
1. On the first run, fetch all interface files for the given date from the directory provided by the peer platform, and save the file list;
2. Afterwards, restart the fetch task every n minutes. Each run gets the file list again and compares it with the previous one, and files are re-fetched when either of the following occurs:
A. a new file has appeared;
B. a file's size has changed.
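The comparison in step 2 can be sketched on its own before looking at the full script. This is a minimal illustration, not the author's code: the function name `diff_file_lists` and the sample file names are made up, and each list is assumed to be a dictionary mapping file name to size.

```python
def diff_file_lists(last, new):
    """Return the names that must be fetched again, sorted.

    `last` and `new` map file name -> size (hypothetical sample data below).
    """
    # Case A: names present now that were not in the previous list
    added = set(new) - set(last)
    # Case B: names present in both lists whose size has changed
    resized = {name for name in set(new) & set(last) if new[name] != last[name]}
    return sorted(added | resized)


last_list = {"a_20150520.dat": "100", "b_20150520.dat": "200"}
new_list = {"a_20150520.dat": "100", "b_20150520.dat": "250", "c_20150520.dat": "10"}
print(diff_file_lists(last_list, new_list))
# prints ['b_20150520.dat', 'c_20150520.dat']
```

Files that have disappeared from the remote directory are ignored in this sketch, since only new or changed files need to be downloaded.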
The implementation is as follows. First the configuration file, then the script:
[ftp.properties]
ipaddress = 10.25.xxx.xxx
username = xxxxx
password = xxxxx
# When encryption has a value, the password is decrypted with it
encryption =
# When resolve is false, the parameters in the remote and local directory paths need to be replaced
resolve = 1
remoteDir = /bosscdr/tobak/jf_bass
localDir = /interface/cyg/[SDT_YYYYMMDD]
# File list saved by the previous fetch
lastFileList = /interface/cyg/lastfilelist.txt
# -*- coding: utf-8 -*-
"""
Function: fetch remote files
Written: 2015-5-5
Author: Chenyangang
----------------------------------------
Implementation:
1. Read the FTP server settings from the configuration file (the password may be encrypted)
2. Get the list of files in the remote directory; if a saved list exists, compare the two and extract the files that differ
3. Download the files that differ
"""
import configparser
import datetime
import ftplib
import os
import pickle


class GetDataBaseDiff(object):

    def __init__(self, config, interfaceID=None, interfaceDate=None, delay=0):
        self.config = config
        self.interfaceID = interfaceID
        # The default is today's date minus `delay` days
        if interfaceDate is None:
            interfaceDate = (datetime.date.today() - datetime.timedelta(delay)).strftime("%Y%m%d")
        self.interfaceDate = interfaceDate

    def getConfig(self, interfaceDate):
        readConfig = configparser.ConfigParser()
        with open(self.config, 'r') as configFile:
            readConfig.read_file(configFile)
        section = 'ftp.properties'
        hostaddr = readConfig.get(section, 'ipaddress')
        username = readConfig.get(section, 'username')
        # Whether to resolve date parameters and decrypt the password
        resolve = readConfig.get(section, 'resolve')
        encryption = readConfig.get(section, 'encryption')
        # Directory information
        remoteDir = readConfig.get(section, 'remoteDir')
        localDir = readConfig.get(section, 'localDir')
        # File that stores the list from the last fetch
        lastFileList = readConfig.get(section, 'lastFileList')

        if encryption == '':
            password = readConfig.get(section, 'password')
        else:
            # `encryption` is an external command that decrypts the password
            command = encryption + ' ' + readConfig.get(section, 'password')
            password = os.popen(command).read().strip()

        if resolve == '1':
            month = interfaceDate[0:6]
            remoteDir = remoteDir.replace("[SDT_YYYYMMDD]", interfaceDate)
            remoteDir = remoteDir.replace("[SDT_YYYYMM]", month)
            localDir = localDir.replace("[SDT_YYYYMMDD]", interfaceDate)
            localDir = localDir.replace("[SDT_YYYYMM]", month)
        return hostaddr, username, password, remoteDir, localDir, lastFileList

    def connect(self, hostaddr, username, password):
        try:
            connftp = ftplib.FTP(hostaddr)
        except ftplib.all_errors:
            print("The ipaddress %(ipaddress)s refused the connection!" % {'ipaddress': hostaddr})
            raise
        try:
            connftp.login(username, password)
        except ftplib.error_perm:
            print("The username %(username)s was refused, please check your username or password!" % {'username': username})
            raise
        return connftp

    def getFileList(self, connftp, remoteDir):
        # Get file details (permissions, owner, size, ...); the 5th field is the file size.
        # This relies on the server accepting `-l` and returning a long listing.
        connftp.cwd(remoteDir)
        filesDetail = connftp.nlst('-l')
        # Save file name and size
        fileList = {}
        for fileDetail in filesDetail:
            fileListFromDetail = fileDetail.strip().split()
            if len(fileListFromDetail) < 5:
                continue  # skip lines such as "total 12"
            fileList[fileListFromDetail[-1]] = fileListFromDetail[4]
        return fileList

    def comparisonFileList(self, lastFileListPath, newFileList):
        # Load the list saved by the previous run, if there is one
        lastFileList = {}
        if os.path.isfile(lastFileListPath) and os.path.getsize(lastFileListPath) > 0:
            with open(lastFileListPath, "rb") as fp:
                try:
                    lastFileList = pickle.load(fp)
                except EOFError:
                    print("Loading %(filename)s failed" % {'filename': lastFileListPath})

        lastFileSet = set(lastFileList.keys())
        newFileSet = set(newFileList.keys())
        # Extract the list of new files
        diffFileList = list(newFileSet - lastFileSet)
        sameFileNames = list(newFileSet & lastFileSet)
        # Add files whose size differs between the two lists
        for sameFileName in sameFileNames:
            if newFileList[sameFileName] != lastFileList[sameFileName]:
                diffFileList.append(sameFileName)
        # Save the latest file list
        with open(lastFileListPath, "wb") as fp:
            pickle.dump(newFileList, fp)
        return diffFileList

    def machedFileList(self, diffFileList, interfaceID, interfaceDate):
        return [flist for flist in diffFileList
                if interfaceID in flist and interfaceDate in flist]

    def download(self, connftp, localDir, getFileList):
        # Go to the local directory
        if not os.path.isdir(localDir):
            os.makedirs(localDir)
        try:
            os.chdir(localDir)
        except OSError:
            print("Cannot enter the directory, maybe you do not have permission!")
        # Get the latest files
        for remoteFile in getFileList:
            try:
                with open(remoteFile, "wb") as localFile:
                    connftp.retrbinary("RETR %s" % remoteFile, localFile.write)
            except ftplib.error_perm:
                print('ERROR: cannot read file "%s"' % remoteFile)
        connftp.quit()


if __name__ == '__main__':
    interfaceDate = '20150520'
    interfaceID = None
    getDataBaseDiff = GetDataBaseDiff('./config.properties', interfaceID, interfaceDate, 0)
    hostaddr, username, password, remoteDir, localDir, lastFileList = getDataBaseDiff.getConfig(interfaceDate)
    connectionFtp = getDataBaseDiff.connect(hostaddr, username, password)
    fileList = getDataBaseDiff.getFileList(connectionFtp, remoteDir)
    diffFileList = getDataBaseDiff.comparisonFileList(lastFileList, fileList)
    if interfaceID is not None and len(diffFileList) > 0:
        getFileList = getDataBaseDiff.machedFileList(diffFileList, interfaceID, interfaceDate)
        getDataBaseDiff.download(connectionFtp, localDir, getFileList)
    else:
        getDataBaseDiff.download(connectionFtp, localDir, diffFileList)
The above is code I wrote as an exercise after learning Python. By modifying the configuration file, you can configure multiple platforms and fetch interface data from each of them.
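One possible way to lay out a multi-platform configuration is one section per platform, which a driver then loops over. The sketch below is only an illustration under that assumption: the section names `platform_a` and `platform_b`, the addresses, and the paths are placeholders, and `load_platforms` is a hypothetical helper. The script above reads a single section named `ftp.properties`, so it would need a small change to accept a section name.

```python
import configparser

# Hypothetical configuration with one section per platform
SAMPLE = """
[platform_a]
ipaddress = 10.0.0.1
remoteDir = /data/a/[SDT_YYYYMMDD]
localDir = /interface/a/[SDT_YYYYMMDD]

[platform_b]
ipaddress = 10.0.0.2
remoteDir = /data/b/[SDT_YYYYMM]
localDir = /interface/b/[SDT_YYYYMMDD]
"""


def load_platforms(text, interface_date):
    """Parse every section and substitute the date placeholders."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    month = interface_date[:6]
    platforms = {}
    for section in parser.sections():
        settings = {}
        # configparser lowercases option names, e.g. remoteDir -> remotedir
        for key, value in parser.items(section):
            value = value.replace("[SDT_YYYYMMDD]", interface_date)
            value = value.replace("[SDT_YYYYMM]", month)
            settings[key] = value
        platforms[section] = settings
    return platforms


platforms = load_platforms(SAMPLE, "20150520")
for name, settings in platforms.items():
    print(name, settings["remotedir"], settings["localdir"])
```

Each entry in `platforms` then carries what one fetch run needs, so the per-platform work (connect, list, compare, download) can run inside the loop. Each platform would also need its own saved file list so that the comparisons do not mix.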
ETL application: a method for fetching all of a platform's interface files in one pass