Today a script of mine needed to collect every file under a specified folder. I remembered having implemented file and directory traversal before, so I dug up the old code. Looking at it gave me quite a shock: my earlier implementation was really clumsy.
Here it is first:
import os

def getallfiles(dir):
    """Traverse and list all files in the specified folder."""
    if os.path.isdir(dir):
        filelist = os.listdir(dir)
        for ret in filelist:
            # Note: the hard-coded "\\" separator only works on Windows.
            filename = dir + "\\" + ret
            if os.path.isfile(filename):
                print(filename)

def getalldirfiles(dir, basedir):
    """Traverse and list all files in all subfolders."""
    if os.path.isdir(dir):
        getallfiles(dir)
        dirlist = os.listdir(dir)
        for dirret in dirlist:
            fullname = dir + "\\" + dirret
            if os.path.isdir(fullname):
                getalldirfiles(fullname, basedir)
I used two functions, and each one called listdir: once to pick out the files and once to pick out the folders. Functionally there is nothing wrong with it, but it is not elegant.
Time to optimize. Scenario one:
import os

def getallfiles(dir):
    """Traverse recursively using listdir."""
    if not os.path.isdir(dir):
        print(dir)
        return
    dirlist = os.listdir(dir)
    for dirret in dirlist:
        fullname = dir + "\\" + dirret
        if os.path.isdir(fullname):
            getallfiles(fullname)
        else:
            print(fullname)
As you can see, I merged the two functions into one. It calls listdir only once per folder and separates files from folders with if/else. The recursive self-call is, of course, still there.
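For comparison, the same single-function idea can be sketched so that it returns the paths instead of printing them and builds them with os.path.join rather than a hard-coded backslash. This is only a sketch, assuming Python 3; the name collect_files is hypothetical and not part of the original script.

```python
import os


def collect_files(path):
    """Recursively collect file paths under path, one listdir call per folder."""
    # Hypothetical helper: mirrors scenario one, but returns a list.
    if not os.path.isdir(path):
        return [path]
    found = []
    for name in os.listdir(path):
        fullname = os.path.join(path, name)  # portable path separator
        if os.path.isdir(fullname):
            found.extend(collect_files(fullname))
        else:
            found.append(fullname)
    return found
```

Returning a list leaves it to the caller to decide whether to print, sort, or filter the result.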
Is there a better way? After searching online for a while, I found that there is a ready-made os.walk() function for traversing files and folders, which makes the optimization even easier.
Scenario two:
import os

def getallfilesofwalk(dir):
    """Traverse using os.walk."""
    if not os.path.isdir(dir):
        print(dir)
        return
    dirlist = os.walk(dir)
    for root, dirs, files in dirlist:
        for file in files:
            print(os.path.join(root, file))
Judging by the code alone, scenario two is the most elegant and concise. But if you look at the source code of os.walk(), you will find that internally it still calls listdir to do the actual work; it simply does some extra processing on the results before yielding them.
Here is the source code of os.walk():
def walk(top, topdown=True, onerror=None, followlinks=False):
    from os.path import join, isdir, islink

    # We may not have read permission for top, in which case we can't
    # get a list of the files the directory contains.  os.path.walk
    # always suppressed the exception then, rather than blow up for a
    # minor reason when (say) a thousand readable directories are still
    # left to visit.  That logic is copied here.
    try:
        # Note that listdir and error are globals in this module due
        # to earlier import-*.
        names = listdir(top)
    except error, err:
        if onerror is not None:
            onerror(err)
        return

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        path = join(top, name)
        if followlinks or not islink(path):
            for x in walk(path, topdown, onerror, followlinks):
                yield x
    if not topdown:
        yield top, dirs, nondirs
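One consequence of this code is worth a small illustration: when topdown is true, walk yields dirs before it loops over them, so removing names from that list in place keeps walk from descending into those folders. The following is a minimal sketch, assuming Python 3; the function name walk_skipping, the skip parameter, and its default folder names are hypothetical.

```python
import os


def walk_skipping(top, skip=(".git", "__pycache__")):
    """Yield file paths under top without descending into folders named in skip."""
    for root, dirs, files in os.walk(top):  # topdown=True is the default
        # Edit dirs in place; rebinding the name would not affect os.walk.
        dirs[:] = [d for d in dirs if d not in skip]
        for name in files:
            yield os.path.join(root, name)
```

This kind of pruning only works in top-down mode, because in bottom-up mode the subfolders have already been visited by the time their parent is yielded.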
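As for the difference in output between the listdir version and the walk version: the listdir version handles files and folders together, in whatever order listdir returns them (this often looks alphabetical, although the documentation describes the order as arbitrary), so the contents of a subfolder are printed at the point where that subfolder is encountered. The walk version outputs the top-level folder first, with its files, then each second-level folder with its files, and so on. You can copy the scripts above and verify this yourself.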
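To check the ordering without touching real data, the sketch below builds a small throwaway tree and collects the paths both ways. It assumes Python 3; the helper names and the file names are made up for illustration, and the listdir results are sorted here only to make the output reproducible.

```python
import os
import tempfile


def order_with_listdir(path):
    """Depth-first order: a folder's contents appear where the folder is met."""
    out = []
    for name in sorted(os.listdir(path)):  # sorted only for a stable demo
        fullname = os.path.join(path, name)
        if os.path.isdir(fullname):
            out.extend(order_with_listdir(fullname))
        else:
            out.append(fullname)
    return out


def order_with_walk(path):
    """os.walk order: a folder's own files first, then its subfolders."""
    out = []
    for root, dirs, files in os.walk(path):
        dirs.sort()  # in place, so the descent order is stable too
        out.extend(os.path.join(root, name) for name in sorted(files))
    return out


def demo():
    """Build a tiny tree and return both orderings as relative paths."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "a_dir"))
        for relname in ("z.txt", os.path.join("a_dir", "inner.txt")):
            open(os.path.join(tmp, relname), "w").close()

        def relative(paths):
            return [os.path.relpath(p, tmp) for p in paths]

        return relative(order_with_listdir(tmp)), relative(order_with_walk(tmp))


if __name__ == "__main__":
    by_listdir, by_walk = demo()
    print(by_listdir)  # the file inside a_dir comes before z.txt
    print(by_walk)     # z.txt comes before the file inside a_dir
```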
That's all. If you found this useful, please share it; I would be very grateful.
How to implement recursive file traversal elegantly in Python