How to use Python gracefully to implement file recursive traversal

Source: Internet
Author: User

Today, a script needs to traverse to get all the files under a specified folder, I remember the file traversal and directory Traversal was also implemented earlier, so look for a look, hey, do not see, see a fright, the original before I actually used so rubbing the realization.

First send out to see:

def getallfiles(dir):"""遍历获取指定文件夹下面所有文件"""    if os.path.isdir(dir):        filelist = os.listdir(dir)        for ret in filelist:            filename = dir + "\\" + ret            if os.path.isfile(filename):                print filenamedef getalldirfiles(dir, basedir):"""遍历获取所有子文件夹下面所有文件"""    if os.path.isdir(dir):        getallfiles(dir)        dirlist = os.listdir(dir)        for dirret in dirlist:            fullname = dir + "\\" + dirret            if os.path.isdir(fullname):                getalldirfiles(fullname, basedir)

I used 2 functions, and each function used a listdir, just once to filter the file, once to filter the folder, if only from the functional implementation of the view, a little problem, but this ... It's not elegant.

Start optimization, Scenario one:

def getallfiles(dir):"""使用listdir循环遍历"""    if not os.path.isdir(dir):        print dir        return    dirlist = os.listdir(dir)    for dirret in dirlist:        fullname = dir + "\\" + dirret        if os.path.isdir(fullname):            getallfiles(fullname)        else:            print fullname

As you can see, I've merged two functions into one, called only once, listdir the files and folders with if~else~, and of course, the loop of self-invocation is still there.

Is there a better way to have wood? Online Search a lot of, originally there is a ready-made os.walk () function can be used to handle the file (clip) traversal, so the optimization is easier.

Scenario Two:

def getallfilesofwalk(dir):"""使用listdir循环遍历"""    if not os.path.isdir(dir):        print dir        return    dirlist = os.walk(dir)    for root, dirs, files in dirlist:        for file in files:            print os.path.join(root, file)

Just from the implementation of the Code, the scheme is the most elegant and concise, but then look at Os.walk () to realize the source code will find, in fact, its internal or call Listdir to complete the implementation of the specific functions, but it is the output of the additional processing of the results.

Attach the source code of Os.walk ():

 from os.path import join, Isdir, islink# We are not having read permission for top, in WH  Ich case we can ' t# get a list of the files the directory contains.  os.path.walk# always suppressed the exception then, rather than blow up for a# minor reason when (say) a thousand readable  Directories is still# left to visit.    That logic was copied here.try: # Note that Listdir and error was globals in the This module due # to earlier import-*. Names = Listdir (top) except error, Err:if onerror is not none:onerror (err) returndirs, nondirs = [], []FO    R name in Names:if Isdir (join (top, name)): Dirs.append (name) else:nondirs.append (name) if Topdown: Yield top, dirs, nondirsfor name in Dirs:path = Join (top, name) if Followlinks or not Islink (path): For x i n Walk (path, Topdown, OnError, followlinks): Yield xif not Topdown:yield top, dirs, Nondirs  

As for the difference between Listdir and walk in the output, the main thing is Listdir by default is the file and folder in alphabetical order to output, and walk is to first output the top-level folder, then the top-level file, then output the second level folder, and the second level file, and so on , you can copy the above script and verify it yourself.

Above, if feel useful, please help forward to share, not very grateful.

How to use Python gracefully to implement file recursive traversal

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.