File and directory operation implementation code in Python

Source: Internet
Author: User
This article explains in detail how to use Python's functions for working with files and directories. The examples use Python 2 syntax, such as the print statement. First, we show how to list files, much like the dir command on Windows, and then how to test whether a name refers to a regular file, a directory, or a link, and how to extract a file's size and date. After that, we show how to delete files and directories, how to copy and rename files, and how to split a complete file path into a directory part and a file name part. Finally, we cover creating directories, moving between directories, and processing the files in a directory tree.
I. Displaying the contents of a directory
To list all files with a .jpg or .gif extension in the current directory, you can use the glob module:
import glob
filelist = glob.glob('*.jpg') + glob.glob('*.gif')
The glob function takes a file specification describing which files to list. The specification is written in Unix shell-style wildcard notation. The documentation of the fnmatch module explains these wildcards in detail and gives examples.
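As a small illustration of the wildcard rules, the sketch below applies the matching functions of the fnmatch module to a list of file names. It is written for Python 3, and the file names are invented for the example rather than taken from the article.

```python
import fnmatch

# Hypothetical file names, invented for this illustration
names = ['photo.jpg', 'logo.gif', 'notes.txt', 'scan1.png', 'scan2.png']

# '*' matches any run of characters, '?' exactly one character,
# and '[...]' any single character from the given set
images = [n for n in names
          if fnmatch.fnmatch(n, '*.jpg') or fnmatch.fnmatch(n, '*.gif')]
scans = fnmatch.filter(names, 'scan[0-9].png')
one_char = fnmatch.filter(names, 'scan?.png')

print(images)
print(scans)
print(one_char)
```

glob.glob applies the same kind of pattern to the names actually present in a directory, whereas fnmatch works on any strings.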
To list all the files in a directory, you can use the os.listdir function:

import os
files = os.listdir(r'C:\hpl\scripting\src\py\intro')  # for Windows
files = os.listdir('/home/hpl/scripting/src/py/intro')  # for Unix
# Cross-platform version:
files = os.listdir(os.path.join(os.environ['scripting'],
                                'src', 'py', 'intro'))
files = os.listdir(os.curdir)  # all files in the current directory
files = glob.glob('*') + glob.glob('.*')  # the same with glob; '.*' adds hidden files
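The difference between os.listdir and the two glob patterns can be checked with a short experiment. The following Python 3 sketch works in a temporary scratch directory; the two file names are made up for the illustration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one ordinary file and one hidden (dot) file
    for name in ('visible.txt', '.hidden'):
        open(os.path.join(tmp, name), 'w').close()

    listed = sorted(os.listdir(tmp))
    starred = sorted(os.path.basename(p)
                     for p in glob.glob(os.path.join(tmp, '*')))
    dotted = sorted(os.path.basename(p)
                    for p in glob.glob(os.path.join(tmp, '.*')))

print(listed)   # os.listdir reports both names
print(starred)  # '*' skips names that start with a dot
print(dotted)   # '.*' picks up the hidden file
```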


II. Testing file types
File names, directory names, and link names are all plain strings. Given such a string, how do we determine whether it refers to a regular file, a directory, or a link? The os.path module provides the isfile, isdir, and islink functions for this purpose:

print myfile, 'is a',
if os.path.isfile(myfile):
    print 'plain file'
if os.path.isdir(myfile):
    print 'directory'
if os.path.islink(myfile):
    print 'link'
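The three tests can be wrapped in a small helper. The sketch below is a Python 3 variant, not the article's code: the function name describe is hypothetical, and it checks islink first because isfile and isdir follow symbolic links and would otherwise also report the link's target type.

```python
import os
import tempfile

def describe(path):
    """Return 'link', 'directory', 'plain file', or 'other' for path."""
    # Test for a link first: isfile and isdir follow symbolic links
    if os.path.islink(path):
        return 'link'
    if os.path.isdir(path):
        return 'directory'
    if os.path.isfile(path):
        return 'plain file'
    return 'other'   # for example, a name that does not exist

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, 'data.txt')
    open(filename, 'w').close()
    kinds = (describe(tmp),
             describe(filename),
             describe(os.path.join(tmp, 'missing')))

print(kinds)
```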


You can also find the date and size of a file:

time_of_last_access = os.path.getatime(myfile)
time_of_last_modification = os.path.getmtime(myfile)
size = os.path.getsize(myfile)

Times are measured in seconds since January 1, 1970. To find how many days have passed since the last access, you can use the following code:
import time  # time.time() returns the current time
age_in_days = (time.time() - time_of_last_access)/(60*60*24)
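The same arithmetic can be tried on a file whose timestamp is known. This Python 3 sketch uses the modification time instead of the access time, sets it artificially with os.utime, and introduces a hypothetical helper age_in_days; treat it as an illustration under those assumptions.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 60 * 60 * 24

def age_in_days(path, now=None):
    """Days elapsed since path was last modified."""
    if now is None:
        now = time.time()
    return (now - os.path.getmtime(path)) / SECONDS_PER_DAY

with tempfile.NamedTemporaryFile() as tmp:
    # Pretend the file was last accessed and modified three days ago
    three_days_ago = time.time() - 3 * SECONDS_PER_DAY
    os.utime(tmp.name, (three_days_ago, three_days_ago))
    age = age_in_days(tmp.name)
    stamp = time.ctime(os.path.getmtime(tmp.name))  # human-readable date

print('%.1f days old, last modified %s' % (age, stamp))
```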
To obtain detailed information about a file, you can use the os.stat function together with the utilities in the stat module:

import stat
myfile_stat = os.stat(myfile)
size = myfile_stat[stat.ST_SIZE]
mode = myfile_stat[stat.ST_MODE]
if stat.S_ISREG(mode):
    print '%(myfile)s is a regular file with a size of %(size)d bytes' % \
          vars()
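In Python 3 the result of os.stat also exposes its fields as attributes, which many readers find easier than indexing with the stat constants. The following sketch creates a 128-byte scratch file (an invented example) and reads its size and mode both ways.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.dat')
    with open(path, 'wb') as f:
        f.write(b'x' * 128)

    info = os.stat(path)
    size = info.st_size                        # attribute access
    size_by_index = info[stat.ST_SIZE]         # index access, as in the text
    is_regular = stat.S_ISREG(info.st_mode)
    is_directory = stat.S_ISDIR(os.stat(tmp).st_mode)
    permissions = stat.filemode(info.st_mode)  # an 'ls -l' style string

print(size, is_regular, is_directory, permissions)
```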


For more information about the stat module, see the Python Library Reference. To test the read, write, and execute permissions of a file, you can use the os.access function:
if os.access(myfile, os.W_OK):
    print myfile, 'has write permission'
if os.access(myfile, os.R_OK | os.W_OK | os.X_OK):
    print myfile, 'has read, write, and execute permissions'
Tests like these are very useful in CGI scripts.
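A compact way to report all the os.access flags at once is to collect them in a dictionary. The helper name access_report below is hypothetical, and the Python 3 sketch only illustrates the flags; the values for a given file depend on the user running the script.

```python
import os
import tempfile

def access_report(path):
    """Tell whether the current user may read, write, or execute path."""
    return {
        'exists': os.access(path, os.F_OK),
        'read': os.access(path, os.R_OK),
        'write': os.access(path, os.W_OK),
        'execute': os.access(path, os.X_OK),
    }

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'script.py')
    open(path, 'w').close()
    present = access_report(path)
    absent = access_report(os.path.join(tmp, 'no_such_file'))

print(present)
print(absent)   # every test fails for a name that does not exist
```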
III. Deleting files and directories
To delete a single file, use the os.remove function, for example os.remove('mydata.dat'). The function os.unlink is an alias for os.remove; its name matches the traditional Unix system call and Perl's function for removing files. To delete a set of files, such as all files with the .jpg and .gif extensions, we can write:
for file in glob.glob('*.jpg') + glob.glob('*.gif'):
    os.remove(file)
The os.rmdir function deletes a directory only if the directory is empty. Often, however, we want to delete a whole directory tree containing many files, and for that we can use the rmtree function in the shutil module:
shutil.rmtree('mydir')
This is equivalent to the Unix command rm -rf mydir.
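The contrast between removing an empty directory and removing a whole tree is easy to demonstrate. The Python 3 sketch below builds a small scratch tree (the directory and file names are invented), shows that os.rmdir refuses to delete it, and then removes it with shutil.rmtree.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
tree = os.path.join(base, 'mydir')
os.makedirs(os.path.join(tree, 'sub'))
open(os.path.join(tree, 'sub', 'file.txt'), 'w').close()

try:
    os.rmdir(tree)           # fails: the directory is not empty
    rmdir_failed = False
except OSError:
    rmdir_failed = True

shutil.rmtree(tree)          # removes the directory and everything below it
tree_exists = os.path.exists(tree)

shutil.rmtree(base)          # clean up the scratch directory
print(rmdir_failed, tree_exists)
```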
We can write a custom function that deletes files and directories in the same way. Typical usage looks like this:
remove('my.dat')  # remove a single file my.dat
remove('mytree')  # remove a single directory tree mytree
# remove multiple files/trees given by name in a list of strings:
remove(glob.glob('*.tmp') + glob.glob('*.temp'))
remove(['my.dat', 'mydir', 'yourdir'] + glob.glob('*.data'))
The following is the implementation of the remove function:
def remove(files):
    """Remove one or more files and/or directories."""
    if isinstance(files, str):  # is files a string?
        files = [files]  # convert files from a string to a list
    if not isinstance(files, list):  # is files not a list?
        raise TypeError('remove() takes a string or a list of strings')
    for file in files:
        if os.path.isdir(file):
            shutil.rmtree(file)
        elif os.path.isfile(file):
            os.remove(file)
Let's test the flexibility of the remove function:

# create 10 directories tmp_* and 10 files tmp__*:
for i in range(10):
    os.mkdir('tmp_' + str(i))
    f = open('tmp__' + str(i), 'w'); f.close()
remove('tmp_1')  # tmp_1 is a directory
remove(glob.glob('tmp_[0-9]') + glob.glob('tmp__[0-9]'))


A remark on the implementation of remove above: the test
if not isinstance(files, list):
is actually too strict. All we need is a sequence of file and directory names to iterate over. We do not care whether the names are stored in a list, a tuple, or an array, so a better test is:
if not operator.isSequenceType(files):
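The function operator.isSequenceType belongs to Python 2 and was removed in Python 3. One possible Python 3 rendering of the same idea, offered as a sketch rather than as the article's implementation, simply tries to iterate over the argument and raises TypeError when that is impossible.

```python
import os
import shutil
import tempfile

def remove(files):
    """Remove one or more files and/or directory trees."""
    if isinstance(files, (str, bytes, os.PathLike)):
        files = [files]              # a single name: wrap it in a list
    try:
        names = list(files)          # accepts lists, tuples, generators, ...
    except TypeError:
        raise TypeError('remove() needs a name or an iterable of names') from None
    for name in names:
        if os.path.isdir(name):
            shutil.rmtree(name)
        elif os.path.isfile(name):
            os.remove(name)

scratch = tempfile.mkdtemp()
a = os.path.join(scratch, 'a.tmp')
d = os.path.join(scratch, 'tree')
open(a, 'w').close()
os.mkdir(d)
remove((a, d))                       # a tuple works as well as a list
leftover = os.listdir(scratch)
remove(scratch)                      # a single name removes the scratch tree
print(leftover)
```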
  
IV. Copying and renaming files
To copy a file, we can use the shutil module:
import shutil
shutil.copy(myfile, tmpfile)
# copy the last access and last modification times as well:
shutil.copy2(myfile, tmpfile)
# copy a directory tree:
shutil.copytree(root_of_tree, destination_dir, True)
The third argument of copytree controls the handling of symbolic links: True preserves symbolic links, while False replaces them with physical copies of the files they point to.
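The difference between shutil.copy and shutil.copy2 shows up in the timestamps of the copies. The Python 3 sketch below gives a scratch file an old, arbitrary modification time and then compares the two copies; the file names and the timestamp are invented for the illustration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'original.txt')
    with open(src, 'w') as f:
        f.write('some data\n')
    old = 1000000000                     # an arbitrary timestamp in 2001
    os.utime(src, (old, old))            # set access and modification times

    plain = shutil.copy(src, os.path.join(tmp, 'plain.txt'))
    exact = shutil.copy2(src, os.path.join(tmp, 'exact.txt'))

    plain_mtime = os.path.getmtime(plain)  # the time the copy was made
    exact_mtime = os.path.getmtime(exact)  # carried over from the source

print(plain_mtime > old, abs(exact_mtime - old) < 2)
```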
Python supports composing path names in a cross-platform way: os.path.join joins directory and file names with the correct separator (/ on Unix and Mac OS X, \ on Windows). The variables os.curdir and os.pardir denote the current working directory and its parent directory, respectively. A Unix command such as
cp ../../f1.c .
can be written in a cross-platform way in Python:
shutil.copy(os.path.join(os.pardir, os.pardir, 'f1.c'), os.curdir)
The rename function in the os module renames a file:
os.rename(myfile, 'tmp.1')  # rename myfile to 'tmp.1'
This function can also move a file within the same file system. Here, we move myfile to the directory d:
os.rename(myfile, os.path.join(d, myfile))
To move a file across file systems, copy it with shutil.copy2 and then delete the original:
shutil.copy2(myfile, os.path.join(d, myfile))
os.remove(myfile)
The latter approach is the safest way to move files, because it works both within a file system and across file systems.
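The standard library also offers shutil.move, which renames when it can and otherwise copies and deletes. The Python 3 sketch below uses it on scratch files; the names are invented, and the target is built from the base name so that the code also works when myfile contains a directory part.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    myfile = os.path.join(tmp, 'report.txt')
    d = os.path.join(tmp, 'archive')
    os.mkdir(d)
    with open(myfile, 'w') as f:
        f.write('contents\n')

    # Join d with the base name only, since myfile carries a path here
    target = os.path.join(d, os.path.basename(myfile))
    shutil.move(myfile, target)

    moved = os.path.isfile(target) and not os.path.exists(myfile)

print(moved)
```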
V. Splitting path names
Suppose the variable fname holds a file name with its full path, for example:
/usr/home/hpl/scripting/python/intro/hw.py
Sometimes we need to split such a path into the base name hw.py and the directory name /usr/home/hpl/scripting/python/intro. In Python, this can be done as follows:
basename = os.path.basename(fname)
dirname = os.path.dirname(fname)
# or
dirname, basename = os.path.split(fname)
The extension is extracted with the os.path.splitext function:
root, extension = os.path.splitext(fname)
Here the extension of fname, .py, is assigned to the variable extension, and the rest of the path is assigned to the variable root. If you want the extension without the dot, use os.path.splitext(fname)[1][1:].
Suppose a file is named f and has an arbitrary extension. To change its extension to ext, you can use the following code:
newfile = os.path.splitext(f)[0] + ext
Here is a concrete example:
>>> f = '/some/path/case2.data_source'
>>> moviefile = os.path.basename(os.path.splitext(f)[0] + '.mpg')
>>> moviefile
'case2.mpg'
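The splitting functions can be combined into a small helper, and two edge cases are worth knowing. In the Python 3 sketch below the function name change_extension is hypothetical; the path is the one used in the text.

```python
import os.path

def change_extension(path, ext):
    """Return path with its extension replaced by ext (ext includes the dot)."""
    return os.path.splitext(path)[0] + ext

fname = '/usr/home/hpl/scripting/python/intro/hw.py'
dirname, basename = os.path.split(fname)
root, extension = os.path.splitext(basename)
renamed = change_extension(fname, '.txt')

# Edge cases: only the last extension is split off, and a leading dot
# on its own does not count as an extension
double = os.path.splitext('archive.tar.gz')
hidden = os.path.splitext('.bashrc')

print(dirname, basename, root, extension)
print(renamed)
print(double, hidden)
```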
VI. Creating and changing directories
The mkdir function in the os module creates a directory, while the chdir function changes the current working directory:
origdir = os.getcwd()  # remember the current location
newdir = os.path.join(os.pardir, 'mynewdir')
if not os.path.isdir(newdir):
    os.mkdir(newdir)  # or os.mkdir(newdir, 0755)
os.chdir(newdir)
...
os.chdir(origdir)  # return to the original directory
os.chdir(os.environ['HOME'])  # move to the home directory
Suppose we want to create the new directory py/src/test1 in our home directory, but none of py, src, and test1 exists yet. With mkdir we would need three calls to build the nested directories, but the os.makedirs function creates the whole path in one call:
os.makedirs(os.path.join(os.environ['HOME'], 'py', 'src', 'test1'))
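To try os.makedirs without touching the real home directory, the Python 3 sketch below uses a temporary directory as a stand-in for it. It also shows the exist_ok argument, which Python 3.2 and later accept and which makes a repeated call harmless.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as home:   # stands in for the home directory
    target = os.path.join(home, 'py', 'src', 'test1')
    os.makedirs(target)                  # creates py, py/src, and py/src/test1
    created = os.path.isdir(target)

    try:
        os.makedirs(target)              # a second call fails: it already exists
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    os.makedirs(target, exist_ok=True)   # tolerate an existing directory

print(created, second_call_failed)
```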
VII. Traversing directory trees
The function call
os.path.walk(root, myfunc, arg)
traverses the directory tree rooted at root and calls myfunc(arg, dirname, files) for each directory name dirname, where files is the list of file names in dirname (as returned by os.listdir(dirname)) and arg is an argument that the user passes on from the calling code. For Unix users, the cross-platform os.path.walk is Python's counterpart to the Unix find command.
A common example when explaining os.path.walk is to write out the names of the files in all subdirectories of the home directory. We can try this in an interactive Python session with the following snippet:
def ls(arg, dirname, files):
    print dirname, 'has the files', files
os.path.walk(os.environ['HOME'], ls, None)
In this case the argument arg is not needed, so we pass None for it in the os.path.walk call.
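Note that os.path.walk exists only in Python 2. Python 3 provides os.walk instead, a generator that yields a (dirpath, dirnames, filenames) triple for each directory, so no callback or arg is needed. The sketch below is a Python 3 counterpart of the ls example; the helper name list_tree and the scratch tree are invented for the illustration.

```python
import os
import tempfile

def list_tree(root):
    """Return (directory, sorted file names) pairs for root and its subdirectories."""
    listing = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                  # visit subdirectories in a fixed order
        listing.append((os.path.relpath(dirpath, root), sorted(filenames)))
    return listing

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'src', 'py'))
    for parts in (('README',), ('src', 'a.c'), ('src', 'py', 'hw.py')):
        open(os.path.join(root, *parts), 'w').close()
    listing = list_tree(root)

for dirname, files in listing:
    print(dirname, 'has the files', files)
```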
To list all files larger than 1 Mb in the home directory, you can use the following code:
def checksize1(arg, dirname, files):
    for file in files:
        filepath = os.path.join(dirname, file)
        if os.path.isfile(filepath):
            size = os.path.getsize(filepath)
            if size > 1000000:
                size_in_Mb = size/1000000.0
                arg.append((size_in_Mb, filepath))
bigfiles = []
root = os.environ['HOME']
os.path.walk(root, checksize1, bigfiles)
for size, name in bigfiles:
    print name, 'size is', size, 'Mb'
Here we use arg to build a data structure: a list of 2-tuples, where each 2-tuple holds the size of a file (in megabytes) and its full path. If arg is to be changed by the function calls made for all the directories, arg must be a mutable data structure that can be modified in place.
The argument dirname is the full path of the directory currently being visited (it starts with root), and the names in files are relative to dirname. The current working directory does not change during the traversal, so the script stays in the directory where it was started. That is why we must build filepath by joining dirname and file. To make dirname the current working directory, call os.chdir(dirname) at the start of the function that os.path.walk calls for each directory, and call os.chdir again at the end of the function to restore the original working directory:
def somefunc(arg, dirname, files):
    origdir = os.getcwd(); os.chdir(dirname)
    # ... process the files in dirname here ...
    os.chdir(origdir)
os.path.walk(root, somefunc, arg)
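The pattern of changing directory and then changing back can be packaged so that the original directory is restored even when the work in between raises an exception. The Python 3 sketch below does this with a context manager; the name working_directory is hypothetical and is not part of the article's code.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current working directory to path."""
    origdir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(origdir)                # restored even if the body raises

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        inside = os.path.realpath(os.getcwd()) == os.path.realpath(tmp)
    restored = os.getcwd() == start

print(inside, restored)
```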
If you prefer, you can also write your own function with similar functionality to replace os.path.walk. The following code calls a user-supplied function for each file rather than for each directory:
def find(func, rootdir, arg=None):
    # call func for each file in the rootdir directory
    files = os.listdir(rootdir)  # get all files in rootdir
    files.sort(lambda a, b: cmp(a.lower(), b.lower()))
    for file in files:
        fullpath = os.path.join(rootdir, file)
        if os.path.islink(fullpath):
            pass
        elif os.path.isdir(fullpath):
            find(func, fullpath, arg)
        elif os.path.isfile(fullpath):
            func(fullpath, arg)
        else:
            print 'find: cannot treat', fullpath
The find function above is available in the scitools module. Unlike the built-in os.path.walk, our find function visits files and directories in case-insensitive alphabetical order.
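Python 3 removed both cmp and the comparison-function argument of list.sort, so a find function written this way needs a key function for the case-insensitive ordering. The following is a Python 3 sketch of the same traversal; the callback collect and the scratch file names are invented for the illustration.

```python
import os
import tempfile

def find(func, rootdir, arg=None):
    """Call func(fullpath, arg) for each regular file below rootdir."""
    # key=str.lower gives the case-insensitive alphabetical order
    for name in sorted(os.listdir(rootdir), key=str.lower):
        fullpath = os.path.join(rootdir, name)
        if os.path.islink(fullpath):
            continue                     # links are skipped
        if os.path.isdir(fullpath):
            find(func, fullpath, arg)
        elif os.path.isfile(fullpath):
            func(fullpath, arg)

def collect(fullpath, names):
    names.append(os.path.basename(fullpath))

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'Docs'))
    for parts in (('b.txt',), ('A.txt',), ('Docs', 'c.txt')):
        open(os.path.join(root, *parts), 'w').close()
    visited = []
    find(collect, root, visited)

print(visited)
```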
We can use the find function to list all files larger than 1 Mb:
def checksize2(fullpath, bigfiles):
    size = os.path.getsize(fullpath)
    if size > 1000000:
        bigfiles.append('%.2fMb %s' % (size/1000000.0, fullpath))
bigfiles = []
root = os.environ['HOME']
find(checksize2, root, bigfiles)
for fileinfo in bigfiles:
    print fileinfo
The argument arg provides great flexibility. We can use it to hold both input data and the data structure being built. The next example collects the names and sizes of all files that have one of the specified extensions and are larger than a given size. The output is sorted by file size.
bigfiles = {'filelist': [],  # list of file names and sizes
            'extensions': ('.*ps', '.tiff', '.bmp'),
            'size_limit': 1000000,  # 1 Mb
            }
def checksize3(fullpath, arg):
    treat_file = False
    ext = os.path.splitext(fullpath)[1]
    import fnmatch  # Unix shell-style wildcard matching
    for s in arg['extensions']:
        if fnmatch.fnmatch(ext, s):
            treat_file = True  # fullpath has the right extension
    size = os.path.getsize(fullpath)
    if treat_file and size > arg['size_limit']:
        size = '%.2fMb' % (size/1000000.0)  # pretty print
        arg['filelist'].append({'size': size, 'name': fullpath})
# call find only after checksize3 has been defined
find(checksize3, os.environ['HOME'], bigfiles)
# sort files according to size
def filesort(a, b):
    return cmp(float(a['size'][:-2]), float(b['size'][:-2]))
bigfiles['filelist'].sort(filesort)
bigfiles['filelist'].reverse()
for fileinfo in bigfiles['filelist']:
    print fileinfo['name'], fileinfo['size']
Note the sort function: each element of the list bigfiles['filelist'] is a dictionary, and the key size holds a string, so we must strip the unit Mb (the last two characters) and convert the rest to a floating-point number before comparing.
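An alternative design, shown here only as a Python 3 sketch, keeps the size as a number and formats it when printing, so that sorting needs no string parsing. The records below are invented for the illustration and do not come from the article.

```python
# Hypothetical records, invented for this illustration (sizes in bytes)
filelist = [
    {'name': 'plot.bmp', 'size': 2500000},
    {'name': 'paper.ps', 'size': 12000000},
    {'name': 'scan.tiff', 'size': 4100000},
]

# Sort on the numeric size, largest first
filelist.sort(key=lambda info: info['size'], reverse=True)

# Convert to megabytes only when producing the output
lines = ['%s %.2fMb' % (info['name'], info['size'] / 1000000.0)
         for info in filelist]
for line in lines:
    print(line)
```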
VIII. Summary
Files and directories can be processed with operating system commands, but Python provides many built-in functions for this purpose so that developers can do the work programmatically. Importantly, these functions are used in exactly the same way on Unix, Windows, and Macintosh platforms. This article explained how to use them: we first showed how to display the contents of a directory, and then how to test whether a name refers to a regular file, a directory, or a link, and how to extract a file's size and date. After that, we showed how to delete files and directories, how to copy and rename files, and how to split a complete file path into a directory part and a file name part. Finally, we covered creating directories, moving between directories, and processing the files in a directory tree.