This article explains in detail how to work with files and directories in Python. First, we show how to list files, much like the dir command on Windows, and then describe how to test whether a name refers to a regular file, a directory, or a link, and how to extract a file's size and dates. After that, we explain how to delete files and directories, how to copy and rename files, and how to split a complete file path into its directory part and file name part. Finally, we cover creating directories, changing the current directory, and processing the files in a directory tree.
I. Listing the contents of a directory
To list all files in the current directory that have a .jpg or .gif extension, you can use the glob module:
import glob
filelist = glob.glob('*.jpg') + glob.glob('*.gif')
The code above uses the glob function, whose argument specifies which files to list. The files are specified with Unix shell-style wildcard patterns. For details on these wildcards, see the documentation of the fnmatch module, which contains specific explanations and examples.
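The wildcard rules mentioned above can be tried out without touching the disk. The following is a minimal sketch, assuming Python 3; the file names are hypothetical and serve only to illustrate how the patterns behave.

```python
import fnmatch

# Hypothetical file names, used only to illustrate the pattern rules.
names = ['photo.jpg', 'icon.gif', 'notes.txt', 'a1.dat', 'ab.dat']

# '*' matches any run of characters (including none).
jpgs = fnmatch.filter(names, '*.jpg')

# '?' matches exactly one character; '[0-9]' matches exactly one digit.
one_char_then_digit = fnmatch.filter(names, '?[0-9].dat')

# fnmatchcase compares case-sensitively on every platform.
is_gif = fnmatch.fnmatchcase('icon.gif', '*.gif')
is_upper_gif = fnmatch.fnmatchcase('icon.GIF', '*.gif')

print(jpgs, one_char_then_digit, is_gif, is_upper_gif)
```

glob applies the same pattern syntax to the names it finds on disk, so a pattern that behaves as expected here should select the same names when passed to glob.glob.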
To list all the files in a directory, you can use the os.listdir function:
import os
files = os.listdir(r'C:\hpl\scripting\src\py\intro')  # Windows
files = os.listdir('/home/hpl/scripting/src/py/intro')  # Unix
# Cross-platform version:
files = os.listdir(os.path.join(os.environ['scripting'],
                                'src', 'py', 'intro'))
files = os.listdir(os.curdir)  # every file in the current directory
# The same with glob ('*' does not match names that start with a dot):
files = glob.glob('*') + glob.glob('.*')
II. Testing the file type
File names, directory names, and link names are all identified by strings. Given such a string, how do we determine whether it refers to a regular file, a directory, or a link? The os.path module provides the isfile, isdir, and islink functions for this purpose:
print myfile, 'is a',
if os.path.isfile(myfile):
    print 'plain file'
if os.path.isdir(myfile):
    print 'directory'
if os.path.islink(myfile):
    print 'link'
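The three tests are independent of each other: a symbolic link that points to a file passes both islink and isfile, and a name that does not exist passes none of them. Here is a small sketch of this, assuming Python 3; the helper describe and the file names are hypothetical, and everything happens inside a temporary directory.

```python
import os
import tempfile

def describe(path):
    """Return the list of kinds that apply to path (possibly empty)."""
    kinds = []
    if os.path.isfile(path):
        kinds.append('plain file')
    if os.path.isdir(path):
        kinds.append('directory')
    if os.path.islink(path):
        kinds.append('link')
    return kinds

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, 'data.txt')
    open(filename, 'w').close()          # create an empty regular file
    file_kinds = describe(filename)
    dir_kinds = describe(tmp)
    missing_kinds = describe(os.path.join(tmp, 'no_such_file'))

print(file_kinds, dir_kinds, missing_kinds)
```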
You can also find a file's timestamps and its size in bytes:
time_of_last_access = os.path.getatime(myfile)
time_of_last_modification = os.path.getmtime(myfile)
size = os.path.getsize(myfile)
The times are given in seconds since January 1, 1970. To get the time since the last access in days, you can use the following code:
import time  # time.time() returns the current time
age_in_days = (time.time() - time_of_last_access)/(60*60*24)
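To see the arithmetic at work, the sketch below (assuming Python 3) creates a file in a temporary directory, sets its timestamps to two days in the past with os.utime, and then computes its age as described above. The file name and the two-day offset are arbitrary choices made for the illustration.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 60 * 60 * 24

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.dat')
    with open(path, 'w') as f:
        f.write('hello')

    now = time.time()
    two_days_ago = now - 2 * SECONDS_PER_DAY
    os.utime(path, (two_days_ago, two_days_ago))  # (access time, modification time)

    age_in_days = (now - os.path.getatime(path)) / SECONDS_PER_DAY
    # Turn the raw seconds into a readable date string such as '2024-01-31'.
    modified = time.strftime('%Y-%m-%d',
                             time.localtime(os.path.getmtime(path)))
    size = os.path.getsize(path)

print(age_in_days, modified, size)
```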
For more detailed information about a file, you can use the os.stat function together with the utilities in the stat module:
import stat
myfile_stat = os.stat(myfile)
size = myfile_stat[stat.ST_SIZE]
mode = myfile_stat[stat.ST_MODE]
if stat.S_ISREG(mode):
    print '%(myfile)s is a regular file with %(size)d bytes' %\
          vars()
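The result of os.stat can also be read through named attributes such as st_size and st_mode instead of indices. The following sketch assumes Python 3 and uses a hypothetical file in a temporary directory.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.dat')
    with open(path, 'wb') as f:
        f.write(b'x' * 10)

    info = os.stat(path)
    size = info.st_size                       # same value as info[stat.ST_SIZE]
    is_regular = stat.S_ISREG(info.st_mode)   # True for a regular file
    dir_is_dir = stat.S_ISDIR(os.stat(tmp).st_mode)
    message = '%s is a regular file with %d bytes' % (
        os.path.basename(path), size)

print(message)
```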
For more information about the stat module, see the Python Library Reference. To test whether you have read, write, or execute permission for a file, you can use the os.access function:
if os.access(myfile, os.W_OK):
    print myfile, 'has write permission'
if os.access(myfile, os.R_OK | os.W_OK | os.X_OK):
    print myfile, 'has read, write, and execute permission'
Tests like these are very useful in CGI scripts.
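When several flags are combined with |, os.access returns True only if all of the requested permissions are granted. To report each permission separately, the flags can be tested one at a time. The sketch below assumes Python 3; the helper permissions is a hypothetical name, and the result depends on the user running the script.

```python
import os
import tempfile

def permissions(path):
    """Return a string such as 'rw-' for what this process may do with path."""
    flags = ((os.R_OK, 'r'), (os.W_OK, 'w'), (os.X_OK, 'x'))
    return ''.join(letter if os.access(path, flag) else '-'
                   for flag, letter in flags)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.dat')
    open(path, 'w').close()
    exists = os.access(path, os.F_OK)     # F_OK only tests for existence
    perms = permissions(path)
    missing = os.access(os.path.join(tmp, 'no_such_file'), os.F_OK)

print(exists, perms, missing)
```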
III. Deleting files and directories
To delete a single file, you can use the os.remove function, for example os.remove('mydata.dat'). os.unlink is an alias for os.remove; the name unlink matches the traditional Unix system call and the Perl function for deleting files. To delete a set of files, for example all files with a .jpg or .gif extension, we can write:
for file in glob.glob('*.jpg') + glob.glob('*.gif'):
    os.remove(file)
A directory can be deleted with the rmdir function (os.rmdir) only if its contents have already been removed. However, we often want to delete a whole directory tree containing many files, in which case we can use the rmtree function from the shutil module:
shutil.rmtree('mydir')
This corresponds to the Unix command rm -rf mydir.
We can write a custom function that treats files and directories alike when deleting. Typical usage looks like this:
remove('my.dat')  # delete the single file my.dat
remove('mytree')  # delete the single directory tree mytree
# Delete several files/directory trees, named in a list of strings:
remove(glob.glob('*.tmp') + glob.glob('*.temp'))
remove(['my.dat', 'mydir', 'yourdir'] + glob.glob('*.data'))
The following is the implementation of the remove function:
def remove(files):
    """Delete one or more files and/or directories."""
    if isinstance(files, str):  # is files a string?
        files = [files]  # convert files from a string to a list
    if not isinstance(files, list):  # is files not a list?
        raise TypeError('remove() expects a string or a list of strings')
    for file in files:
        if os.path.isdir(file):
            shutil.rmtree(file)
        elif os.path.isfile(file):
            os.remove(file)
Let's test the flexibility of the remove function:
# Create 10 directories tmp_* and 10 files tmp__*:
for i in range(10):
    os.mkdir('tmp_' + str(i))
    f = open('tmp__' + str(i), 'w'); f.close()
remove('tmp_1')  # tmp_1 is a directory
remove(glob.glob('tmp_[0-9]') + glob.glob('tmp__[0-9]'))
A note on the implementation of the remove function above: the test
if not isinstance(files, list):
is actually too strict. All we need is a sequence of file/directory names to iterate over. We do not care whether the names are stored in a list, a tuple, or a numerical array, so a better test is:
if not operator.isSequenceType(files):
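operator.isSequenceType exists only in Python 2. As a sketch of how the same idea might look in Python 3, the version below accepts any iterable of names by testing against collections.abc.Iterable; this choice of test is an assumption made for the illustration, not the original author's code. The demonstration runs in a temporary directory with hypothetical file names.

```python
import os
import shutil
import tempfile
from collections.abc import Iterable

def remove(files):
    """Delete one or more files and/or directory trees."""
    if isinstance(files, str):          # a single name: wrap it in a list
        files = [files]
    if not isinstance(files, Iterable):
        raise TypeError('remove() expects a name or an iterable of names')
    for name in files:
        if os.path.isdir(name):
            shutil.rmtree(name)
        elif os.path.isfile(name):
            os.remove(name)

with tempfile.TemporaryDirectory() as tmp:
    tree = os.path.join(tmp, 'mytree')
    os.makedirs(os.path.join(tree, 'sub'))
    single = os.path.join(tmp, 'my.dat')
    open(single, 'w').close()
    others = tuple(os.path.join(tmp, 'tmp_%d' % i) for i in range(3))
    for name in others:
        open(name, 'w').close()

    remove(single)     # a single file
    remove(tree)       # a whole directory tree
    remove(others)     # a tuple works as well as a list
    leftovers = os.listdir(tmp)

print(leftovers)
```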
IV. Copying and renaming files
To copy files, we can use the shutil module:
import shutil
shutil.copy(myfile, tmpfile)
# also copy the time of last access and the time of last modification:
shutil.copy2(myfile, tmpfile)
# copy a directory tree:
shutil.copytree(root_of_tree, destination_dir, True)
The third argument of copytree specifies how symbolic links are handled: True means that symbolic links are preserved, while False means that links are replaced by physical copies of the files they point to.
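The difference between copy and copy2 becomes visible when the source file has an old modification time. The following sketch assumes Python 3 and works in a temporary directory; the file names and the timestamp are arbitrary.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'src.txt')
    with open(src, 'w') as f:
        f.write('data')
    old = 1000000000                      # an arbitrary timestamp in 2001
    os.utime(src, (old, old))

    plain = shutil.copy(src, os.path.join(tmp, 'plain.txt'))
    exact = shutil.copy2(src, os.path.join(tmp, 'exact.txt'))
    plain_keeps_mtime = int(os.path.getmtime(plain)) == old
    exact_keeps_mtime = int(os.path.getmtime(exact)) == old

    # copytree creates the destination directory itself.
    tree = os.path.join(tmp, 'tree')
    os.makedirs(os.path.join(tree, 'sub'))
    shutil.copy(src, os.path.join(tree, 'sub', 'a.txt'))
    shutil.copytree(tree, os.path.join(tmp, 'tree_copy'), symlinks=True)
    copied = os.path.isfile(os.path.join(tmp, 'tree_copy', 'sub', 'a.txt'))

print(plain_keeps_mtime, exact_keeps_mtime, copied)
```

Note that copytree expects the destination directory not to exist yet.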
Python supports building path names in a cross-platform way: os.path.join joins directory and file names using the correct separator (/ on Unix and Mac OS X, \ on Windows). The variables os.curdir and os.pardir represent the current working directory and its parent directory, respectively. A Unix command like
cp ../../f1.c .
can be given a cross-platform implementation in Python:
shutil.copy(os.path.join(os.pardir, os.pardir, 'f1.c'), os.curdir)
The rename function in the os module is used to rename a file:
os.rename(myfile, 'tmp.1')  # rename myfile to 'tmp.1'
This function can also be used to move a file within the same file system. Here, we move myfile to the directory d:
os.rename(myfile, os.path.join(d, myfile))
To move a file across file systems, you can copy it with shutil.copy2 and then delete the original:
shutil.copy2(myfile, os.path.join(d, myfile))
os.remove(myfile)
This is the safest way to move a file.
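The standard library also offers shutil.move, which tries a plain rename first and falls back to copying and deleting when the destination is on another file system. A minimal sketch, assuming Python 3 and hypothetical names inside a temporary directory:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    myfile = os.path.join(tmp, 'myfile.txt')
    with open(myfile, 'w') as f:
        f.write('payload')
    d = os.path.join(tmp, 'd')
    os.mkdir(d)

    # Moving into an existing directory keeps the file's base name.
    new_path = shutil.move(myfile, d)
    moved = os.path.isfile(os.path.join(d, 'myfile.txt'))
    original_gone = not os.path.exists(myfile)
    same_name = os.path.basename(new_path) == 'myfile.txt'

print(moved, original_gone, same_name)
```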
V. Splitting path names
Suppose the variable fname holds a file name with its full path, for example:
/usr/home/hpl/scripting/python/intro/hw.py
Sometimes we need to split such a path into the base name hw.py and the directory name /usr/home/hpl/scripting/python/intro. In Python, this can be done as follows:
basename = os.path.basename(fname)
dirname = os.path.dirname(fname)
# or
dirname, basename = os.path.split(fname)
The extension is extracted with the os.path.splitext function:
root, extension = os.path.splitext(fname)
Here, the extension of fname, .py, is assigned to the variable extension, while the rest is assigned to the variable root. If you want the extension without the leading dot, use os.path.splitext(fname)[1][1:].
Suppose a file name f has an arbitrary extension and you want to change that extension to ext. You can use the following code:
newfile = os.path.splitext(f)[0] + ext
Here is a specific example:
>>> f = '/some/path/case2.data_source'
>>> moviefile = os.path.basename(os.path.splitext(f)[0] + '.mpg')
>>> moviefile
'case2.mpg'
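The pieces produced by these functions can be inspected side by side. The sketch below, assuming Python 3, uses the path from the text plus one hypothetical name, archive.tar.gz, to show that splitext only splits off the last extension.

```python
import os.path

fname = '/usr/home/hpl/scripting/python/intro/hw.py'

dirname, basename = os.path.split(fname)
root, extension = os.path.splitext(fname)
bare_extension = extension[1:]            # extension without the dot

# Change the extension of the base name to '.txt'.
renamed = os.path.splitext(basename)[0] + '.txt'

# Only the last extension is split off.
archive_root, archive_ext = os.path.splitext('archive.tar.gz')

print(dirname, basename, root, extension, bare_extension, renamed)
print(archive_root, archive_ext)
```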
VI. Creating directories and changing the current directory
The mkdir function in the os module creates a directory, while the chdir function changes the current working directory:
origdir = os.getcwd()  # remember the current location
newdir = os.path.join(os.pardir, 'mynewdir')
if not os.path.isdir(newdir):
    os.mkdir(newdir)  # or os.mkdir(newdir, 0755)
os.chdir(newdir)
...
os.chdir(origdir)  # return to the original directory
os.chdir(os.environ['HOME'])  # move to the home directory
Suppose we want to create a new directory py/src/test1 under our home directory, but none of py, src, and test1 exists yet. With mkdir we would need three calls to build this nested directory, whereas os.makedirs creates the whole path in one call:
os.makedirs(os.path.join(os.environ['HOME'], 'py', 'src', 'test1'))
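The sketch below, assuming Python 3, builds the same nested directory inside a temporary directory that stands in for the home directory, and uses try/finally so that the original working directory is restored even if the work in between fails. The exist_ok argument is a Python 3 addition to os.makedirs.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as home:   # stand-in for the home directory
    nested = os.path.join(home, 'py', 'src', 'test1')
    os.makedirs(nested)                       # creates py, py/src and py/src/test1
    os.makedirs(nested, exist_ok=True)        # no error if it already exists
    created = os.path.isdir(nested)

    origdir = os.getcwd()                     # remember the current location
    os.chdir(nested)
    try:
        inside = os.path.basename(os.getcwd())
    finally:
        os.chdir(origdir)                     # always return to where we started
    restored = os.getcwd() == origdir

print(created, inside, restored)
```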
VII. Traversing a directory tree
The function call
os.path.walk(root, myfunc, arg)
traverses the directory tree rooted at root and calls myfunc(arg, dirname, files) for each directory name dirname, where files is the list of file names in dirname (as returned by os.listdir(dirname)) and arg is a user-supplied argument passed on from the calling code. For Unix users, the cross-platform os.path.walk is Python's counterpart to the Unix command find.
When explaining os.path.walk, people often use listing the names of the files in all subdirectories of the home directory as an example. We can try this with the following snippet in an interactive Python session:
def ls(arg, dirname, files):
    print dirname, 'has the files', files
os.path.walk(os.environ['HOME'], ls, None)
In this case the argument arg is not needed, so None is passed in the os.path.walk call.
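os.path.walk is available only in Python 2. In Python 3 the corresponding tool is os.walk, a generator that yields one (dirname, subdirs, files) tuple per directory, so no callback or arg parameter is needed. A minimal sketch, assuming Python 3 and a small hypothetical tree built in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small tree: root/top.txt, root/a/mid.txt, root/a/b/deep.txt
    os.makedirs(os.path.join(root, 'a', 'b'))
    for rel in ('top.txt',
                os.path.join('a', 'mid.txt'),
                os.path.join('a', 'b', 'deep.txt')):
        open(os.path.join(root, rel), 'w').close()

    listing = {}
    for dirname, subdirs, files in os.walk(root):
        # os.walk reports subdirectories and files separately;
        # joining them gives what os.listdir(dirname) would return.
        listing[os.path.relpath(dirname, root)] = sorted(subdirs + files)

for dirname, names in sorted(listing.items()):
    print(dirname, 'has the files', names)
```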
To list all files larger than 1 Mb in the home directory tree, you can use the following code:
def checksize1(arg, dirname, files):
    for file in files:
        filepath = os.path.join(dirname, file)
        if os.path.isfile(filepath):
            size = os.path.getsize(filepath)
            if size > 1000000:
                size_in_Mb = size/1000000.0
                arg.append((size_in_Mb, filepath))
bigfiles = []
root = os.environ['HOME']
os.path.walk(root, checksize1, bigfiles)
for size, name in bigfiles:
    print name, 'has size', size, 'Mb'
Here we use arg to build a data structure: a list of 2-tuples, each holding the size of a file (in Mb) and its full path. Because arg is modified in the function calls for all directories, it must be a mutable data structure that can be changed in place.
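The same search can be sketched for Python 3 with os.walk, where the results are simply collected in a local list and returned instead of being passed through arg. The function name big_files and its size_limit parameter are hypothetical; the demonstration uses a low limit so that small test files are enough.

```python
import os
import tempfile

def big_files(root, size_limit=1000000):
    """Return (size_in_bytes, path) for every regular file above size_limit."""
    found = []
    for dirname, subdirs, files in os.walk(root):
        for name in files:
            filepath = os.path.join(dirname, name)
            if os.path.isfile(filepath):
                size = os.path.getsize(filepath)
                if size > size_limit:
                    found.append((size, filepath))
    return found

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'sub'))
    with open(os.path.join(root, 'small.dat'), 'wb') as f:
        f.write(b'x' * 10)
    with open(os.path.join(root, 'sub', 'large.dat'), 'wb') as f:
        f.write(b'x' * 2000)

    result = [(size, os.path.relpath(path, root))
              for size, path in big_files(root, size_limit=1000)]

print(result)
```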
The argument dirname is the complete path to the directory currently being visited, and the names in files are relative to dirname. During the walk the current working directory does not change, so the script stays in the directory where it was started. That is why we must build filepath by joining dirname and file. To make dirname the current working directory, call os.chdir(dirname) at the beginning of the function that os.path.walk calls for each directory, and call os.chdir again at the end of the function to restore the original working directory:
def somefunc(arg, dirname, files):
    origdir = os.getcwd(); os.chdir(dirname)
    # ... process the files in dirname ...
    os.chdir(origdir)
os.path.walk(root, somefunc, arg)
Of course, you can also write your own code with similar functionality instead of using os.path.walk. The following function calls a user-defined function for each file rather than for each directory:
def find(func, rootdir, arg=None):
    # call func for each file in the rootdir directory tree
    files = os.listdir(rootdir)  # get all the files in the rootdir directory
    files.sort(lambda a, b: cmp(a.lower(), b.lower()))
    for file in files:
        fullpath = os.path.join(rootdir, file)
        if os.path.islink(fullpath):
            pass
        elif os.path.isdir(fullpath):
            find(func, fullpath, arg)
        elif os.path.isfile(fullpath):
            func(fullpath, arg)
        else:
            print 'find: cannot treat', fullpath
The find function above is available in the scitools module. In contrast to the built-in function os.path.walk, our find function visits files and directories in case-insensitive alphabetical order.
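Sorting with a comparison function and cmp works only in Python 2. As a sketch of how find might be written for Python 3, the version below sorts with key=str.lower, which gives the same case-insensitive order. The callback collect and the file names are hypothetical and exist only to record the order of the visits.

```python
import os
import tempfile

def find(func, rootdir, arg=None):
    """Call func(fullpath, arg) for each regular file below rootdir."""
    for name in sorted(os.listdir(rootdir), key=str.lower):
        fullpath = os.path.join(rootdir, name)
        if os.path.islink(fullpath):
            continue                       # skip symbolic links
        if os.path.isdir(fullpath):
            find(func, fullpath, arg)      # recurse into subdirectories
        elif os.path.isfile(fullpath):
            func(fullpath, arg)

def collect(fullpath, names):
    names.append(os.path.basename(fullpath))

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'Beta'))
    for rel in ('alpha.txt', 'Zeta.txt', os.path.join('Beta', 'inner.txt')):
        open(os.path.join(root, rel), 'w').close()
    visited = []
    find(collect, root, visited)

print(visited)
```

Because the names are compared in lower case, the directory Beta is visited between alpha.txt and Zeta.txt.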
We can use the find function to list all files larger than 1 Mb:
def checksize2(fullpath, bigfiles):
    size = os.path.getsize(fullpath)
    if size > 1000000:
        bigfiles.append('%.2fMb %s' % (size/1000000.0, fullpath))
bigfiles = []
root = os.environ['HOME']
find(checksize2, root, bigfiles)
for fileinfo in bigfiles:
    print fileinfo
The argument arg provides a great deal of flexibility. We can use it to hold both input data and the data structure being built. The next example collects the name and size of every file that has one of a set of specified extensions and is larger than a given size. The output is sorted by file size.
bigfiles = {'filelist': [],  # list of file names and sizes
            'extensions': ('.*ps', '.tiff', '.bmp'),
            'size_limit': 1000000,  # 1 Mb
            }
def checksize3(fullpath, arg):
    treat_file = False
    ext = os.path.splitext(fullpath)[1]
    import fnmatch  # Unix shell-style wildcard matching
    for s in arg['extensions']:
        if fnmatch.fnmatch(ext, s):
            treat_file = True  # fullpath has the right extension
    size = os.path.getsize(fullpath)
    if treat_file and size > arg['size_limit']:
        size = '%.2fMb' % (size/1000000.0)  # format for printing
        arg['filelist'].append({'size': size, 'name': fullpath})
find(checksize3, os.environ['HOME'], bigfiles)
# sort the files by size
def filesort(a, b):
    return cmp(float(a['size'][:-2]), float(b['size'][:-2]))
bigfiles['filelist'].sort(filesort)
bigfiles['filelist'].reverse()
for fileinfo in bigfiles['filelist']:
    print fileinfo['name'], fileinfo['size']
Note the function used to sort the list: each element of bigfiles['filelist'] is a dictionary whose key size holds a string, so we must strip the unit Mb (the last two characters) and convert the rest to a floating-point number before comparing.
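In Python 3, list.sort no longer accepts a comparison function, and cmp is gone; the usual replacement is a key function that extracts the value to sort by. The sketch below uses hypothetical entries in the same format as bigfiles['filelist'] and also shows why the string must be converted to a number first.

```python
# Hypothetical entries in the same format as bigfiles['filelist'].
filelist = [
    {'name': 'a.ps', 'size': '1.50Mb'},
    {'name': 'b.bmp', 'size': '12.25Mb'},
    {'name': 'c.tiff', 'size': '3.00Mb'},
]

def size_in_mb(fileinfo):
    """Strip the unit 'Mb' (last two characters) and convert to a float."""
    return float(fileinfo['size'][:-2])

# Sorting the raw strings compares character by character, so '3.00Mb'
# ends up ahead of '12.25Mb'.
names_by_string = [f['name'] for f in
                   sorted(filelist, key=lambda f: f['size'], reverse=True)]

# Sorting by the numeric value gives the intended order, largest first.
filelist.sort(key=size_in_mb, reverse=True)
names_by_size = [f['name'] for f in filelist]

print(names_by_string)
print(names_by_size)
```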
VIII. Summary
File and directory handling can be done with operating system commands, but Python also provides many built-in functions for working with files and directories so that developers can do this work programmatically. Importantly, these functions are used in exactly the same way on Unix, Windows, and Macintosh platforms. This article explained in detail how these functions are used: we first showed how to list the contents of a directory, then described how to test whether a name refers to a regular file, a directory, or a link, and how to extract a file's size and dates. After that, we explained how to delete files and directories, how to copy and rename files, and how to split a complete file path into its directory part and file name part. Finally, we covered creating directories, changing the current directory, and processing the files in a directory tree.