Python Learning: File Operations

Source: Internet
Author: User

1. File Basics

1. A file is a collection of data stored on external media. Its basic unit is the byte, and the number of bytes it contains is the length of the file. Every byte has a position, counted from 0: the start of the file is position 0, and the end of the file is the position immediately after the last byte of content, where no data exists. Reads and writes begin at the current position of the file pointer. A read returns data starting at that position; a write stores data starting at that position and overwrites any content already there.
2. Files are divided into two types according to how their data is stored: text files and binary files. A text file stores ordinary strings organized as lines of text, each usually terminated by the newline character '\n', and only ordinary strings can be read from or written to it. Text files can be viewed and edited with a text editor such as gedit or Notepad. Ordinary strings are strings that a text editor can display and edit normally, such as strings of English letters, Chinese characters, or digits. A binary file stores the in-memory content of an object as a byte string (bytes) and cannot be edited with a text editor.
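The file-pointer behavior described above can be illustrated with a minimal sketch. The helper name pointer_demo and the file name pointer_demo.bin are hypothetical, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile


def pointer_demo():
    """Show that reads and writes start at the file pointer and that writes overwrite."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pointer_demo.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"0123456789")      # the file is now 10 bytes long
        with open(path, "rb+") as f:
            start = f.tell()            # the pointer starts at position 0
            f.seek(3)                   # move the pointer to byte 3
            f.write(b"XY")              # overwrites the bytes at positions 3 and 4
            after_write = f.tell()      # the pointer advanced past the written bytes
            f.seek(0)
            content = f.read()          # read from position 0 to the end
            end = f.tell()              # the end position equals the file length
    return start, after_write, content, end


pointer_result = pointer_demo()
print(pointer_result)
```

Binary mode is used here so that positions correspond exactly to bytes; in text mode the positions reported by tell() are opaque values rather than plain character counts.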

2. Opening or creating a file
Format: file_variable = open(filename[, mode[, buffering]])

filename specifies the file to open.
mode specifies how the file can be processed after it is opened.
buffering sets the buffering policy for reading and writing the file: 0 means unbuffered (allowed only in binary mode in Python 3), 1 means line buffered (text mode), and a value greater than 1 gives the size of the buffer in bytes. The default value is -1, which selects the default buffering policy.
The open() function returns a file object that you use to perform operations on the file.
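As a rough illustration of the buffering argument described above, the following sketch opens the same file with several buffering values and records what Python 3 returns. The helper name buffering_demo and the file name are hypothetical.

```python
import os
import tempfile


def buffering_demo():
    """Open one file with different buffering values and report what happens."""
    report = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "buffer_demo.txt")  # hypothetical file name

        # buffering=0: unbuffered, only allowed in binary mode
        with open(path, "wb", buffering=0) as raw:
            report["unbuffered_type"] = type(raw).__name__
            raw.write(b"first\n")

        # buffering=1: line buffering, meaningful in text mode
        with open(path, "a", buffering=1) as text:
            report["line_buffering"] = text.line_buffering
            text.write("second\n")

        # buffering > 1: size of the buffer in bytes
        with open(path, "rb", buffering=4096) as buffered:
            report["buffered_type"] = type(buffered).__name__
            report["starts_with"] = buffered.read(6)

        # Unbuffered text I/O is rejected
        try:
            open(path, "r", buffering=0)
            report["unbuffered_text"] = "allowed"
        except ValueError:
            report["unbuffered_text"] = "ValueError"
    return report


buffering_report = buffering_demo()
print(buffering_report)
```

In everyday code the buffering argument is usually left at its default; the sketch only makes the documented values visible.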

1. Plain text files

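r:
    - Read only, cannot write
    - Raises an error if the file does not exist
    - Error message: FileNotFoundError: [Errno 2] No such file or directory:xxxxxx
r+:
    - Can read and write
    - Raises an error if the file does not exist
    - By default, writing starts at the current position of the file pointer
w:
    - Write only, cannot read
    - If the file does not exist, no error is raised; the file is created and opened
    - Empties the existing file content
w+:
    - Can read and write
    - If the file does not exist, no error is raised; the file is created and opened
    - Empties the existing file content
a:
    - Write only, cannot read
    - If the file does not exist, no error is raised; the file is created and opened
    - Does not empty the file content
a+:
    - Can read and write
    - If the file does not exist, no error is raised; the file is created and opened
    - Does not empty the file content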
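The differences between these modes can be checked with a short sketch. The helper name mode_demo and the file name modes.txt are hypothetical; the file lives in a temporary directory.

```python
import os
import tempfile


def mode_demo():
    """Exercise the r, w, a, r+ and a+ modes on one temporary file."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "modes.txt")  # hypothetical file name

        # r: the file must already exist
        try:
            open(path, "r")
        except FileNotFoundError:
            results["r_missing"] = "FileNotFoundError"

        with open(path, "w") as f:      # w: creates the file
            f.write("hello world")
        with open(path, "w") as f:      # w: empties the existing content first
            f.write("abc")
        with open(path, "a") as f:      # a: keeps the content, writes at the end
            f.write("def")
        with open(path, "r+") as f:     # r+: writes start at position 0 and overwrite
            f.write("XY")

        with open(path, "a+") as f:     # a+: the pointer starts at the end of the file
            results["a_plus_first_read"] = f.read()
            f.seek(0)
            results["final"] = f.read()
    return results


mode_results = mode_demo()
print(mode_results)
```

Note that a read immediately after opening in a+ mode returns an empty string, because the pointer is already at the end; seek(0) is needed before reading the whole file.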

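2. Binary files: add 'b' to the corresponding text-file open mode

rb:
    - Read only, cannot write
    - Raises an error if the file does not exist
rb+:
    - Can read and write
    - Raises an error if the file does not exist
    - By default, writing starts at the current position of the file pointer
wb:
    - Write only, cannot read
    - If the file does not exist, no error is raised; the file is created and opened
    - Empties the existing file content
wb+:
    - Can read and write
    - If the file does not exist, no error is raised; the file is created and opened
    - Empties the existing file content
ab:
    - Write only, cannot read
    - If the file does not exist, no error is raised; the file is created and opened
    - Does not empty the file content
ab+:
    - Can read and write
    - If the file does not exist, no error is raised; the file is created and opened
    - Does not empty the file content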
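A minimal sketch of binary-mode I/O follows. It assumes the standard struct module as one way to turn in-memory integers into bytes; the helper name binary_demo and the file name numbers.bin are hypothetical.

```python
import os
import struct
import tempfile


def binary_demo():
    """Write three integers as raw bytes, then read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "numbers.bin")  # hypothetical file name
        with open(path, "wb") as f:
            # "<3i" packs three little-endian 32-bit integers (12 bytes in total)
            f.write(struct.pack("<3i", 1, 2, 3))
            # Binary files accept bytes only; writing a str raises TypeError
            try:
                f.write("text")
                str_rejected = False
            except TypeError:
                str_rejected = True
        with open(path, "rb") as f:
            raw = f.read()              # read() returns a bytes object
    return raw, struct.unpack("<3i", raw), str_rejected


raw_bytes, numbers, str_rejected = binary_demo()
print(len(raw_bytes), numbers, str_rejected)
```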
3. Common file methods

First, reading:

f.next()  # Python 2 only; used when a file is iterated. Each pass of a loop calls it to return the next line of the file, and StopIteration is raised at the end of the file (EOF). In Python 3, use the built-in next(f) instead.
f.read([size])  # Reads up to size characters (text mode) or bytes (binary mode) from the file; reads everything if size is omitted or negative.
f.readline([size])  # Reads one whole line from the file, including the "\n" character. If a non-negative size is given, at most that many characters or bytes are returned. Returns an empty string at EOF.
f.readlines([size])  # Reads all lines up to EOF and returns them as a list, which can be processed with a for ... in ... loop. Returns an empty list at EOF. If size > 0 is given, reading stops after roughly that many bytes, which reduces the load of reading a large file.

Second, writing:

f.write(str)  # Writes the given string to the file.
f.writelines(sequence_of_strings)  # Writes a sequence of strings to the file. No newline characters are added.

Third, other operations:

f.flush()  # Flushes the buffer: buffered data is written to the file immediately and the buffer is emptied, instead of waiting passively for the output buffer to be written.
f.seek(offset[, whence])  # Moves the file pointer to the given position. offset is the number of bytes to move. whence is optional and defaults to 0; it defines where the offset is counted from: 0 means the start of the file, 1 means the current position, and 2 means the end of the file.
f.tell()  # Returns the current position of the file pointer.
f.truncate([size])  # Truncates the file. If size is given, the file is truncated to size bytes. If size is omitted, the file is truncated at the current position. Everything after the cut point is deleted.
f.close()  # Closes an open file.
f.closed  # True if the file has been closed, otherwise False.
f.fileno()  # Returns the integer file descriptor (fd), which can be used for low-level operating system I/O.
f.isatty()  # Returns True if the file is connected to a terminal device, otherwise False.

Fourth, two attributes, both still available on file objects returned by open() in Python 3:

f.mode  # Returns the access mode with which the file was opened.
f.name  # Returns the name of the file.
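Several of the methods listed above can be tried together in a small sketch. The helper name methods_demo and the file name methods.txt are hypothetical; the seek and truncate steps use binary mode so that offsets are plain byte counts.

```python
import os
import tempfile


def methods_demo():
    """Exercise writelines, readline, next, readlines, seek, tell and truncate."""
    out = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "methods.txt")  # hypothetical file name

        # writelines() adds no newlines, so each string carries its own "\n"
        with open(path, "w", newline="\n") as f:
            f.writelines(["one\n", "two\n", "three\n"])

        with open(path) as f:
            out["readline"] = f.readline()    # first line, newline included
            out["next"] = next(f)             # the built-in next() returns the next line
            out["readlines"] = f.readlines()  # the remaining lines as a list
            out["at_eof"] = f.readline()      # empty string at end of file

        with open(path, "rb+") as f:
            f.seek(-6, 2)                     # 6 bytes before the end (whence=2)
            out["tell"] = f.tell()
            f.truncate()                      # cut the file at the current position
            f.seek(0)
            out["after_truncate"] = f.read()
        out["closed"] = f.closed              # the with block has closed the file
    return out


methods_result = methods_demo()
print(methods_result)
```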
4. The with context manager

Context manager: opens a file and automatically closes the file object once the body of the with statement has finished executing.

    with open('/tmp/passwd') as f:
        print("Inside the with statement:", f.closed)
        print(f.read())
    print("After the with statement:", f.closed)
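What the with statement guarantees can be sketched in two ways: the file is closed even when the body raises an exception, and the behavior is roughly what a manual try/finally would give. The function names and the file name with_demo.txt are hypothetical.

```python
import os
import tempfile


def closed_after_error(path):
    """Return f.closed after an exception escaped the with block."""
    handle = None
    try:
        with open(path) as f:
            handle = f
            raise RuntimeError("simulated failure inside the with block")
    except RuntimeError:
        pass
    return handle.closed


def manual_equivalent(path):
    """Roughly what with does for a file: close it in a finally clause."""
    f = open(path)
    try:
        data = f.read()
    finally:
        f.close()
    return data, f.closed


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "with_demo.txt")  # hypothetical file name
    with open(demo_path, "w") as out:
        out.write("hello")
    closed_flag = closed_after_error(demo_path)
    manual_result = manual_equivalent(demo_path)

print(closed_flag, manual_result)
```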

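Open two file objects in a single with statement (supported in Python 2.7 and in Python 3.1 or later):

    with open('/tmp/passwd') as f1, open('/tmp/passwdBack', 'w+') as f2:
        # Write the content of the first file into the second file; this is how a file copy works.
        f2.write(f1.read())
        # Move the file pointer back to the start of the file
        f2.seek(0, 0)
        # Read the content from the pointer position
        print(f2.read())

On older Python 2 versions that lack this syntax, use separate with statements:

    with open('/tmp/passwd') as f1:
        content = f1.read()
    with open('/tmp/passwdBack', 'w+') as f2:
        f2.write(content)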
5. Reading large files with yield

    # 1. File operations
    #   1). Create the file data.txt with 100000 lines, each holding an integer between 1 and 100.
    #   2). Find the 10 numbers that occur most often in the file and write them to mostNum.txt.
    import random

    with open('data.txt', mode='a+') as f:
        for i in range(100000):
            f.write(str(random.randint(1, 100)) + '\n')

    # Use yield to read and process one line at a time
    def byLineReader(filename):
        with open(filename) as f:
            line = f.readline()
            # If content could be read, yield that line
            while line:
                yield line
                line = f.readline()

    # read is a generator object
    read = byLineReader('data.txt')
    print(read)

    # 1). Read from the generator with next()
    print(next(read))
    print(next(read))
    ...

    # 2). Read with a for loop
    for item in read:
        print(item)

    # ******** A file object can itself be iterated with a for loop; it yields one line at a time, which saves memory.
    from collections.abc import Iterable

    f = open('data.txt')
    print(isinstance(f, Iterable))
    for i, item in enumerate(f):
        if i == 10:
            break
        print(i, item)
    f.close()
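The second part of the task stated above (finding the 10 most frequent numbers and writing them to mostNum.txt) is not shown in the listing. One possible way to finish it, assuming collections.Counter is acceptable for the counting, is sketched below. The function names by_line_reader and top_numbers are hypothetical, the output format of one "number count" pair per line is an assumption, and the sample file is written to a temporary directory with fewer lines than in the task to keep the run short.

```python
import os
import random
import tempfile
from collections import Counter


def by_line_reader(filename):
    """Yield the file one line at a time so that it is never loaded whole."""
    with open(filename) as f:
        for line in f:
            yield line


def top_numbers(filename, n=10):
    """Return the n most common integers in the file as (number, count) pairs."""
    counts = Counter(int(line) for line in by_line_reader(filename) if line.strip())
    return counts.most_common(n)


with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "data.txt")
    result_path = os.path.join(tmp, "mostNum.txt")

    rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
    with open(data_path, "w") as f:
        for _ in range(10000):  # smaller than the task's file, for speed
            f.write(f"{rng.randint(1, 100)}\n")

    top10 = top_numbers(data_path)
    with open(result_path, "w") as f:
        for number, count in top10:
            f.write(f"{number} {count}\n")
    with open(result_path) as f:
        written_lines = f.read().splitlines()

print(top10)
```

Counter keeps at most 100 entries here, so memory use stays small no matter how many lines the data file has.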
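6. The os module

The os module provides a rich set of methods for working with files and directories.

    import os

    # 1). Return the operating system type: 'posix' means Linux (or another Unix-like system), 'nt' means Windows
    print(os.name)
    print('Linux' if os.name == 'posix' else 'Windows')

    # 2). Detailed operating system information (os.uname() is available on Unix-like systems only)
    info = os.uname()
    print(info)
    print(info.sysname)
    print(info.nodename)

    # 3). System environment variables
    print(os.environ)

    # 4). Get the value of an environment variable by its key
    print(os.environ.get('PATH'))
    print(os.getenv('PATH'))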
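Because os.uname() does not exist on Windows and os.name is 'posix' on macOS as well as Linux, a more portable variant of the listing above might fall back on the standard platform module. This is only a sketch; the function name describe_system is hypothetical.

```python
import os
import platform


def describe_system():
    """Collect basic system information without assuming a Unix-like platform."""
    info = {
        "os_name": os.name,              # 'posix' or 'nt'
        "system": platform.system(),     # e.g. 'Linux', 'Darwin' or 'Windows'
    }
    # os.uname() is only defined on Unix-like systems
    if hasattr(os, "uname"):
        info["nodename"] = os.uname().nodename
    else:
        info["nodename"] = platform.node()
    # PATH entries are separated by os.pathsep (':' on POSIX, ';' on Windows)
    info["path_entries"] = os.environ.get("PATH", "").split(os.pathsep)
    return info


system_info = describe_system()
print(system_info["os_name"], system_info["system"])
```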

Iterate through all content in the specified directory

    import os
    from os.path import join

    for root, dirs, files in os.walk('/var/log'):
        # print(root, dirs, files)
        for name in files:
            print(join(root, name))
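The same traversal can be turned into a small reusable helper, for example to collect only the files with a given suffix. The sketch below builds its own temporary directory tree instead of relying on /var/log; the function name find_files and the sample file names are hypothetical.

```python
import os
import tempfile


def find_files(top, suffix):
    """Return the paths under top whose names end with suffix."""
    matches = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place makes the traversal order predictable
        for name in sorted(files):
            if name.endswith(suffix):
                matches.append(os.path.join(root, name))
    return matches


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny tree: a.log, sub/b.log and sub/c.txt
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.log", os.path.join("sub", "b.log"), os.path.join("sub", "c.txt")):
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("x")
    found = [os.path.relpath(p, tmp) for p in find_files(tmp, ".log")]

print(found)
```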

Common os module methods:

os.access(path, mode)  Checks access to path for the given permission mode.
os.chdir(path)  Changes the current working directory.
os.chflags(path, flags)  Sets the flags of path to the numeric flags.
os.chmod(path, mode)  Changes permissions.
os.chown(path, uid, gid)  Changes the file owner.
os.chroot(path)  Changes the root directory of the current process.
os.close(fd)  Closes file descriptor fd.
os.closerange(fd_low, fd_high)  Closes all file descriptors from fd_low (inclusive) to fd_high (exclusive), ignoring errors.
os.dup(fd)  Duplicates file descriptor fd.
os.dup2(fd, fd2)  Duplicates file descriptor fd to fd2.
os.fchdir(fd)  Changes the current working directory through a file descriptor.
os.fchmod(fd, mode)  Changes the access permissions of the file specified by fd; mode is the file access permission as used on UNIX.
os.fchown(fd, uid, gid)  Changes the ownership of the file specified by file descriptor fd, setting its user ID and group ID.
os.fdatasync(fd)  Forces the file specified by file descriptor fd to be written to disk, without forcing an update of the file's status information.
os.fdopen(fd[, mode[, bufsize]])  Creates a file object from file descriptor fd and returns it.
os.fpathconf(fd, name)  Returns system configuration information for an open file. name is the configuration value to retrieve; it may be a string naming a defined system value, as specified in a number of standards (POSIX.1, Unix 95, Unix 98, and others).
os.fstat(fd)  Returns the status of file descriptor fd, like stat().
os.fstatvfs(fd)  Returns information about the file system containing the file associated with file descriptor fd, like statvfs().
os.fsync(fd)  Forces the file associated with file descriptor fd to be written to disk.
os.ftruncate(fd, length)  Truncates the file associated with file descriptor fd so that it is at most length bytes in size.
os.getcwd()  Returns the current working directory.
os.getcwdu()  Returns a Unicode object for the current working directory (Python 2 only).
os.isatty(fd)  Returns True if file descriptor fd is open and connected to a tty(-like) device, otherwise False.
os.lchflags(path, flags)  Sets the flags of path to the numeric flags, like chflags(), but does not follow symbolic links.
os.lchmod(path, mode)  Changes the permissions of a link file.
os.lchown(path, uid, gid)  Changes the file owner, like chown(), but does not follow links.
os.link(src, dst)  Creates a hard link named dst that points to src.
os.listdir(path)  Returns a list of the names of the files and folders contained in the folder given by path.
os.lseek(fd, pos, how)  Sets the current position of file descriptor fd to pos, interpreted according to how: os.SEEK_SET or 0 counts from the start of the file; os.SEEK_CUR or 1 counts from the current position; os.SEEK_END or 2 counts from the end of the file. Available on Unix and Windows.
os.lstat(path)  Like stat(), but does not follow symbolic links.
os.major(device)  Extracts the device major number from a raw device number (the st_dev or st_rdev field from stat).
os.makedev(major, minor)  Composes a raw device number from the major and minor device numbers.
os.makedirs(path[, mode])  Recursive folder creation function. Like mkdir(), but creates all intermediate-level folders needed to contain the leaf folder.
os.minor(device)  Extracts the device minor number from a raw device number (the st_dev or st_rdev field from stat).
os.mkdir(path[, mode])  Creates a folder named path with the numeric mode mode. The default mode is 0o777 (octal).
os.mkfifo(path[, mode])  Creates a named pipe; mode is numeric and defaults to 0o666 (octal).
os.mknod(filename[, mode=0o600, device])  Creates a file system node (file, device special file, or named pipe) named filename.
os.open(file, flags[, mode])  Opens a file and sets the required open flags; the mode parameter is optional.
os.openpty()  Opens a new pseudo-terminal pair and returns the file descriptors for the pty and the tty.
os.pathconf(path, name)  Returns system configuration information for the named file.
os.pipe()  Creates a pipe. Returns a pair of file descriptors (r, w) for reading and writing.
os.popen(command[, mode[, bufsize]])  Opens a pipe to or from a command.
os.read(fd, n)  Reads at most n bytes from file descriptor fd and returns a byte string containing them. If the end of the file has been reached, an empty byte string is returned.
os.readlink(path)  Returns the file that the symbolic link points to.
os.remove(path)  Deletes the file at path. If path is a folder, OSError is raised; see rmdir() below to remove a directory.
os.removedirs(path)  Removes directories recursively.
os.rename(src, dst)  Renames a file or directory from src to dst.
os.renames(old, new)  Recursively renames directories; it can also rename files.
os.rmdir(path)  Removes the empty directory given by path; raises OSError if the directory is not empty.
os.stat(path)  Gets information about the path given by path; functionally equivalent to the stat() system call in the C API.
os.stat_float_times([newvalue])  Determines whether stat_result represents timestamps as float objects (removed in Python 3.7).
os.statvfs(path)  Gets file system statistics for the given path.
os.symlink(src, dst)  Creates a symbolic link.
os.tcgetpgrp(fd)  Returns the process group associated with the terminal given by fd (an open file descriptor as returned by os.open()).
os.tcsetpgrp(fd, pg)  Sets the process group associated with the terminal given by fd (an open file descriptor as returned by os.open()) to pg.
os.tempnam([dir[, prefix]])  Removed in Python 3. Returns a unique path name for creating a temporary file.
os.tmpfile()  Removed in Python 3. Returns a file object opened in mode (w+b). This file object has no directory entry and no file descriptor, and is deleted automatically.
os.tmpnam()  Removed in Python 3. Returns a unique path for creating a temporary file.
os.ttyname(fd)  Returns a string naming the terminal device associated with file descriptor fd. If fd is not associated with a terminal device, an exception is raised.
os.unlink(path)  Deletes the file at path.
os.utime(path, times)  Sets the access and modification times of the file at path.
os.walk(top[, topdown=True[, onerror=None[, followlinks=False]]])  Generates the file names in a directory tree by walking the tree either top-down or bottom-up.
os.write(fd, str)  Writes the string to file descriptor fd. Returns the number of bytes actually written.
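A few of the functions in this list can be combined in a short sketch that works on file descriptors and paths directly. The helper name os_file_demo and the file names are hypothetical, and everything happens inside a temporary directory.

```python
import os
import tempfile


def os_file_demo():
    """Use descriptor-level and path-level os functions on a temporary tree."""
    out = {}
    with tempfile.TemporaryDirectory() as tmp:
        nested = os.path.join(tmp, "a", "b")
        os.makedirs(nested)                       # creates the intermediate folder "a" too

        path = os.path.join(nested, "demo.txt")   # hypothetical file name
        fd = os.open(path, os.O_RDWR | os.O_CREAT)
        out["written"] = os.write(fd, b"hello os")  # number of bytes written
        os.lseek(fd, 0, os.SEEK_SET)              # back to the start of the file
        out["read"] = os.read(fd, 5)              # at most 5 bytes
        out["size"] = os.fstat(fd).st_size        # status via the descriptor
        os.close(fd)

        renamed = os.path.join(nested, "renamed.txt")
        os.rename(path, renamed)
        out["listing"] = os.listdir(nested)

        os.remove(renamed)                        # remove() deletes files
        os.rmdir(nested)                          # rmdir() needs an empty directory
        out["remaining"] = os.listdir(os.path.join(tmp, "a"))
    return out


os_demo_result = os_file_demo()
print(os_demo_result)
```

For everyday file handling the built-in open() is usually more convenient; the descriptor-level functions are mainly useful when low-level control is needed.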

From: http://www.runoob.com/python3/python3-os-file-methods.html

8. The sys module

    import sys

    # Returns a list whose first element is the name of the current script
    print(sys.argv)
    print(sys.argv[0])
    # To get the n-th argument passed to the script, use sys.argv[n]
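Since sys.argv is an ordinary list, it can be split into the script name and the remaining arguments. The sketch below does this in a function so that it can also be tried with a hand-written list; the function name describe_args and the example values are hypothetical.

```python
import sys


def describe_args(argv):
    """Split an argv-style list into the script name and its parameters."""
    script, *params = argv
    return {"script": script, "params": params, "count": len(params)}


# With the real command line
print(describe_args(sys.argv))

# With a hand-written list standing in for: python rename.py img .png .jpg
example_args = describe_args(["rename.py", "img", ".png", ".jpg"])
print(example_args)
```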

Batch renaming of files

    # Create the directory img, randomly generate 100 files ending in .png in it,
    # then rename every file ending in .png so that it ends in .jpg
    import os, random, string, sys

    # Create the directory and randomly generate the .png files
    os.mkdir('img')
    for i in range(100):
        os.mknod('img/' + ''.join(random.sample(string.ascii_letters + string.digits, 4)) + '.png')

    def modify_suffix(dirname, old_suffix, new_suffix):
        if not os.path.exists(dirname):
            print('directory does not exist!')
            sys.exit()
        old_suffix_file_list = list(filter(lambda x: x.endswith(old_suffix), os.listdir(dirname)))
        # String method:
        # new_suffix_file_list = []
        # for i in old_suffix_file_list:
        #     new_suffix_file_list.append(i.replace(old_suffix, new_suffix))
        # print(new_suffix_file_list)
        # for i, j in zip(old_suffix_file_list, new_suffix_file_list):
        #     print(dirname + i, dirname + j)
        #     os.rename(dirname + '/' + i, dirname + '/' + j)
        # File operation method:
        file_name = [os.path.splitext(name)[0] for name in old_suffix_file_list]
        for i in file_name:
            os.rename(dirname + '/' + i + old_suffix, dirname + '/' + i + new_suffix)

    modify_suffix('img', '.png', '.jpg')
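The same renaming can also be written with the standard pathlib module, which avoids manual string concatenation and replaces os.mknod (which may need extra privileges on some systems) with Path.touch. This is an alternative sketch, not the listing's original approach; it reuses the name modify_suffix for comparison, and the sample file names are made up.

```python
import tempfile
from pathlib import Path


def modify_suffix(dirname, old_suffix, new_suffix):
    """Rename every file in dirname that ends in old_suffix to use new_suffix."""
    directory = Path(dirname)
    if not directory.is_dir():
        raise FileNotFoundError(f"directory does not exist: {dirname}")
    renamed = []
    # sorted() builds the full list first, so renaming does not disturb the iteration
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.suffix == old_suffix:
            target = path.with_suffix(new_suffix)
            path.rename(target)
            renamed.append(target.name)
    return renamed


with tempfile.TemporaryDirectory() as tmp:
    # Fixed sample names keep the sketch repeatable
    for name in ("a1B2.png", "c3D4.png", "notes.txt"):
        (Path(tmp) / name).touch()
    renamed_files = modify_suffix(tmp, ".png", ".jpg")
    remaining_files = sorted(p.name for p in Path(tmp).iterdir())

print(renamed_files, remaining_files)
```

Comparing path.suffix with old_suffix matches only the final extension, whereas the string replace approach in the commented-out code would also change an occurrence of the suffix elsewhere in the name.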
