Python: File Processing

Last Update:2018-06-18 Source: Internet

Author: User


1. File processing

f = open(file="file01.txt", mode="r", encoding="utf-8")  # UTF-8 is the default encoding in Python 3
data = f.read()
print(data)
print(type(data))  # <class 'str'>
f.close()

If you get an error like this:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcc in position 0: invalid continuation byte

it means the file was not stored in the encoding you passed to open().

As a rule, a file should be read with the same encoding it was stored in: a file saved as GB2312 should be read as GB2312, and a file saved as UTF-8 should be read as UTF-8.
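The mismatch described above can be reproduced in a few lines. This is a minimal sketch rather than the article's own code: the file name `demo_gb2312.txt`, the temporary directory, and the sample text are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary directory for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo_gb2312.txt")
text = "将进酒"

# Store the text as GB2312.
with open(path, mode="w", encoding="gb2312") as f:
    f.write(text)

# Reading with a different encoding fails to decode the stored bytes.
try:
    with open(path, mode="r", encoding="utf-8") as f:
        f.read()
    wrong_encoding_failed = False
except UnicodeDecodeError:
    wrong_encoding_failed = True

# Reading with the encoding the file was stored in works.
with open(path, mode="r", encoding="gb2312") as f:
    round_trip = f.read()

print(wrong_encoding_failed, round_trip == text)
```

Passing `encoding` explicitly on both the write and the read keeps the two sides in agreement regardless of the platform default.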

2. File processing: binary mode

Files are stored on the computer as binary data, so we can ignore the encoding and read the contents of the file directly in binary form.

f = open(file="file01.txt", mode="rb")
data = f.read()
print(data)
f.close()

Opening the file with mode="rb" reads it in binary mode, so the result is a bytes object rather than a str.

Binary mode is what you use when reading videos, images, and similar content, and when sending data over the network.

# Printed result:
b'\xef\xbb\xbf\xe7\x94\xb0\xe7\xbb\xb4\xe9\x80\x9a\t12001\t10\r\n\xe5\xbc\xa0\xe5\xae\xb6\xe9\x93\xad\t12002\t11\r\n\xe8\x88\x92\xe5\xa8\x85\t12003\t12\r\n\xe5\xad\x99\xe7\x8e\x89\xe5\x80\xa9\t12004\t13\r\n\xe5\xbc\xa0\xe8\xb6\x85\t12005\t14\r\n\xe7\x8e\x8b\xe4\xba\xac\t12006\t15\r\n\xe5\xbb\x96\xe6\x9e\x97\xe8\x8b\xb1\t12007\t16\r\n\xe5\xbe\x90\xe6\x99\x93\xe8\x8e\x89\t12008\t17\r\n\xe9\x87\x91\xe5\x98\x89\xe7\xa5\xba\t12009\t18\r\n\xe5\x8f\x8a\xe6\xa0\xbc\t12010\t19\r\n\xe4\xba\x8e\xe5\x87\xaf\xe9\x98\xb3\t12011\t20\r\n\xe6\x9d\x8e\xe4\xbf\x8a\xe7\xba\xa2\t12012\t21\r\n\xe5\x88\x98\xe5\x86\xac\t12013\t22\r\n'
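The leading `\xef\xbb\xbf` in a result like this is the UTF-8 byte order mark (BOM). The sketch below recreates the first record under the assumption that the file was saved as UTF-8 with a BOM; the file name `demo_bom.txt` is hypothetical.

```python
import codecs
import os
import tempfile

# Hypothetical sample file, created in a temporary directory for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo_bom.txt")

# "utf-8-sig" writes a BOM before the text; newline="" keeps "\r\n" as written.
with open(path, mode="w", encoding="utf-8-sig", newline="") as f:
    f.write("田维通\t12001\t10\r\n")

# Binary mode returns the raw bytes with no decoding at all.
with open(path, mode="rb") as f:
    data = f.read()

print(type(data))                        # <class 'bytes'>
print(data.startswith(codecs.BOM_UTF8))  # True
```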

3. File processing: a tool for detecting the encoding automatically

import chardet

f = open('file01.txt', mode="rb")
data = f.read()
f.close()
print(chardet.detect(data))
# Printed result:
# {'encoding': 'UTF-8-SIG', 'confidence': 1.0, 'language': ''}
# confidence is the probability that the encoding is UTF-8-SIG, here 1.0

Once we know the encoding of the target file, we can call data.decode("utf-8") to turn the bytes into the text we need.
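When the detected encoding is UTF-8-SIG, the choice of codec name affects the result slightly. The following sketch hard-codes the first record of the binary output shown earlier instead of calling chardet, so it runs without third-party packages; the variable names are illustrative only.

```python
# First record of the file, as raw bytes, including the UTF-8 BOM.
data = b"\xef\xbb\xbf\xe7\x94\xb0\xe7\xbb\xb4\xe9\x80\x9a\t12001\t10\r\n"

# Plain "utf-8" decodes the BOM into the character U+FEFF at the start.
plain = data.decode("utf-8")

# "utf-8-sig" recognises the BOM and strips it.
clean = data.decode("utf-8-sig")

print(repr(plain[0]))  # '\ufeff'
print(repr(clean))
```

If the stray U+FEFF matters (for example when comparing the first field of the first line), decoding with the name the detector reported avoids it.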

4. Reading a file line by line in a loop

f = open('file01.txt', 'r', encoding='utf-8')
for line in f:
    print(line, end="")  # end="" sets what print() appends; here it suppresses print's default newline \n
f.close()

Printed result:

# 田维通   12001   10
# 张家铭   12002   11
# 舒娅    12003   12
# 孙玉倩   12004   13
# 张超    12005   14
# 王京    12006   15
# 廖林英   12007   16
# 徐晓莉   12008   17
...
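The same loop is often written with a `with` block, which closes the file automatically. The sketch below also splits each tab-separated line into fields. The column names `code` and `value` are guesses, since the article does not say what the columns mean, and the sample file is created only for illustration.

```python
import os
import tempfile

# Hypothetical sample file with two of the records shown above.
path = os.path.join(tempfile.mkdtemp(), "file01.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("田维通\t12001\t10\n张家铭\t12002\t11\n")

rows = []
with open(path, "r", encoding="utf-8") as f:
    for line in f:
        # Each line still ends with "\n"; strip it before splitting on tabs.
        name, code, value = line.rstrip("\n").split("\t")
        rows.append((name, code, int(value)))

print(rows)
```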

5. Writing files

# Create a file in GBK encoding and write the text "将进酒"
f = open(file='file02.txt', mode='w', encoding='gbk')
f.write('将进酒')
f.close()

If you then write to the same file again in the same way,

f = open(file='file02.txt', mode='w', encoding='gbk')
f.write('杯莫停')
f.close()

the result is that the original content of file02.txt has been overwritten.
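The overwrite can be checked directly. This sketch uses a temporary copy of `file02.txt` so that it does not touch real files. It also shows mode `"x"`, which is not covered in the article: Python's exclusive-creation mode, which refuses to open a file that already exists.

```python
import os
import tempfile

# Hypothetical location for file02.txt, used for illustration only.
path = os.path.join(tempfile.mkdtemp(), "file02.txt")

with open(path, mode="w", encoding="gbk") as f:
    f.write("将进酒")

# Opening with "w" again truncates the file before writing.
with open(path, mode="w", encoding="gbk") as f:
    f.write("杯莫停")

with open(path, mode="r", encoding="gbk") as f:
    content = f.read()

# Mode "x" raises FileExistsError instead of overwriting.
try:
    with open(path, mode="x", encoding="gbk") as f:
        f.write("never written")
    refused = False
except FileExistsError:
    refused = True

print(content, refused)
```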

6. Writing files: append

In append mode, new content is written after the existing content.

f = open('file02.txt', 'ab')  # mode "ab" or "a" means append
f.write('\n人生得意须尽欢'.encode('gbk'))
f.close()
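Appending also works in text mode, where `open()` does the encoding instead of an explicit `.encode()` call. A minimal sketch, assuming a GBK file in a temporary directory:

```python
import os
import tempfile

# Hypothetical location for file02.txt, used for illustration only.
path = os.path.join(tempfile.mkdtemp(), "file02.txt")

with open(path, "w", encoding="gbk") as f:
    f.write("将进酒")

# Text-mode append: the string is encoded to GBK by the file object.
with open(path, "a", encoding="gbk") as f:
    f.write("\n人生得意须尽欢")

with open(path, "r", encoding="gbk") as f:
    content = f.read()

print(content)
```

The encoding passed in append mode should match the one the file was created with, otherwise the file ends up containing mixed encodings.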

7. File processing: mixed read and write operations

f = open('file02.txt', 'r+', encoding='gbk')
data = f.read()
print("content:", data)
f.write("\n锄禾日当午")
f.write("\n汗滴禾下土")
f.write("\n离离原上草")
f.write("\n一岁一枯荣")
f.close()

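Result: the existing content is printed, and the four new lines are added at the end of the file, because read() leaves the file position at the end.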
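Where the write lands in "r+" mode depends on the current file position. The sketch below, which uses a temporary file and sample text chosen for illustration, contrasts writing after a full read with writing immediately after opening.

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary directory for illustration.
path = os.path.join(tempfile.mkdtemp(), "file02.txt")
with open(path, "w", encoding="gbk") as f:
    f.write("锄禾日当午")

# After read() the position is at the end, so the write appends.
with open(path, "r+", encoding="gbk") as f:
    data = f.read()
    f.write("\n汗滴禾下土")

# Without reading first, the position is 0, so the write overwrites
# the first two characters (4 bytes in GBK) in place.
with open(path, "r+", encoding="gbk") as f:
    f.write("春眠")

with open(path, "r", encoding="gbk") as f:
    final = f.read()

print(final)
```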

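8. Other file methods

(1) flush()

f = open('f_flush.txt', 'w', encoding='utf-8')
f.write('奇门遁甲')
# Before f.close(), the written content is still in memory, and the txt file may still be empty,
# so you can add f.flush() to force the in-memory buffer to be written out to disk.
# Normally the buffer is written to disk automatically when it fills up,
# but f.flush() lets you force the write at a point you choose.
f.close()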
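The effect of flush() can be observed by checking the file size before and after the call. This is a sketch under the assumption of default buffering; the size before the flush is usually 0 for a write this small, but that is not guaranteed. Note that flush() hands the data to the operating system, and `os.fsync()` is the additional call that asks the operating system to write it to the storage device.

```python
import os
import tempfile

# Hypothetical location for f_flush.txt, used for illustration only.
path = os.path.join(tempfile.mkdtemp(), "f_flush.txt")

f = open(path, "w", encoding="utf-8")
f.write("奇门遁甲")
size_before = os.path.getsize(path)  # typically 0: the text is still buffered

f.flush()                            # push Python's buffer to the operating system
os.fsync(f.fileno())                 # ask the operating system to write to disk
size_after = os.path.getsize(path)   # 4 characters x 3 bytes each in UTF-8

f.close()
print(size_before, size_after)
```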

(2) tell() and seek()

# File contents: hello world!
>>> f = open('file03.txt', 'r', encoding='gbk')
>>> f.tell()  # returns the current position of the file cursor
0
>>> f.seek(1)  # move the file cursor to position 1
1
>>> f.read()
'ello world!'

Note: tell() and seek() work in bytes, so positions and lengths are counted in bytes, not characters. The number of bytes per character also depends on the encoding: in GBK a Chinese character takes 2 bytes, while in UTF-8 it takes 3 bytes.

Take Chinese text as an example:

# File contents: 技高一筹
>>> f = open("file03.txt", 'r', encoding='gbk')
>>> f.read()
'技高一筹'
>>> f.tell()
8
>>> f.seek(0)  # move the file cursor to the start, position 0
0
>>> f.seek(4)  # move the file cursor to 4; in GBK each Chinese character takes 2 bytes, so the cursor is now between "技高" and "一筹"
4
>>> f.read()  # so the result is the last two characters
'一筹'
>>> # -----------------------------------------------------------------------------
>>> f.seek(1)  # move the file cursor to 1, the middle of the character "技"; reading now fails, because only part of that character's bytes is left to decode
1
>>> f.tell()
1
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'gbk' codec can't decode byte 0xef in position 6: incomplete multibyte sequence
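The byte counts behind these positions can be checked without the interactive session. A minimal sketch, assuming a temporary GBK file containing the same four characters:

```python
import os
import tempfile

text = "技高一筹"
gbk_len = len(text.encode("gbk"))     # 2 bytes per character
utf8_len = len(text.encode("utf-8"))  # 3 bytes per character

# Hypothetical location for file03.txt, used for illustration only.
path = os.path.join(tempfile.mkdtemp(), "file03.txt")
with open(path, "w", encoding="gbk") as f:
    f.write(text)

with open(path, "r", encoding="gbk") as f:
    f.seek(4)          # byte offset 4 is the boundary after two characters
    tail = f.read()

    f.seek(1)          # byte offset 1 is inside the first character
    try:
        f.read()
        misaligned_failed = False
    except UnicodeDecodeError:
        misaligned_failed = True

print(gbk_len, utf8_len, tail, misaligned_failed)
```

Because of this, offsets passed to seek() in a text file are safest when they come from an earlier tell() call or are known character boundaries.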

(3) seekable(), readable(), writable()

seekable() determines whether the file supports seek(), readable() determines whether the file is readable, and writable() determines whether the file is writable.

(4) truncate() truncates the file to a specified length

# File contents: 技高一筹
>>> f = open("file03.txt", 'r+', encoding='gbk')
>>> f.seek(2)
2
>>> f.truncate()
2
>>> # the file now contains only the character "技"

truncate(4) specifies a length: the file is cut to that many bytes, counted from the beginning of the file. If no length is given, everything from the current position to the end of the file is removed.

# File contents: 技高一筹
>>> f = open("file03.txt", 'r+', encoding='gbk')
>>> f.tell()
0
>>> f.truncate(4)  # the file now contains only the two characters "技高"
4
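Both forms of truncate() can be run back to back as a script. This is a sketch with a temporary GBK file standing in for `file03.txt`; lengths are in bytes, so 4 keeps two GBK characters and 2 keeps one.

```python
import os
import tempfile

# Hypothetical location for file03.txt, used for illustration only.
path = os.path.join(tempfile.mkdtemp(), "file03.txt")
with open(path, "w", encoding="gbk") as f:
    f.write("技高一筹")

# Explicit length: keep the first 4 bytes regardless of the cursor position.
with open(path, "r+", encoding="gbk") as f:
    new_size = f.truncate(4)
with open(path, "r", encoding="gbk") as f:
    after_explicit = f.read()

# No argument: cut at the current cursor position.
with open(path, "r+", encoding="gbk") as f:
    f.seek(2)
    f.truncate()
with open(path, "r", encoding="gbk") as f:
    after_current = f.read()

print(new_size, after_explicit, after_current)
```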

9. Modifying the contents of a file

import os

f = open('file05.txt', 'r+', encoding='gbk')  # open file05.txt
f_new = open('file05_new.txt', 'w', encoding='gbk')  # create a new file
old_str = 'John Doe'
new_str = 'Li Yunlong'
for line in f:  # read line by line
    if old_str in line:
        line = line.replace(old_str, new_str)  # modify the content
    f_new.write(line)  # write each line to the newly created file
f.close()
f_new.close()
os.replace('file05_new.txt', 'file05.txt')  # replace the original file, which has the effect of modifying its contents

On Windows this works with os.replace(), but not with os.rename(), which fails with an error saying that file05.txt already exists and cannot be created.
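The same read, replace, and swap steps can be wrapped in a small function that uses `with` blocks so both files are closed before the swap. This is a sketch rather than the article's code: the helper name `replace_in_file`, the `.new` suffix, and the sample data are all assumptions.

```python
import os
import tempfile

def replace_in_file(path, old_str, new_str, encoding="gbk"):
    """Rewrite `path` with every occurrence of old_str replaced by new_str."""
    tmp_path = path + ".new"  # hypothetical name for the temporary copy
    with open(path, "r", encoding=encoding) as src, \
         open(tmp_path, "w", encoding=encoding) as dst:
        for line in src:
            dst.write(line.replace(old_str, new_str))
    # Both files are closed at this point, so the swap is safe on Windows too.
    os.replace(tmp_path, path)

# Usage with a throwaway file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "file05.txt")
with open(demo_path, "w", encoding="gbk") as f:
    f.write("John Doe\t12001\nsomeone else\t12002\n")

replace_in_file(demo_path, "John Doe", "Li Yunlong")

with open(demo_path, "r", encoding="gbk") as f:
    result = f.read()
print(result)
```

Writing to a separate file and swapping it in at the end means the original stays intact if the loop fails partway through.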


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
