How to read and write text files?
Actual case
The encoding of a text file is known (such as UTF-8, GBK, or BIG5). How do we read and write such files in Python 2.x and Python 3.x respectively?
Solution
Python 2 and Python 3 must be handled differently, because the semantics of the string types have changed:
Python 2 | Python 3
str | bytes
unicode | str
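As a minimal sketch of the Python 3 column of this table (the sample text '你好' and the variable names are assumptions chosen for illustration), encoding turns a str into bytes, and decoding turns bytes back into a str:

```python
# Python 3: str holds text, bytes holds encoded data.
text = '你好'                      # str (this type was called unicode in Python 2)

gbk_data = text.encode('gbk')      # str -> bytes
utf8_data = text.encode('utf-8')   # same text, different byte sequence

print(type(text).__name__, type(gbk_data).__name__)  # str bytes
print(len(gbk_data), len(utf8_data))                 # 4 6

# Decoding must use the same encoding the bytes were produced with.
gbk_roundtrip = gbk_data.decode('gbk')
utf8_roundtrip = utf8_data.decode('utf-8')
```

The same two characters occupy 4 bytes in GBK and 6 bytes in UTF-8, which is why the encoding has to be known before a file can be read correctly.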
In Python 2.x, encode the unicode string before writing it to the file. After reading the file, decode the byte string that was read.
>>> f = open('py2.txt', 'w')
>>> s = u'你好'
>>> f.write(s.encode('gbk'))
>>> f.close()
>>> f = open('py2.txt', 'r')
>>> t = f.read()
>>> print t.decode('gbk')
你好
In Python 3.x, include t in the mode argument of the open function to select text mode, and use the encoding argument to specify the encoding.
>>> f = open('py3.txt', 'wt', encoding='utf-8')
>>> f.write('你好')
2
>>> f.close()
>>> f = open('py3.txt', 'rt', encoding='utf-8')
>>> s = f.read()
>>> s
'你好'
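The Python 3 approach can be wrapped in two small helpers. This is only a sketch: the names write_text and read_text, the temporary directory, and the choice of GBK are assumptions made for illustration, not part of the original example.

```python
import os
import tempfile

def write_text(path, text, encoding):
    """Write text to path in text mode using the given encoding."""
    with open(path, 'wt', encoding=encoding) as f:
        return f.write(text)   # number of characters written

def read_text(path, encoding):
    """Read the whole file back as str, decoding with the given encoding."""
    with open(path, 'rt', encoding=encoding) as f:
        return f.read()

workdir = tempfile.mkdtemp()              # scratch directory for the demo
path = os.path.join(workdir, 'py3.txt')

written = write_text(path, '你好', 'gbk')
roundtrip = read_text(path, 'gbk')

# Opening in binary mode shows the bytes that were actually stored.
with open(path, 'rb') as f:
    raw = f.read()
```

Using with blocks closes the file even if an exception is raised, so the explicit f.close() calls are not needed.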
How to set file buffering?
Actual case
Writing file content to a hard disk device requires system calls, and this kind of I/O operation is slow. To reduce the number of I/O operations, files usually use a buffer (the system call is made only when enough data has accumulated). File buffering behavior falls into three kinds: full buffering, line buffering, and no buffering.
How to set the buffer for file objects in Python?
Solution
Full buffering: set the buffering argument of the open function to an integer n greater than 1, where n is the buffer size in bytes.
>>> f = open('demo2.txt', 'w', buffering=2048)
>>> f.write('+' * 1024)
>>> f.write('+' * 1023)
>>> # the data is written to the file once it exceeds 2048 bytes
>>> f.write('-' * 2)
>>> f.close()
Line buffering: set the buffering argument of the open function to 1.
>>> f = open('demo3.txt', 'w', buffering=1)
>>> f.write('ABC')
>>> f.write('20140901')
>>> # the data is written to the file as soon as a \n is written
>>> f.write('\n')
>>> f.close()
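No buffering: set the buffering argument of the open function to 0. In Python 3 this is only allowed for files opened in binary mode; the session here is Python 2.
>>> f = open('demo4.txt', 'w', buffering=0)
>>> f.write('a')
>>> f.write('b')
>>> f.close()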
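The three buffering modes can be observed by checking the file size on disk between writes. The following is a sketch for Python 3 on CPython; the temporary directory, the .bin file names, and the use of binary mode for the full and unbuffered cases are assumptions made so that the behavior is easy to observe.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()   # scratch directory for the demo files

# Full buffering (binary mode): data reaches the file only when the
# 2048-byte buffer cannot hold the next write.
full_path = os.path.join(workdir, 'demo2.bin')
f = open(full_path, 'wb', buffering=2048)
f.write(b'+' * 1024)
f.write(b'+' * 1023)
size_before = os.path.getsize(full_path)   # 2047 bytes are still in the buffer
f.write(b'-' * 2)
size_after = os.path.getsize(full_path)    # the buffer has been flushed to disk
f.close()

# Line buffering (text mode): writing a newline triggers the flush.
line_path = os.path.join(workdir, 'demo3.txt')
f = open(line_path, 'w', buffering=1)
f.write('abc')
line_before = os.path.getsize(line_path)
f.write('\n')
line_after = os.path.getsize(line_path)
f.close()

# No buffering (binary mode only in Python 3): each write goes straight out.
raw_path = os.path.join(workdir, 'demo4.bin')
f = open(raw_path, 'wb', buffering=0)
f.write(b'a')
raw_size = os.path.getsize(raw_path)
f.close()

print(size_before, size_after > 0, line_before, line_after > 0, raw_size)
```

In every mode, close() flushes whatever is still buffered, so the final file contents are the same; only the moment at which the data reaches the disk differs.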
How to map files to memory?
Actual case
When accessing certain binary files (for example, framebuffer device files), we want to map the file into memory so that it can be accessed randomly.
Some embedded devices place their registers in the memory address space. We can map /dev/mem to access these registers.
If multiple processes map the same file, the mapping can also be used for inter-process communication.
Solution
Use the mmap() function of the mmap module in the standard library. It requires the file descriptor of an open file as a parameter.
Create the following file:
[root@pythontab.com ~]# dd if=/dev/zero of=demo.bin bs=1024 count=1024
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.00380084 s, 276 MB/s
# View file content in hexadecimal format
[root@pythontab.com ~]# od -x demo.bin
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
4000000
>>> import mmap
>>> import os
>>> f = open('demo.bin', 'r+b')
>>> # obtain the file descriptor
>>> f.fileno()
3
>>> m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
>>> type(m)
<type 'mmap.mmap'>
>>> # content can be read by index
>>> m[0]
'\x00'
>>> m[]
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> # modify content
>>> m[0] = '\x88'
View
[root@pythontab.com ~]# od -x demo.bin
0000000 0088 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
4000000
Modify slice
>>> m[4:8] = '\xff' * 4
View
[root@pythontab.com ~]# od -x demo.bin
0000000 0088 0000 ffff ffff 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
4000000
>>> # map only part of the file: 8 pages, starting at an offset of 4 pages
>>> m = mmap.mmap(f.fileno(), mmap.PAGESIZE * 8, access=mmap.ACCESS_WRITE, offset=mmap.PAGESIZE * 4)
>>> m[:0x1000] = '\xaa' * 0x1000
View
[root@pythontab.com ~]# od -x demo.bin
0000000 0088 0000 ffff ffff 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
0040000 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa
*
0050000 0000 0000 0000 0000 0000 0000 0000 0000
*
4000000
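The mmap session above uses Python 2 string semantics. A Python 3 version might look like the following sketch, where indexing a mapping yields an int and slices are bytes. The temporary file, its size of two allocation units, and the 16-byte fill are assumptions chosen to keep the example small.

```python
import mmap
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'demo.bin')

# Create a zero-filled file spanning two allocation units, so that the
# second unit can be mapped on its own with a valid offset.
unit = mmap.ALLOCATIONGRANULARITY
with open(path, 'wb') as f:
    f.write(b'\x00' * (unit * 2))

with open(path, 'r+b') as f:
    # A length of 0 maps the whole file.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
        first = m[0]            # indexing returns an int in Python 3
        m[0] = 0x88             # so a single item is assigned as an int
        m[4:8] = b'\xff' * 4    # slices are read and assigned as bytes
        m.flush()

    # Map only the second unit; offset must be a multiple of the granularity.
    with mmap.mmap(f.fileno(), unit, access=mmap.ACCESS_WRITE, offset=unit) as m2:
        m2[:16] = b'\xaa' * 16
        m2.flush()

# Read the file back normally to confirm the changes reached it.
with open(path, 'rb') as f:
    data = f.read()
```

A slice assignment must have exactly the same length as the slice it replaces, because a mapping cannot grow or shrink the underlying file.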
How to access the file status?
Actual case
In some projects, we need to obtain the file status, for example:
File type (regular file, directory, symbolic link, device file ...)
File access permissions
Time of the last file access/modification/inode status change
Size of a regular file
.....
Solution
The current directory contains the following files:
[root@pythontab.com 2017]# ll
total 4
drwxr-xr-x 2 root root 4096 Sep 16 11:35 dirs
-rw-r--r-- 1 root root 0 Sep 16 11:35 files
lrwxrwxrwx 1 root root 37 Sep 16 11:36 lockfile -> /tmp/qtsingleapp-aegisG-46d2-lockfile
System calls
The os module in the standard library provides three system calls, stat, fstat, and lstat, to obtain the file status.
>>> import os
>>> s = os.stat('files')
>>> s
posix.stat_result(st_mode=33188, st_ino=267646, st_dev=51713L, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1486197100, st_mtime=1486197100, st_ctime=1486197100)
>>> s.st_mode
33188
>>> import stat
>>> # stat has many S_IS* functions to determine the file type
>>> stat.S_ISDIR(s.st_mode)
False
>>> # regular file
>>> stat.S_ISREG(s.st_mode)
True
Get the file access permissions. A result greater than 0 means that the permission is set.
>>> s.st_mode & stat.S_IRUSR
256
>>> s.st_mode & stat.S_IXGRP
0
>>> s.st_mode & stat.S_IXOTH
0
Get the file times
>>> # access time
>>> s.st_atime
1486197100.3384446
>>> # modification time
>>> s.st_mtime
1486197100.3384446
>>> # status change time
>>> s.st_ctime
1486197100.3384446
Convert the obtained timestamp
>>> import time
>>> time.localtime(s.st_atime)
time.struct_time(tm_year=2016, tm_mon=9, tm_mday=16, tm_hour=11, tm_min=35, tm_sec=47, tm_wday=4, tm_yday=260, tm_isdst=0)
Get the size of a common file
>>> s.st_size
0
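The individual checks above can be combined into one helper. This is a sketch for Python 3: the function name describe, the returned dictionary keys, and the temporary test file are all assumptions for illustration. It uses os.lstat so that a symbolic link is reported as a link rather than as its target.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Summarize type, permissions, size and mtime without following symlinks."""
    st = os.lstat(path)   # lstat reports on the link itself, stat on its target
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        kind = 'symlink'
    elif stat.S_ISDIR(mode):
        kind = 'directory'
    elif stat.S_ISREG(mode):
        kind = 'regular file'
    else:
        kind = 'other'
    return {
        'kind': kind,
        'permissions': stat.filemode(mode),            # ls-style string
        'owner_can_read': bool(mode & stat.S_IRUSR),   # non-zero means set
        'size': st.st_size,
        'modified': time.strftime('%Y-%m-%d %H:%M:%S',
                                  time.localtime(st.st_mtime)),
    }

workdir = tempfile.mkdtemp()
file_path = os.path.join(workdir, 'files')
with open(file_path, 'w') as f:
    f.write('hello')

file_info = describe(file_path)
dir_info = describe(workdir)
print(file_info)
print(dir_info)
```

os.fstat, the third call mentioned, takes an open file descriptor instead of a path, for example os.fstat(f.fileno()).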
Shortcut functions
Some functions in os.path in the standard library are more concise to use.
File type determination
>>> os.path.isdir('dirs')
True
>>> os.path.islink('lockfile')
True
>>> os.path.isfile('files')
True
File time
>>> os.path.getatime('files')
1486197100.3384445
>>> os.path.getmtime('files')
1486197100.3384445
>>> os.path.getctime('files')
1486197100.3384445
Get file size
>>> os.path.getsize('files')
0
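As a small sketch of how these shortcut functions combine, the helper below lists a directory with the type and size of each entry. The name list_entries and the temporary directory layout (a dirs directory and an empty files file, mirroring the listing earlier) are assumptions for illustration.

```python
import os
import tempfile

def list_entries(directory):
    """Return (name, kind, size) for each entry, sorted by name."""
    entries = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):      # check links first: isdir/isfile follow them
            kind = 'link'
        elif os.path.isdir(path):
            kind = 'dir'
        elif os.path.isfile(path):
            kind = 'file'
        else:
            kind = 'other'
        # Only report a size for regular files.
        size = os.path.getsize(path) if kind == 'file' else None
        entries.append((name, kind, size))
    return entries

workdir = tempfile.mkdtemp()
os.mkdir(os.path.join(workdir, 'dirs'))
open(os.path.join(workdir, 'files'), 'w').close()   # empty regular file

entries = list_entries(workdir)
print(entries)
```

os.path.islink is tested before the others because os.path.isdir and os.path.isfile follow symbolic links and would report the type of the link target.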
How to use temporary files?
Actual case
In a project, we collect data from sensors. After every 1 GB of data is collected, we analyze it and save only the analysis results. Keeping such a large amount of temporary data in memory would consume a lot of memory, so we can store it in temporary files (external storage) instead.
Temporary files do not need to be named, and are deleted automatically after being closed.
Solution
Use TemporaryFile and NamedTemporaryFile from the tempfile module in the standard library.
>>> from tempfile import TemporaryFile, NamedTemporaryFile
>>> # the temporary file can only be accessed through the object f
>>> f = TemporaryFile()
>>> f.write('abcdef' * 100000)
>>> # access the temporary data
>>> f.seek(0)
>>> f.read(100)
'abcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcd'
>>> ntf = NamedTemporaryFile()
>>> # the file is deleted when the NamedTemporaryFile object is closed; to keep it, use NamedTemporaryFile(delete=False)
>>> # return the path of the temporary file in the file system
>>> ntf.name
'/tmp/tmppnvna6'
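In Python 3 the same two classes work as context managers, and because they open in binary mode by default, the data written must be bytes. A minimal sketch follows; the sample data and the variable names are assumptions for illustration.

```python
import os
from tempfile import TemporaryFile, NamedTemporaryFile

# TemporaryFile opens in binary mode ('w+b') by default, so write bytes.
with TemporaryFile() as f:
    f.write(b'abcdef' * 100000)
    f.seek(0)               # rewind before reading the data back
    head = f.read(10)
# The anonymous temporary file no longer exists at this point.

# NamedTemporaryFile has a visible path; delete=False keeps the file after close.
with NamedTemporaryFile(delete=False) as ntf:
    ntf.write(b'analysis result')
    kept_path = ntf.name

exists_after_close = os.path.exists(kept_path)
os.remove(kept_path)        # clean up explicitly, since auto-delete was disabled
```

With delete=False the caller is responsible for removing the file, which is why the sketch ends with os.remove.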