Tips on using Python for efficient file I/O operations

Last Update:2017-05-14 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

How to read and write text files? In actual cases, the encoding format of a text file has been straight (such as UTF-8, GBK, BIG5), how to read these files in python2x and python3x respectively? Solution: differentiate how to read and write text files?

Actual case

A text file encoding format has been straight (such as UTF-8, GBK, BIG5), respectively in python2.x and python3.x how to read these files?

Solution

Distinguish between python2 and python3.

The semantics of the string has changed:

Python2	Python3
Str	Bytes
Unicode	Str

Python2.x is unicode encoded before being written to a file. after reading the file, the binary string is decoded.

>>> F = open('py2.txt ', 'w') >>> s = u'' >>> f. write (s. encode ('gbk') >>> f. close () >>> f = open('py2.txt ', 'r') >>> t = f. read () >>> print t. decode ('gbk ')

Hi!

In python3.x, the open function specifies the text mode of t, and encoding specifies the encoding format.

>>> F = open('py3.txt ', 'WT', encoding = 'utf-8') >>> f. write (' ') 2 >>> f. close () >>> f = open('py3.txt ', 'RT', encoding = 'utf-8') >>> s = f. read () >>> s 'Hello'

How to set file Buffering

Actual case

When writing file content to a hard disk device, you can use the system call. This type of I/O operation takes a long time. to reduce the number of I/O operations, files usually use a buffer (with enough data for system calling). The Cache behavior of files can be divided into full buffering, row caching, and no buffering.

How to set the buffer for file objects in Python?

Solution

Full Buffer: the buffering of the open function is set to an integer n greater than 1, and n is the buffer size.

>>> F = open('demo2.txt ', 'W', buffering = 2048) >>> f. write ('+' * 1024) >>> f. write ('+' * 1023) # write a file when it is greater than 2048> f. write ('-' * 2) >>> f. close ()

Row buffer: set buffering of the open function to 1.

>>> F = open('demo3.txt ', 'W', buffering = 1) >>> f. write ('ABC') >>> f. write ('20140901') # write to the file as long as \ n is added >>> f. write ('\ n') >>> f. close ()

No buffer: set buffering of the open function to 0.

>>> f = open('demo4.txt', 'w', buffering=0)>>> f.write('a')>>> f.write('b')>>> f.close()

How to map files to memory?

Actual case

When accessing some binary files, you can map the files to the memory for random access. (framebuffer device files)

Some embedded devices include registers in the memory address space. we can map/dev/mem to access these registers.

If multiple processes are mapped to the same file, process communication can also be achieved.

Solution

Using the mmap () function of the mmap module in the standard library, it requires an open file descriptor as a parameter

Create the following file

Root@pythontab.com ~ # Dd if =/dev/zero of = demo. bin bs = 1024 count = 10241024 + 0 records in1024 + 0 records out1048576 bytes (1.0 MB) copied, 0.00380084 s, 276 MB/s # View file content in hexadecimal format [root@pythontab.com ~] # Od-x demo. bin 0000000 0000 0000 0000 0000 0000 0000 0000 0000*4000000

>>> Import mmap >>> import OS >>> f = open ('demo. bin', 'R + B ') # obtain the file descriptor> f. fileno () 3 >>> m = mmap. mmap (f. fileno (), 0, access = mmap. ACCESS_WRITE) >>> type (m)
 
  
# You can obtain content through indexes> m [0] '\ x00'> m [] '\ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00 \ x00' # modify content> m [0] = '\ x88'

View

[root@pythontab.com ~]# od -x demo.bin 0000000 0088 0000 0000 0000 0000 0000 0000 00000000020 0000 0000 0000 0000 0000 0000 0000 0000*4000000

Modify slice

>>> m[4:8] = '\xff' * 4

View

[root@pythontab.com ~]# od -x demo.bin 0000000 0088 0000 ffff ffff 0000 0000 0000 00000000020 0000 0000 0000 0000 0000 0000 0000 0000*4000000

>>> m = mmap.mmap(f.fileno(),mmap.PAGESIZE * 8,access=mmap.ACCESS_WRITE,offset=mmap.PAGESIZE * 4) >>> m[:0x1000] = '\xaa' * 0x1000

View

[root@pythontab.com ~]# od -x demo.bin 0000000 0088 0000 ffff ffff 0000 0000 0000 00000000020 0000 0000 0000 0000 0000 0000 0000 0000*0040000 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa*0050000 0000 0000 0000 0000 0000 0000 0000 0000*4000000

How to access the file status?

Actual case

In some projects, we need to obtain the file status, for example:

File type (common file, directory, symbolic link, device file ...)

File access permission

Last file access/modification/node status change time

Size of common files

.....

Solution

The current directory contains the following files:

[root@pythontab.com 2017]# lltotal 4drwxr-xr-x 2 root root 4096 Sep 16 11:35 dirs-rw-r--r-- 1 root root 0 Sep 16 11:35 fileslrwxrwxrwx 1 root root 37 Sep 16 11:36 lockfile -> /tmp/qtsingleapp-aegisG-46d2-lockfile

System call

The three systems in the OS module in the standard library call stat, fstat, and lstat to obtain the file status.

>>> Import OS >>> s = OS. stat ('Files') >>> sposix. stat_result (st_mode = 33188, st_ino = 267646, st_dev = 51713L, st_nlink = 1, st_uid = 0, st_gid = 0, st_size = 0, st_atime = 1486197100, st_mtime = 1486197100, st_ctime = 1486197100) >>> s. st_mode33188 >>> import stat # stat has many S_IS .. method to determine the file type> stat. s_ISDIR (s. st_mode) False # common file >>> stat. s_ISREG (s. st_mode) True

Obtain the object access permission. if it is greater than 0, it is true.

>>> s.st_mode & stat.S_IRUSR256>>> s.st_mode & stat.S_IXGRP0>>> s.st_mode & stat.S_IXOTH0

Get the file modification time

# Access time> s. st_atime1486197100.3384446 # Modification time> s. st_mtime1486197100.3384446 # Status Update Time> s. st_ctime1486197100.3384446

Convert the obtained timestamp

>>> import time>>> time.localtime(s.st_atime)time.struct_time(tm_year=2016, tm_mon=9, tm_mday=16, tm_hour=11, tm_min=35, tm_sec=47, tm_wday=4, tm_yday=260, tm_isdst=0)

Get the size of a common file

>>> s.st_size0

Shortcut functions

Some functions under OS. path in the standard library are more concise to use.

File type determination

>>> os.path.isdir('dirs') True>>> os.path.islink('lockfile')True>>> os.path.isfile('files') True

File time

>>> os.path.getatime('files')1486197100.3384445>>> os.path.getmtime('files')1486197100.3384445>>> os.path.getctime('files')1486197100.3384445

Get file size

>>> os.path.getsize('files') 0

How to use temporary files?

Actual case

In a project, we collect data from sensors. after each 1 GB of data is collected, we perform data analysis and only save the analysis results. if a large amount of temporary data is stored in the memory, this will consume a lot of memory resources. we can use temporary files to store these temporary data (external storage)

Temporary files do not need to be named, and will be deleted automatically after being closed

Solution

Use the TemporaryFile under the tempfile in the standard library, NamedTemporaryFile

>>> From tempfile import TemporaryFile, NamedTemporaryFile # The object f can only be used for access >>> f = TemporaryFile () >>> f. write ('abcdef '* 100000) # access temporary data> f. seek (0) >>> f. read (100) 'delete' >>> ntf = NamedTemporaryFile () # If you want to keep the file from being deleted every time you create a NamedTemporaryFile () object, you can set NamedTemporaryFile (delete = False) >>> ntf. name # return the path of the current temporary file in the file system '/tmp/tmppnvna6'

The above is a detailed description of how to use the file I/O efficient operations and processing techniques in Python. For more information, see other related articles in the first PHP community!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Tips on using Python for efficient file I/O operations

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Tips on using Python for efficient file I/O operations

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support