Python File IO

Source: Internet
Author: User
Tags: readline

I. IO, synchronous IO, and asynchronous IO

IO means input/output in computing. Programs and their runtime data reside in memory and are executed by the CPU, an extremely fast compute core. Wherever data has to be exchanged with something else, usually a disk or a network, an IO interface is required.

For example, when you open a browser and visit the Sina homepage, the browser has to fetch Sina's web page over network IO. The browser first sends data to the Sina server to say that it wants the homepage HTML; this sends data outward and is called output. The Sina server then sends the web page back; this receives data from outside and is called input. So a program typically completes an IO operation with two data streams, input and output. There are also cases with only one: reading a file from disk into memory is only an input operation, and, conversely, writing data to a disk file is only an output operation.

In IO programming, the stream is a very important concept. You can think of a stream as a water pipe and the data as the water in it, which can flow in only one direction. An input stream is data flowing from the outside (disk, network) into memory, and an output stream is data flowing from memory to the outside. For web browsing, the browser and the Sina server need to establish at least two pipes so that each side can both send and receive data.

Because the CPU and memory are much faster than peripherals, IO programming has a serious speed mismatch problem. For example, to write 100 MB of data to disk, the CPU may need only 0.01 seconds to output the data, while the disk may take 10 seconds to receive it. What can be done? There are two approaches:

The first is for the CPU to wait: the program suspends execution of the subsequent code until the 100 MB of data has been written to disk 10 seconds later, and only then continues. This mode is called synchronous IO.

The other is for the CPU not to wait: it just tells the disk, "Take your time writing, no rush, I'll go do something else," and the subsequent code can execute immediately. This mode is called asynchronous IO.

The difference between synchronous and asynchronous IO is whether the program waits for the result of the IO operation. Suppose you go to McDonald's and say "One hamburger, please." The waiter tells you that the hamburger has to be made fresh and will take 5 minutes, so you stand at the counter for 5 minutes, get your hamburger, and then go shopping in the mall. This is synchronous IO.

Alternatively, you say "One hamburger, please," and the waiter tells you that the hamburger will take 5 minutes, that you can go to the mall in the meantime, and that you will be notified when it is ready. You can immediately go and do other things (shopping in the mall). This is asynchronous IO.

A program written with asynchronous IO can perform much better than one written with synchronous IO, because the CPU does useful work instead of waiting. The drawback of asynchronous IO is its more complex programming model. You need to know when the "burger is ready" notification arrives, and there are different ways of being notified. If the waiter comes to find you, this is callback mode. If the waiter asks you to check for a text message and you have to keep looking at your phone, this is polling mode. In short, asynchronous IO is considerably more complex than synchronous IO.

The ability to perform IO is provided by the operating system. Every programming language wraps the low-level C interface provided by the operating system to make it easier to use, and Python is no exception. We'll discuss Python's IO programming interface in detail later.
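The three styles described above (waiting, callback, polling) can be sketched with a thread pool. This is only an illustration under stated assumptions: make_hamburger is a hypothetical stand-in that sleeps instead of doing real IO, and a worker thread simulates the slow device. It is not how the operating system implements asynchronous IO.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def make_hamburger():
    """Hypothetical slow operation; sleeps instead of doing real IO."""
    time.sleep(0.2)
    return "hamburger"


# Synchronous: the caller blocks until the result is ready.
sync_result = make_hamburger()

events = []
with ThreadPoolExecutor(max_workers=1) as pool:
    # Callback mode: register a function that is called when the work is done.
    future = pool.submit(make_hamburger)
    future.add_done_callback(lambda f: events.append("callback: " + f.result()))

    # Polling mode: keep checking whether the work is done,
    # doing something else between checks.
    polls = 0
    while not future.done():
        polls += 1
        time.sleep(0.01)  # "go shopping" for a moment
    polled_result = future.result()

print(sync_result, polled_result, events, "polls:", polls)
```

In the synchronous call nothing else happens for the whole wait, whereas in the other two cases the main thread is free to run other code while the hamburger is being made.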

II. The open function

Syntax: file_object = open(file_name [, access_mode] [, buffering])

The details of each parameter are as follows:

    • file_name: a string containing the name of the file you want to access.
    • access_mode: determines the mode in which the file is opened: read-only, write, append, and so on. All possible values are listed in the table below. This parameter is optional; the default access mode is read-only (r).
    • buffering: if set to 0, no buffering takes place (in Python 3 this is allowed only in binary mode). If set to 1, the file is line buffered (text mode only). If set to an integer greater than 1, that value is used as the buffer size. If negative, the system default buffer size is used.
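A minimal sketch of the buffering values described above, assuming Python 3. The file name buffering_demo.txt and the temporary directory are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "buffering_demo.txt")

# buffering=0: unbuffered; Python 3 allows this only in binary mode.
with open(path, "wb", buffering=0) as raw:
    raw.write(b"unbuffered\n")
    unbuffered_type = type(raw).__name__

# buffering=1: line buffering, which applies to text mode.
with open(path, "w", buffering=1) as line_buffered:
    is_line_buffered = line_buffered.line_buffering

# buffering > 1: the value is used as the buffer size in bytes.
with open(path, "wb", buffering=4096) as sized:
    sized_type = type(sized).__name__

# Text mode with buffering=0 is rejected with ValueError.
try:
    open(path, "w", buffering=0)
    text_unbuffered_ok = True
except ValueError:
    text_unbuffered_ok = False

print(unbuffered_type, is_line_buffered, sized_type, text_unbuffered_ok)
```

Omitting the argument (or passing a negative value) leaves the choice of buffer size to Python and the operating system, which is usually what you want.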

Full list of file opening modes:

Mode  Description
r     Opens a file as read-only. The file pointer is placed at the beginning of the file. This is the default mode.
rb    Opens a file in binary format as read-only. The file pointer is placed at the beginning of the file.
r+    Opens a file for reading and writing. The file pointer is placed at the beginning of the file.
rb+   Opens a file in binary format for reading and writing. The file pointer is placed at the beginning of the file.
w     Opens a file for writing only. If the file already exists, it is overwritten. If the file does not exist, a new file is created.
wb    Opens a file in binary format for writing only. If the file already exists, it is overwritten. If the file does not exist, a new file is created.
w+    Opens a file for reading and writing. If the file already exists, it is overwritten. If the file does not exist, a new file is created.
wb+   Opens a file in binary format for reading and writing. If the file already exists, it is overwritten. If the file does not exist, a new file is created.
a     Opens a file for appending. If the file already exists, the file pointer is placed at the end of the file, so new content is written after the existing content. If the file does not exist, a new file is created for writing.
ab    Opens a file in binary format for appending. If the file already exists, the file pointer is placed at the end of the file, so new content is written after the existing content. If the file does not exist, a new file is created for writing.
a+    Opens a file for reading and writing. If the file already exists, the file pointer is placed at the end of the file and the file is opened in append mode. If the file does not exist, a new file is created for reading and writing.
ab+   Opens a file in binary format for reading and appending. If the file already exists, the file pointer is placed at the end of the file. If the file does not exist, a new file is created for reading and writing.
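The difference between the w, a, and r+ modes in the table is easiest to see by running them against the same file. The sketch below uses a made-up file name in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as f:    # w: creates the file (or overwrites it)
    f.write("first\n")

with open(path, "a") as f:    # a: writes after the existing content
    f.write("second\n")

with open(path, "r") as f:    # r: read-only, pointer at the beginning
    after_append = f.read()

with open(path, "r+") as f:   # r+: read-write, pointer at the beginning,
    f.write("FIRST")          # so this overwrites the first five characters

with open(path, "r") as f:
    after_r_plus = f.read()

with open(path, "w") as f:    # w again: the existing content is discarded
    pass

size_after_w = os.path.getsize(path)
print(repr(after_append), repr(after_r_plus), size_after_w)
```

Note that r+ does not truncate the file, while w and w+ always do, even if nothing is written afterwards.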

Character encoding:

To read a text file that is not UTF-8 encoded, pass the encoding parameter to the open() function. With some files you may also encounter a UnicodeDecodeError, because the text file may contain illegally encoded characters. For this case, the open() function also accepts an errors parameter that specifies what to do when an encoding error is encountered. The simplest approach is to ignore such errors:

    f = open('/users/michael/gbk.txt', 'r', encoding='gbk', errors='ignore')
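To make the effect of encoding and errors concrete, the following sketch writes a GBK file and reads it back three ways. The sample text and the file name gbk_demo.txt are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "gbk_demo.txt")

# Hypothetical sample: two Chinese characters followed by ASCII text.
sample = "\u4e2d\u6587 abc"

with open(path, "w", encoding="gbk") as f:
    f.write(sample)

# Reading with the matching encoding returns the original text.
with open(path, "r", encoding="gbk") as f:
    decoded_ok = f.read()

# Reading as UTF-8 with the default errors='strict' raises UnicodeDecodeError,
# because the GBK bytes of the Chinese characters are not valid UTF-8.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='ignore' silently drops the bytes that cannot be decoded.
with open(path, "r", encoding="utf-8", errors="ignore") as f:
    ignored = f.read()

print(repr(decoded_ok), strict_failed, repr(ignored))
```

Ignoring errors avoids the exception but loses data, so it is best kept for cases where the damaged characters really do not matter.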

III. File object properties

The values below are for a text file named test.txt opened in 'w' mode.

Name            Value
buffer          <_io.BufferedWriter name='test.txt'>
closed          False
encoding        'UTF-8'
errors          'strict'
line_buffering  False
mode            'w'
name            'test.txt'
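The attributes listed above can be inspected directly on an open file object. A short sketch, assuming a made-up file in a temporary directory and an explicit encoding so that the result does not depend on the platform default:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")

f = open(path, "w", encoding="utf-8")
attrs = {
    "buffer": type(f.buffer).__name__,   # the underlying binary buffer
    "closed": f.closed,
    "encoding": f.encoding,
    "errors": f.errors,
    "line_buffering": f.line_buffering,
    "mode": f.mode,
    "name": os.path.basename(f.name),
}
f.close()
closed_after_close = f.closed            # becomes True once close() is called

for key, value in attrs.items():
    print(key, repr(value))
```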

IV. File object methods

Name        Description
close       <function TextIOWrapper.close>
detach      <function TextIOWrapper.detach>
fileno      <function TextIOWrapper.fileno>
flush       <function TextIOWrapper.flush>
isatty      <function TextIOWrapper.isatty>
read        <function TextIOWrapper.read>
readable    <function TextIOWrapper.readable>
readline    <function TextIOWrapper.readline>
readlines   <function TextIOWrapper.readlines>
seek        <function TextIOWrapper.seek>
seekable    <function TextIOWrapper.seekable>
tell        <function TextIOWrapper.tell>
truncate    <function TextIOWrapper.truncate>
writable    <function TextIOWrapper.writable>
write       <function TextIOWrapper.write>
writelines  <function TextIOWrapper.writelines>
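A few of the less obvious methods in the list can be tried out as follows. This is a small sketch with a made-up file name; the remaining methods are covered in the sections on reading, writing, and positioning.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "methods_demo.txt")

with open(path, "w") as f:
    # Capability checks: a file opened with 'w' can be written and
    # repositioned, but not read, and it is not a terminal.
    capabilities = (f.readable(), f.writable(), f.seekable(), f.isatty())

    # writelines() writes each string as-is; it does not add newlines.
    f.writelines(["alpha\n", "beta\n"])

    # fileno() returns the integer file descriptor used by the OS.
    descriptor = f.fileno()

with open(path, "r") as f:
    lines = f.readlines()

print(capabilities, isinstance(descriptor, int), lines)
```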

V. The close method

The close() method does two things: (1) it flushes any data in the buffer that has not yet been written, and (2) it closes the file.

    fd = open("foo.txt", "wb")
    fd.close()

Using the with statement removes the need to call close() explicitly:

    with open('/path/to/file', 'r') as f:
        pass
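What the with statement does can be checked through the closed attribute. The sketch below, using a made-up file name, shows that the file is closed when the block ends, that this is roughly equivalent to try/finally, and that it also holds when the block raises an exception.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# The file is closed automatically when the with block ends.
with open(path, "w") as f:
    f.write("hello")
closed_after_with = f.closed

# Roughly the same thing written by hand with try/finally.
f2 = open(path, "r")
try:
    content = f2.read()
finally:
    f2.close()

# The file is closed even if the block raises an exception.
try:
    with open(path, "r") as f3:
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
closed_after_error = f3.closed

print(closed_after_with, content, closed_after_error)
```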

VI. The read method

Syntax: fileObject.read([count])

Calling read() reads the entire contents of the file at once. If the file is 10 GB, memory will be exhausted, so to be safe you can call read(size) repeatedly, which reads at most size bytes each time (size characters for a file opened in text mode). In addition, readline() reads one line at a time, and readlines() reads all the contents at once and returns them as a list of lines. Choose the call that fits your needs.

If the file is small, a single read() is the most convenient. If the file size cannot be determined, calling read(size) repeatedly is safer. For a configuration file, readlines() is the most convenient.
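The reading strategies above might look like this in practice. The helper read_in_chunks and the file name are made up for the example, and the chunk size is deliberately tiny so that several chunks are produced.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
content = "line 1\nline 2\nline 3\n"
with open(path, "w") as f:
    f.write(content)


def read_in_chunks(file_path, size=8):
    """Read a text file piece by piece with read(size); hypothetical helper."""
    chunks = []
    with open(file_path, "r") as f:
        while True:
            chunk = f.read(size)
            if not chunk:        # an empty string means end of file
                break
            chunks.append(chunk)
    return chunks


chunks = read_in_chunks(path)

with open(path, "r") as f:
    first_line = f.readline()    # one line, including its trailing newline
    remaining = f.readlines()    # the rest of the file as a list of lines

# Iterating over the file object also yields one line at a time
# without loading the whole file into memory.
with open(path, "r") as f:
    stripped = [line.strip() for line in f]

print(chunks, repr(first_line), remaining, stripped)
```

For large files a realistic chunk size would be in the range of kilobytes or megabytes rather than 8 characters.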

VII. The write method

Syntax: fileObject.write(string)

You can call write() repeatedly to write to the file, but be sure to call f.close() to close it afterwards. When we write to a file, the operating system often does not write the data to disk immediately; instead it keeps the data in an in-memory cache and writes it out later when idle. Only when close() is called is all unwritten data guaranteed to be handed over for writing to disk. If you forget to call close(), only part of the data may have been written to disk and the rest may be lost. So it is safer to use the with statement.

To write a text file in a specific encoding, pass the encoding parameter to the open() function; strings are then converted to the specified encoding automatically.
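The buffering described above can be observed by checking the file size before and after the data is flushed. This is a sketch with a made-up file name; the size seen before flushing depends on the buffer settings, and with default buffering a small write typically has not reached the file yet.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

f = open(path, "w")
f.write("pending")

# With default buffering this small write is usually still held in memory.
size_before_flush = os.path.getsize(path)

# flush() hands the buffered data to the operating system;
# close() does the same and then releases the file.
f.flush()
size_after_flush = os.path.getsize(path)
f.close()

print(size_before_flush, size_after_flush)
```

flush() and close() pass the data to the operating system, which may still cache it for a while. When the data must physically be on disk, os.fsync(f.fileno()) can be called after flush().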

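    with open('test.txt', 'w', encoding='gbk') as fd:
        fd.write('where are you from, man? ')

    # Read back with the platform default encoding
    with open('test.txt', 'r') as fd:
        print(fd.read())

    # Read back as UTF-8, ignoring bytes that cannot be decoded
    with open('test.txt', 'r', encoding='utf-8', errors='ignore') as fd:
        print(fd.read())

Output:

    where are you from, man?
    ?

The garbled second line appears only when the written text contains non-ASCII characters: their GBK bytes are not valid UTF-8, so errors='ignore' drops them. A pure ASCII string such as the one shown here is encoded identically in GBK and UTF-8 and reads back unchanged.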

VIII. File cursor positioning: the tell and seek functions

The tell() method returns the current position within the file; in other words, the next read or write will happen that many bytes from the beginning of the file.

The seek(offset [, from]) method changes the current file position. The offset argument is the number of bytes to move. The from argument specifies the reference position from which the bytes are counted.

If from is 0, the beginning of the file is used as the reference position. If it is 1, the current position is used. If it is 2, the end of the file is used. In Python 3, files opened in text mode support only offset 0 when from is 1 or 2; arbitrary relative seeks require binary mode.

    fd.tell()
    fd.seek(0, 0)
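A short sketch of tell() and seek() with all three reference positions. The file is opened in binary mode so that seeking relative to the current position and to the end is allowed; the file name and contents are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "rb") as f:
    start = f.tell()           # position right after opening
    first = f.read(3)          # reads bytes 0..2
    after_read = f.tell()      # reading advanced the position

    f.seek(2, 1)               # from=1: move 2 bytes forward from here
    from_current = f.read(1)

    f.seek(-2, 2)              # from=2: 2 bytes before the end of the file
    from_end = f.read()

    f.seek(0, 0)               # from=0: back to the beginning
    rewound = f.tell()

print(start, first, after_read, from_current, from_end, rewound)
```

The constants os.SEEK_SET, os.SEEK_CUR, and os.SEEK_END can be used in place of 0, 1, and 2 to make the intent clearer.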
