When does Python close a file? An analysis of the file-closing mechanism

Source: Internet
Author: User

This article analyzes when Python closes a file that a script has not closed explicitly. The author covers Python 2.x and Python 3.x separately. Readers who need this may find it a useful reference.

If you don't use "with", when will Python close the file? The answer is: it depends.

One of the first things Python programmers learn is that it is easy to iterate over the lines of an open file:

    f = open('/etc/passwd')
    for line in f:
        print(line)

Note that the above code works because our file object "f" is an iterator. In other words, "f" knows what to do in a loop or in any other iteration context, such as a list comprehension.
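As a small illustration of what "iterator" means here, the sketch below checks that a file object is its own iterator and can feed a list comprehension. The scratch file and the function name demo_file_iteration are made up for this example, and it assumes Python 3.

```python
import os
import tempfile


def demo_file_iteration():
    """Show that a file object is its own iterator."""
    # Hypothetical scratch file so the sketch does not depend on /etc/passwd.
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as out:
        out.write("first\nsecond\nthird\n")

    f = open(path)
    try:
        is_own_iterator = iter(f) is f        # no separate iterator object is needed
        first = next(f)                       # consume one line by hand
        rest = [line.strip() for line in f]   # a comprehension takes the remainder
    finally:
        f.close()
        os.remove(path)
    return is_own_iterator, first, rest


print(demo_file_iteration())
```

Because the file object keeps its own position, the comprehension picks up where next() left off.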

Most students in my Python classes come from other programming languages, where they were always expected to close a file once they had finished working with it. So I am not surprised that, soon after I introduce them to files in Python, they ask how to close one.

The simplest answer is that we can explicitly close the file by calling f.close(). Once we close the file, the file object still exists, but we can no longer read from the file through it, and the printed representation of the file object indicates that the file has been closed.

    >>> f = open('/etc/passwd')
    >>> f
    <open file '/etc/passwd', mode 'r' at 0x10f023270>
    >>> f.read(5)
    '##\n# '
    >>> f.close()
    >>> f
    <closed file '/etc/passwd', mode 'r' at 0x10f023270>
    >>> f.read(5)
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-11-ef8add6ff846> in <module>()
    ----> 1 f.read(5)

    ValueError: I/O operation on closed file
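The session above comes from Python 2. A rough Python 3 counterpart, written here as a function rather than a transcript, might look like the following. The scratch file and the name demo_explicit_close are assumptions made for this sketch.

```python
import os
import tempfile


def demo_explicit_close():
    """Close a file by hand and observe the state of the file object."""
    fd, path = tempfile.mkstemp(text=True)   # hypothetical scratch file
    os.close(fd)

    f = open(path)
    state_before = f.closed      # False: the file is open
    f.close()
    state_after = f.closed       # True: the object survives, the open file does not

    try:
        f.read(5)
        error = None
    except ValueError as exc:    # reading from a closed file raises ValueError
        error = type(exc).__name__
    finally:
        os.remove(path)
    return state_before, state_after, error


print(demo_explicit_close())
```

The "closed" attribute is a convenient way to ask the object about its state without parsing its printed representation.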

But here is the thing: when I program in Python, I rarely call the "close" method on a file explicitly. Chances are that you don't want or need to do so, either.

The preferred best practice for opening a file is to use the "with" statement as follows:

    with open('/etc/passwd') as f:
        for line in f:
            print(line)

The "with" statement invokes what Python calls a "context manager" on the file object "f". That is, it binds "f" to a new file instance pointing to the contents of /etc/passwd. Within the block opened by "with", the file is open and can be read freely.

However, once execution leaves the "with" block, the file is closed automatically. Attempting to read from f after we exit the "with" block raises the same ValueError exception as above. So by using "with", you avoid having to close the file explicitly. Python does it for you, in a somewhat un-Pythonic, magical, implicit, behind-the-scenes way.

But what happens when you don't explicitly close the file? If you are a bit lazy and neither use a "with" block nor call f.close(), when will the file be closed?
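One detail worth making concrete is that the file is closed however the block is left, including through an exception. The sketch below assumes Python 3; the scratch file, the simulated RuntimeError and the name demo_with_closes_on_error are invented for illustration.

```python
import os
import tempfile


def demo_with_closes_on_error():
    """Leave a "with" block through an exception and check the file state."""
    fd, path = tempfile.mkstemp(text=True)   # hypothetical scratch file
    os.close(fd)

    try:
        with open(path) as f:
            raise RuntimeError("simulated failure inside the block")
    except RuntimeError:
        pass                      # the error propagated out of "with" and is caught here
    finally:
        os.remove(path)

    return f.closed               # the name f is still bound after the block


print(demo_with_closes_on_error())
```

This is the main practical advantage over calling f.close() by hand, which is skipped if an exception is raised before the call is reached.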

I ask because I have taught Python for many years, and I am convinced that trying to teach "with" or context managers alongside many other topics is more than beginners can absorb. When "with" comes up in my introductory courses, I usually tell my students that at this point in their careers it is fine to let Python close the file, either when the reference count of the file object drops to 0 or when Python exits.

In my free e-mail course on working with files in Python, I took the same view and did not use "with" in all of the solutions. As a result, several people challenged me, saying that not using "with" shows people a bad practice and runs the risk of data not being written to disk.

I got a lot of e-mail about this topic, so I asked myself: if we don't explicitly close the file or use a "with" block, when does Python close the file? That is, if I let the file be closed automatically, what happens?

I had always assumed that when the object's reference count drops to 0, Python closes the file and the garbage collector cleans up the file object. This is hard to prove or verify when we read from a file, but easy when we write to one. That is because written content is not immediately flushed to disk (unless you pass a false value, "False" or 0, as the third, optional "buffering" argument to "open"); it is held in a buffer and flushed when the file is closed.
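The buffering described above can be observed by comparing the size of the file on disk before and after a flush. This is a minimal sketch, assuming Python 3 and a scratch file; the function name sizes_around_flush is made up, and newline="\n" is passed only to keep the byte counts the same on every platform.

```python
import os
import tempfile


def sizes_around_flush():
    """Report the on-disk size before a flush, after a flush, and after close."""
    fd, path = tempfile.mkstemp()             # hypothetical scratch file
    os.close(fd)

    f = open(path, "w", newline="\n")         # newline="\n" keeps byte counts stable
    try:
        f.write("abc\n")
        before_flush = os.path.getsize(path)  # small writes usually sit in the buffer
        f.flush()
        after_flush = os.path.getsize(path)   # flush() hands the buffer to the OS
        f.write("def\n")
    finally:
        f.close()                             # close() flushes whatever is left
    after_close = os.path.getsize(path)
    os.remove(path)
    return before_flush, after_flush, after_close


print(sizes_around_flush())
```

The first number is usually 0 for a write this small, but it depends on the buffering in use; the second and third should be 4 and 8.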

So I decided to run some experiments to better understand what Python does for me automatically. Each experiment consisted of opening a file, writing data, deleting the reference, and exiting Python. I was curious when the data would be written, if at all.

My experiment looked like this:

    f = open('/tmp/output', 'w')
    f.write('abc\n')
    f.write('def\n')
    # check contents of /tmp/output (1)
    del(f)
    # check contents of /tmp/output (2)
    # exit from Python
    # check contents of /tmp/output (3)

I ran the first experiment with Python 2.7.9 on a Mac. At stage one the file existed but was empty, and at stages two and three it contained all of the content. So my initial instinct seems to be correct for CPython 2.7: when a file object is garbage collected, its __del__ (or equivalent) method flushes and closes the file. Running the "lsof" command against my IPython process confirmed that the file was in fact closed after the reference was deleted.
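Stages (1) and (2) of the experiment can also be measured from inside the process. The sketch below is one way to do that, assuming Python 3; it uses a scratch file in place of /tmp/output, and the name run_experiment is invented. Stage (3) has to be checked from outside, after the interpreter has exited, so it is not covered here.

```python
import gc
import os
import platform
import tempfile


def run_experiment():
    """Write two short lines, drop the only reference, and record file sizes."""
    fd, path = tempfile.mkstemp()             # stands in for /tmp/output
    os.close(fd)

    f = open(path, "w", newline="\n")
    f.write("abc\n")
    f.write("def\n")
    stage1 = os.path.getsize(path)            # (1) file still open
    del f                                     # remove the only reference
    stage2 = os.path.getsize(path)            # (2) reference gone

    gc.collect()      # lets non-refcounting implementations finalize the file
    os.remove(path)   # clean up the scratch file
    return stage1, stage2


print(platform.python_implementation(), run_experiment())
```

On a reference-counting implementation such as CPython, the second number should reflect the full 8 bytes. On implementations with a different garbage collector it may still be 0, which is the point of the experiment.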

What about Python 3? I ran the same experiment on a Mac under Python 3.4.2 and got the same result. Removing the last reference to a file object causes the file to be flushed and closed.

So far so good for Python 2.7 and 3.4. But what about alternative implementations such as PyPy and Jython? Maybe things are different there.

So I ran the same experiment under PyPy 2.7.8. This time, I got a different result! Deleting the reference to the file object, that is, stage two, did not cause the contents of the file to be flushed to disk. I have to assume this is related to differences in garbage collection, or in some other mechanism, between PyPy and CPython. Either way, if you run your program under PyPy, you should never expect a file to be flushed and closed just because the last reference to the file object has gone away. The "lsof" command showed that the file was not released until the Python process exited.

For fun, I decided to try Jython 2.7b3 as well. Jython showed the same behavior as PyPy. In other words, exiting from Jython did ensure that the buffered data was written to disk.

I then repeated these experiments, replacing "abc\n" and "def\n" with "abc\n" * 1000 and "def\n" * 1000.

Under Python 2.7, nothing had been written after the "abc\n" * 1000 write. But after the "def\n" * 1000 write, the file contained 4,096 bytes, probably the size of the buffer. Calling del(f) to delete the reference to the file object flushed the remaining data to disk and closed the file, at which point the file held 8,000 bytes. So Python 2.7 behaves essentially the same way regardless of string size. The only difference is that once the buffer size is exceeded, some data is written to disk before the final flush that happens when the file is closed.

Under Python 3, the situation is a little different. No data was written after either f.write call. However, once the last reference to the file object was removed, the file was flushed and closed. This may be because of a larger buffer size. But there is no doubt that deleting the reference to the file object causes the file to be flushed and closed.

As for PyPy and Jython, the results were the same for large and small writes: the file is flushed and closed when the PyPy or Jython process exits, not when the last reference to the file object goes away.

To double-check, I repeated the experiment using "with". In every case, it was easy to predict when the file would be flushed and closed: when the block is exited and the context manager calls the appropriate method behind the scenes.
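To see how much data reaches the disk before the file is closed on a given interpreter, the large-write experiment can be scripted. This is a sketch under the assumption of Python 3 and a scratch file; the name sizes_for_large_writes is invented, and the intermediate sizes it prints will vary with the interpreter and its buffer size.

```python
import io
import os
import tempfile


def sizes_for_large_writes():
    """Record the on-disk size after each large write and after closing."""
    fd, path = tempfile.mkstemp()             # hypothetical scratch file
    os.close(fd)

    sizes = []
    f = open(path, "w", newline="\n")
    try:
        f.write("abc\n" * 1000)               # 4000 bytes
        sizes.append(os.path.getsize(path))
        f.write("def\n" * 1000)               # 4000 more bytes
        sizes.append(os.path.getsize(path))
    finally:
        f.close()
    sizes.append(os.path.getsize(path))       # everything is on disk after close()
    os.remove(path)
    return sizes


print("default buffer size:", io.DEFAULT_BUFFER_SIZE)
print(sizes_for_large_writes())
```

Only the last number is fixed at 8000. Comparing the first two numbers with the reported default buffer size gives a hint about whether the buffer was large enough to hold both writes.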
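The same size check can be applied to the "with" variant. A minimal sketch, again assuming Python 3 and a scratch file, with the invented name sizes_with_context_manager:

```python
import os
import tempfile


def sizes_with_context_manager():
    """Compare the on-disk size inside a "with" block and right after it."""
    fd, path = tempfile.mkstemp()             # hypothetical scratch file
    os.close(fd)

    with open(path, "w", newline="\n") as f:
        f.write("abc\n")
        f.write("def\n")
        inside = os.path.getsize(path)        # data may still be buffered here
    outside = os.path.getsize(path)           # the block exit has closed the file
    os.remove(path)
    return inside, outside


print(sizes_with_context_manager())
```

Whatever the first number is, the second should be 8 on any implementation, because the flush is tied to leaving the block rather than to garbage collection.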

In other words, if you don't use "with", then at least in very simple cases your data is not necessarily at risk of being lost. However, you cannot be sure whether the data is saved when the last reference to the file object goes away or only when the program exits. If you assume that a file will be closed when a function returns because the only reference to it is a local variable, you may be in for a surprise. And if you have multiple processes or threads writing to the same file, you really need to be very careful.

Perhaps this behavior could be specified better, so that it works similarly or identically on different platforms? Maybe we could even see the beginnings of a Python specification, rather than pointing to CPython and saying, "Yeah, whatever that version does is the right thing."

I still think "with" and context managers are great. And I still think it is hard for newcomers to Python to understand how "with" works. But I will also have to start warning new developers that if they decide to use an alternative implementation of Python, there are all sorts of odd edge cases that may not behave the same way as in CPython, and that may bite them hard if they are not careful.
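For the function-with-a-local-variable case, one way to avoid depending on the garbage collector is to close the file before the function returns. The helper below is a hypothetical sketch, not something from the experiments above: the name append_line and its "durable" parameter are invented, and the optional os.fsync call only asks the operating system to commit the data; it does not coordinate concurrent writers.

```python
import os
import tempfile


def append_line(path, text, durable=False):
    """Append one line and close the file before returning."""
    with open(path, "a", newline="\n") as f:
        f.write(text + "\n")
        if durable:
            f.flush()                 # move Python's buffer to the OS
            os.fsync(f.fileno())      # ask the OS to commit it to the device
    # The file is closed here, regardless of how the interpreter collects garbage.


fd, log_path = tempfile.mkstemp()     # hypothetical scratch file
os.close(fd)
append_line(log_path, "first")
append_line(log_path, "second", durable=True)
with open(log_path) as f:
    print(f.read().splitlines())
os.remove(log_path)
```

Writers in several processes or threads would still need some form of locking or a single writer; this sketch only makes the moment of closing predictable.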
