Calling the read() method directly on a large file object causes unpredictable memory consumption. A better approach is to read the file continuously through a fixed-length buffer, handing each chunk back to the caller through yield.
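A minimal sketch of that idea, assuming a hypothetical helper name read_in_chunks and a caller-chosen chunk_size:

```python
def read_in_chunks(path, chunk_size=4096):
    # read_in_chunks and chunk_size are illustrative names, not from the post.
    # Yield fixed-size chunks so memory use stays bounded no matter how
    # large the file is.
    with open(path, 'r') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:      # empty string means end of file
                return
            yield chunk
```

Because the function yields, nothing is read until the caller starts iterating, and only one chunk is held in memory at a time.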
Example one: I used Python to read a txt file of more than two gigabytes. Naively calling the readlines() method exhausted memory and crashed the run. Fortunately, a colleague suggested the yield approach, and under testing it ran with no memory pressure. The reason is that readlines() loads the entire file content into memory at once, while yield turns the function into a generator that produces one line at a time.
The code is as follows:
def open_txt(file_name):
    with open(file_name, 'r') as f:
        while True:
            line = f.readline()
            if not line:
                return
            yield line.strip()
Invocation example:

for text in open_txt('aa.txt'):
    print(text)
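As a side note, a plain file object in Python is itself a lazy iterator over lines, so the same effect can be had without calling readline() in a loop. A sketch (iter_lines is a name chosen here for illustration, not from the post):

```python
def iter_lines(path):
    # A file object yields one line per iteration, reading lazily,
    # so the whole file is never held in memory at once.
    with open(path, 'r') as f:
        for line in f:
            yield line.strip()
```

This behaves like the open_txt generator above but leans on the file object's built-in iteration protocol.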
Example two:
The target txt file is about 6 GB, and I wanted to take the first 1000 records out and save them in a new txt file for the remaining operations. I am not sure whether this is necessary, but testing on a small amount of data first seemed wise. Referring to a post about how to save a list to a txt document, I wrote a simple program.
====================================================
import datetime
import pickle

start = datetime.datetime.now()
print("start -- %s" % start)

file_handle = open('train.txt')
file2 = open('s_train.txt', 'w')
i = 1
while i < 10000:
    a = file_handle.readline()
    file2.write(a)
    i = i + 1
file_handle.close()
file2.close()
print("done -- %s" % (datetime.datetime.now() - start))

if __name__ == '__main__':
    pass
====================================================
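The counting loop above can also be written with itertools.islice, which stops after the first n lines without ever reading the rest of the file. A sketch, using hypothetical src/dst paths in place of the post's train.txt and s_train.txt:

```python
from itertools import islice

def copy_head(src, dst, n=10000):
    # Copy the first n lines of src into dst.
    # islice consumes the file object lazily, so only one line
    # is in memory at a time even for a multi-gigabyte source.
    with open(src, 'r') as fin, open(dst, 'w') as fout:
        fout.writelines(islice(fin, n))
```

This avoids the manual counter and the off-by-one risk of a hand-rolled while loop.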
As for the pickle library, a lot has been said about it already; its official documentation is worth a careful read later.