Python Read File summary

Source: Internet
Author: User

You want to read text or data from a file through Python.

One. The most convenient way is to read the entire contents of the file at once into one large string:
    all_the_text = open('thefile.txt').read()  # all the text in a text file
    all_the_data = open('abinfile', 'rb').read()  # all the data in a binary file
To be safe, it is a good idea to bind the open file object to a name so that you can close the file promptly once the operation is complete, rather than leaving unused file objects around to take up memory. For example, to read from a text file:
    file_object = open('thefile.txt')
    try:
        all_the_text = file_object.read()
    finally:
        file_object.close()
You don't have to use the try/finally statement here, but it is better to do so, because it guarantees that the file object is closed even if a serious error occurs during the read.
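On Python 2.5 and later, the same guarantee can also be written with a with statement, which closes the file when the block exits, even on error. The following is a minimal sketch; the helper name read_all_text and the throwaway sample file are illustrative assumptions, not part of the original recipe.

```python
import os
import tempfile


def read_all_text(path):
    """Return the whole text file as one string, closing the file afterwards."""
    # The with statement calls file_object.close() on exit, even if read() raises.
    with open(path) as file_object:
        return file_object.read()


# Usage with a throwaway sample file (hypothetical content).
sample_path = os.path.join(tempfile.mkdtemp(), 'thefile.txt')
with open(sample_path, 'w') as sample:
    sample.write('first line\nsecond line\n')

all_the_text = read_all_text(sample_path)
```

The behavior matches the try/finally version above; the with form simply leaves less room to forget the close call.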
Two. The simplest, fastest, and most Pythonic approach is to read the contents of the text file line by line and put the data into a list of strings:
    list_of_all_the_lines = file_object.readlines()
This leaves the '\n' at the end of each line; if you don't want that, there are alternatives, such as:
    list_of_all_the_lines = file_object.read().splitlines()
    list_of_all_the_lines = file_object.read().split('\n')
    list_of_all_the_lines = [L.rstrip('\n') for L in file_object]
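The four variants are not quite interchangeable, which a small comparison makes visible. This sketch assumes a two-line sample file that ends with a newline; the helper read_with and the sample content are hypothetical names introduced only for the demonstration.

```python
import os
import tempfile

# Throwaway sample file (hypothetical content) ending with a newline.
sample_path = os.path.join(tempfile.mkdtemp(), 'thefile.txt')
with open(sample_path, 'w') as sample:
    sample.write('alpha\nbeta\n')


def read_with(reader):
    """Open the sample file, apply reader to the file object, then close it."""
    with open(sample_path) as file_object:
        return reader(file_object)


kept = read_with(lambda f: f.readlines())                     # ['alpha\n', 'beta\n']
split_lines = read_with(lambda f: f.read().splitlines())      # ['alpha', 'beta']
split_nl = read_with(lambda f: f.read().split('\n'))          # ['alpha', 'beta', '']
stripped = read_with(lambda f: [L.rstrip('\n') for L in f])   # ['alpha', 'beta']
```

Note that split('\n') yields a trailing empty string when the file ends with a newline, while splitlines() and the rstrip comprehension do not.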
The simplest and quickest way to process a text file line by line is a plain for loop:
    for line in file_object:
        process(line)  # placeholder for your own per-line processing
This approach also leaves the '\n' at the end of each line; to remove it, add this statement to the body of the for loop:
    line = line.rstrip('\n')
Or, if you want to remove all whitespace at the end of each line (not just '\n'), the common approach is:
    line = line.rstrip()
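The difference between the two rstrip calls can be wrapped in a small generator. This is only a sketch: the name stripped_lines and its strip_all_whitespace flag are hypothetical, and io.StringIO stands in for an open text file so the example runs without touching the disk.

```python
import io


def stripped_lines(file_object, strip_all_whitespace=False):
    """Yield each line without its '\\n', or without any trailing whitespace."""
    for line in file_object:
        if strip_all_whitespace:
            yield line.rstrip()      # drops spaces, tabs and the newline
        else:
            yield line.rstrip('\n')  # drops only the newline


# io.StringIO behaves like a text file opened for reading.
newline_only = list(stripped_lines(io.StringIO('a \nb\t\n')))
all_whitespace = list(stripped_lines(io.StringIO('a \nb\t\n'), True))
```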
Three. Discussion
Unless the file is very large, reading all of its contents into memory and processing them in one go is the quickest and most convenient approach. The built-in function open creates a Python file object (in Python 2 you can also create a file object by calling the built-in type file). Calling the read method on that object reads everything (whether text or binary data) into one large string. If the content is text, you can use the split method, or the more specialized splitlines, to cut it into a list of lines. Because splitting a string into lines is such a common requirement, you can also call readlines directly on the file object, which is more convenient and faster.
You can either loop directly over a file object or pass it to any function that takes an iterable, such as list or max. When a file object opened for reading is treated as an iterable, each line of text becomes one item of the iteration (so this applies only to text files). Processing a file line by line in this way uses very little memory and is also reasonably fast.
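As an illustration of passing a file object to a function that takes an iterable, the sketch below finds the longest line with max and collects all lines with list. The sample file and its contents are assumptions made for the demonstration, and key=len is one possible choice of comparison.

```python
import os
import tempfile

# Throwaway sample file (hypothetical content).
sample_path = os.path.join(tempfile.mkdtemp(), 'thefile.txt')
with open(sample_path, 'w') as sample:
    sample.write('short\nthe longest line\nmid\n')

# max consumes the file one line at a time; only one line is held at once.
with open(sample_path) as file_object:
    longest_line = max(file_object, key=len)

# list consumes the same kind of iterable and keeps every line.
with open(sample_path) as file_object:
    all_lines = list(file_object)
```

Each call exhausts the file object, which is why the file is reopened before the second pass.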
On Unix and Unix-like systems, such as Linux, Mac OS X, and the BSD variants, there is no real difference between text files and binary files. On Windows and on old Macintosh systems, however, line endings are not the standard '\n' but '\r\n' and '\r' respectively, and Python translates them to '\n' for you. This means that when you open a binary file you must tell Python so explicitly, so that it performs no translation. To do that, pass 'rb' as the second argument to open. Doing so does no harm on Unix-like platforms, and although those platforms do not require it, always distinguishing text files from binary files is a good habit: it makes your program more readable, easier to understand, and more portable across platforms.
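A short round trip shows that binary mode hands back the stored bytes untouched, line-ending bytes included. This sketch assumes Python 3, where a file opened with 'rb' returns bytes objects; the file name and payload are hypothetical.

```python
import os
import tempfile

# Arbitrary payload containing bytes that look like line endings.
raw = b'\x00\x01\r\n\x02\r\xff'

data_path = os.path.join(tempfile.mkdtemp(), 'abinfile')
with open(data_path, 'wb') as out:
    out.write(raw)

# 'rb' disables any newline translation, on every platform.
with open(data_path, 'rb') as file_object:
    all_the_data = file_object.read()
```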
If you are unsure which line endings a text file uses, you can pass 'rU' as the second argument to open to request universal newline translation. This lets you freely exchange files among Windows, Unix (including Mac OS X), and old Macintosh platforms without any worries: whatever platform your code runs on, all kinds of line endings are mapped to '\n'. (The 'rU' mode belongs to Python 2; in Python 3, text mode performs this translation by default, and the 'U' flag was removed in Python 3.11.)
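The effect of universal newline translation can be checked with a file that mixes all three conventions. The sketch below assumes Python 3, where plain text mode already translates '\r\n' and '\r' to '\n' by default; the file name and contents are hypothetical.

```python
import os
import tempfile

# Write raw bytes so that each line ending is stored exactly as given.
mixed_path = os.path.join(tempfile.mkdtemp(), 'mixed.txt')
with open(mixed_path, 'wb') as out:
    out.write(b'unix\nwindows\r\nold mac\rlast')

# Default text mode (newline=None) maps every line ending to '\n'.
with open(mixed_path) as file_object:
    text = file_object.read()

lines = text.split('\n')
```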
You can call the read method directly on the file object returned by open, as in the first code fragment of the solution. When you do, you lose the reference to the file object as soon as the read completes. In practice, CPython notices at once that the reference is gone and closes the file immediately. A better approach, however, is to bind the result of open to a name so that you can close the file explicitly when you are done with it. This keeps the file open for as short a time as possible, even on Jython, IronPython, or other Python implementations whose more advanced garbage collectors may delay automatic reclamation (unlike the current C-based implementation, CPython, which reclaims the object immediately). Use a try/finally statement to ensure that the file object is closed properly even if an error occurs during processing; this is the robust and rigorous way to do it.
    file_object = open('thefile.txt')
    try:
        for line in file_object:
            process(line)  # placeholder for your own per-line processing
    finally:
        file_object.close()
Note that the call to open is not placed inside the try clause of the try/finally statement (putting it there is a common beginner's mistake). If an error occurs while opening the file, there is nothing to close, and nothing has been bound to the name file_object, so file_object.close() should certainly not be called.
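The placement of open can be seen in a small helper. This is a sketch under stated assumptions: the function name count_lines and the missing file path are hypothetical, chosen only to show that a failed open raises before the try block is ever entered.

```python
import os
import tempfile


def count_lines(path):
    """Count the lines in a text file, closing it even if iteration fails."""
    # open() sits outside the try block: if it raises, no file was opened,
    # file_object is never bound, and there is nothing to close.
    file_object = open(path)
    try:
        return sum(1 for _ in file_object)
    finally:
        file_object.close()


# A path that does not exist (hypothetical name inside a fresh temp directory).
missing_path = os.path.join(tempfile.mkdtemp(), 'no_such_file.txt')
try:
    count_lines(missing_path)
    open_failed = False
except IOError:  # IOError is an alias of OSError on Python 3
    open_failed = True
```

Had open been inside the try clause, the finally clause would have tried to call close on an unbound name and masked the original error with a NameError.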
If you choose to read a file a small piece at a time rather than all at once, the approach is slightly different. Here is an example that reads a binary file 100 bytes at a time until the end of the file:
    file_object = open('abinfile', 'rb')
    try:
        while True:
            chunk = file_object.read(100)
            if not chunk:
                break
            do_something_with(chunk)
    finally:
        file_object.close()
Passing an argument n to the read method ensures that read returns only the next n bytes (or fewer, if the read position is already near the end of the file). When the end of the file is reached, read returns an empty string. Complex loops are best encapsulated in reusable generators. In this example we can encapsulate only part of the logic, because in Python versions before 2.5 the yield keyword is not allowed inside the try clause of a try/finally statement. If you are willing to give up the try/finally protection for closing the file, you can write:
    def read_file_by_chunks(filename, chunksize=100):
        file_object = open(filename, 'rb')
        while True:
            chunk = file_object.read(chunksize)
            if not chunk:
                break
            yield chunk
        file_object.close()
Once the read_file_by_chunks generator is defined, the code that reads and processes a binary file in fixed-size pieces becomes extremely simple:
    for chunk in read_file_by_chunks('abinfile'):
        do_something_with(chunk)
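Python 2.5 and later allow yield inside a try/finally statement, so on those versions the generator does not have to give up the close guarantee. The following is a sketch of that variant; the name read_file_by_chunks_safely and the 250-byte sample file are assumptions made for illustration, not part of the original recipe.

```python
import os
import tempfile


def read_file_by_chunks_safely(filename, chunksize=100):
    """Yield successive chunks of a binary file, closing the file when done."""
    file_object = open(filename, 'rb')
    try:
        while True:
            chunk = file_object.read(chunksize)
            if not chunk:
                break
            yield chunk
    finally:
        # Runs when the generator is exhausted or explicitly closed.
        file_object.close()


# Throwaway 250-byte sample file (hypothetical content).
data_path = os.path.join(tempfile.mkdtemp(), 'abinfile')
with open(data_path, 'wb') as out:
    out.write(b'x' * 250)

chunk_sizes = [len(chunk) for chunk in read_file_by_chunks_safely(data_path)]
```

The last chunk is shorter than chunksize because read returns fewer bytes when it reaches the end of the file.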
Reading a text file line by line is an even more common task. Just loop over the file object, as follows:
    for line in open('thefile.txt', 'rU'):
        do_something_with(line)
To be 100% sure that no unused open file object is left behind once the operation is complete, you can tighten the code above as follows:
    file_object = open('thefile.txt', 'rU')
    try:
        for line in file_object:
            do_something_with(line)
    finally:
        file_object.close()
