Reposted from: http://www.ibm.com/developerworks/cn/linux/sdk/python/python-5/index.html#N1004E
When we talk about "text processing", we usually mean the content being processed. Python makes it easy to read the contents of a text file into a string variable that can then be manipulated. The file object provides three read methods: .read(), .readline(), and .readlines(). Each method accepts an optional argument that limits the amount of data read per call, but they are usually called without one. .read() reads the entire file at once and is typically used to place the contents of the file in a string variable. Although .read() produces the most direct string representation of a file's content, it is unnecessary for sequential, line-oriented processing, and it is not possible if the file is larger than available memory.
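The optional size argument can be sketched as follows. This is a minimal illustration, assuming Python 3; the file name sample.txt, its contents, and the chunk size of 8 are made up for the example.

```python
import os
import tempfile

# Create a small throwaway file (hypothetical name and contents).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as fh:
    fh.write("first line\nsecond line\nthird line\n")

# .read() with no argument returns the entire file as one string.
with open(path, encoding="utf-8") as fh:
    whole = fh.read()

# .read(size) returns at most `size` characters per call in text mode,
# and an empty string once the end of the file is reached.
chunks = []
with open(path, encoding="utf-8") as fh:
    while True:
        chunk = fh.read(8)
        if not chunk:
            break
        chunks.append(chunk)

print(len(whole), "characters read in one call")
print(len(chunks), "chunks read with a size limit")
```

Reading in chunks keeps memory use bounded by the chunk size, which is the usual reason to pass the argument at all.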
.readline() and .readlines() are very similar. Both are used in constructs like the following:
Python .readlines() example
fh = open('c:\\autoexec.bat')
for line in fh.readlines():   # .readlines() returns a list of lines
    print(line)
fh.close()
The difference between .readline() and .readlines() is that the latter reads the entire file at once, like .read(). .readlines() automatically parses the file content into a list of lines that can be processed with Python's for ... in ... construct. .readline(), on the other hand, reads only one line at a time and is usually much slower than .readlines(). You should use .readline() only if there is not enough memory to read the entire file at once.
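The contrast can be sketched side by side. This is a minimal example, assuming Python 3 and a throwaway file whose name and contents are invented for illustration. The third loop, iterating over the file object itself, is not covered in the text above; it also yields one line at a time and is the common idiom in current Python.

```python
import os
import tempfile

# Create a small throwaway file (hypothetical name and contents).
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w", encoding="utf-8") as fh:
    fh.write("alpha\nbeta\ngamma\n")

# .readlines(): reads the whole file at once and returns a list of lines.
with open(path, encoding="utf-8") as fh:
    all_lines = fh.readlines()

# .readline(): returns one line per call, and an empty string at end of file.
one_by_one = []
with open(path, encoding="utf-8") as fh:
    line = fh.readline()
    while line:
        one_by_one.append(line)
        line = fh.readline()

# Iterating over the file object also yields one line at a time.
iterated = []
with open(path, encoding="utf-8") as fh:
    for line in fh:
        iterated.append(line)

print(all_lines == one_by_one == iterated)
```

All three approaches keep the trailing newline on each line, so strip it (for example with line.rstrip("\n")) before further processing if needed.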
In Python 3, str is always Unicode text, so str no longer has a decode() method.
Python 2 reads files as a byte stream by default (the equivalent of Python 3 bytes), whereas Python 3 decodes file contents to Unicode text by default. If the file is not in the encoding Python 3 uses by default, open it in binary mode, read the byte stream, and then decode it with the correct codec.
/tmp/python3
Python 3.2.3 (default, Feb 2013, 14:44:27)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f1 = open("unicode.txt", 'r').read()
>>> print(f1)
寒冷
>>> f2 = open("unicode.txt", 'rb').read()  # open in binary mode
>>> print(f2)
b'\xe5\xaf\x92\xe5\x86\xb7\n'
>>> f2.decode()
'寒冷\n'
>>> f1.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
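The session above works because the file is UTF-8, which bytes.decode() assumes when called without arguments. For a file in another encoding, a minimal sketch follows, assuming Python 3; the file gbk.txt and the choice of GBK are hypothetical and used only for illustration. It shows both reading bytes and decoding them explicitly, and passing the encoding argument to open() so that text mode does the decoding.

```python
import os
import tempfile

# Write a sample file in GBK (hypothetical file name; contents for illustration).
path = os.path.join(tempfile.mkdtemp(), "gbk.txt")
text = "寒冷\n"
with open(path, "wb") as fh:
    fh.write(text.encode("gbk"))

# Option 1: open in binary mode, read the bytes, then decode explicitly.
with open(path, "rb") as fh:
    raw = fh.read()
decoded = raw.decode("gbk")

# Option 2: let open() decode by naming the encoding in text mode.
with open(path, encoding="gbk") as fh:
    direct = fh.read()

# Decoding the same bytes with the wrong codec fails.
try:
    raw.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

print(type(raw).__name__, type(decoded).__name__)
print(decoded == direct == text)
```

Naming the encoding in open() is usually the simpler choice when the encoding is known in advance; binary mode plus decode() is useful when the encoding has to be detected or chosen after the bytes are read.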