When reading and writing files in Python (and in many other languages), you will commonly see the following modes:
r, rb, w, wb
So what are the main differences between reading and writing files with and without the b flag?
1. File mode flags
- 'r': the default value; opens the file for reading.
- 'w': opens the file for writing; any previous content is truncated.
- 'a': opens the file for writing; new data is appended to the end of the current content.
- 'r+': opens the file for both reading and writing; the previous content is kept (it is 'w+' that deletes all previous data).
- 'a+': opens the file for both reading and writing; writes are appended to the end of the current content.
- 'b': indicates that binary data is read or written; it is combined with the flags above, as in 'rb' or 'wb'.
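The flags above can be tried out with a short script. This is only a minimal sketch: the scratch file name modes_demo.txt and the variable names are made up for the demonstration, and the temporary directory comes from the standard tempfile module.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as f:      # create or truncate, then write
    f.write("first\n")

with open(path, "a") as f:      # append after the existing content
    f.write("second\n")

with open(path, "r+") as f:     # read and write, existing content is kept
    f.write("FIRST")            # overwrites the first five characters in place

with open(path, "r") as f:
    after_r_plus = f.read()

with open(path, "w") as f:      # opening with 'w' discards the old content
    f.write("third\n")

with open(path, "r") as f:
    after_w = f.read()

print(repr(after_r_plus))       # 'FIRST\nsecond\n'
print(repr(after_w))            # 'third\n'
```

Note that 'r+' overwrote text in place without truncating, while reopening with 'w' removed everything that had been written before.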
2. Read files
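When a file is read, reading normally stops only when the end of the file is reached. In text mode on Windows, however, Python 2 (which relies on the C runtime for file I/O) treats the byte \x1a (decimal 26, Ctrl-Z) as the end-of-file marker (EOF). Therefore, when 'r' is used to read a binary file, the read may be incomplete. Python 3 does not treat \x1a as EOF, but its text mode still decodes the bytes and translates line endings, so binary files should be opened with 'rb' there as well.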
For example, suppose a binary file contains the following bytes, listed from the first to the last: 7f 32 1a 2f 3d 2c 12 2e 76. If 'r' is used for reading, the read stops at the third byte, because 0x1a is taken as the end of the file, and only part of the file is returned. If 'rb' is used, the bytes are not converted into characters, so this error is avoided and the file is always read to its real end.
Solution: read binary data in 'rb' mode.
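As a small illustration of the binary round trip, the sketch below writes the nine example bytes with 'wb' and reads them back with 'rb'. It assumes Python 3 (where bytes([...]) builds a byte string from integers); the file name sample.bin and the variable names are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file; the bytes are the ones from the example above.
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
payload = bytes([0x7F, 0x32, 0x1A, 0x2F, 0x3D, 0x2C, 0x12, 0x2E, 0x76])

with open(path, "wb") as f:     # binary write: bytes are stored unchanged
    f.write(payload)

with open(path, "rb") as f:     # binary read: no decoding, no EOF handling of 0x1a
    data = f.read()

print(len(data))                # 9
print(data == payload)          # True
```

All nine bytes come back, including the 0x1a in the third position.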
3. Write files
For the string x = 'abc\ndef', len(x) returns its length, 7. \n is the line feed character, which is actually 0x0a. When the string is written in text mode with 'w', each 0x0a is automatically changed on Windows into the two bytes 0x0d and 0x0a, so the file is actually 8 bytes long. When the file is read in text mode with 'r', the pair is automatically converted back to the original line feed.
If the data is written in binary mode with 'wb', every byte is kept unchanged and is read back exactly as it was written.
Therefore, if you write data in text mode and read it in binary mode, account for the extra byte. 0x0d is also called a carriage return.
On Linux nothing changes, because Linux uses only 0x0a to indicate a line break.
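The size difference can be reproduced on any platform with the sketch below. It assumes Python 3 and uses the newline="\r\n" argument of open() to force the Windows-style translation, so the result does not depend on the operating system; the file names and variable names are hypothetical.

```python
import os
import tempfile

x = "abc\ndef"
folder = tempfile.mkdtemp()
text_path = os.path.join(folder, "text.txt")       # hypothetical file names
binary_path = os.path.join(folder, "binary.bin")

# Text mode; newline="\r\n" reproduces what Windows does by default.
with open(text_path, "w", newline="\r\n") as f:
    f.write(x)

# Binary mode; the bytes are written exactly as given.
with open(binary_path, "wb") as f:
    f.write(x.encode("ascii"))

with open(text_path, "rb") as f:    # binary read shows the extra 0x0d
    raw = f.read()

with open(text_path, "r") as f:     # text read converts the pair back to \n
    text = f.read()

print(len(x))                        # 7
print(os.path.getsize(text_path))    # 8
print(os.path.getsize(binary_path))  # 7
print(raw)                           # b'abc\r\ndef'
print(len(text))                     # 7
```

Without the newline argument, the text-mode file would be 8 bytes on Windows and 7 bytes on Linux, which matches the behaviour described above.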