Python has no built-in syntax for laying out values as raw binary data, but the standard library provides the struct module for exactly that purpose.
Python 2 has no separate binary type; it stores binary data in ordinary strings, which works because each character of a str is exactly 1 byte. Python 3 uses the dedicated bytes type for the same purpose.
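import struct
a = 12.34
# convert a to binary ('f' is a 4-byte C float; 'i' would require an integer)
bytes = struct.pack('f', a)
At this point, bytes is a byte string (str in Python 2, bytes in Python 3) whose 4 bytes hold the same binary representation as a C float.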
Now reverse the operation.
Given existing binary data in bytes, which is really just a byte string, convert it back into a Python value:
a, = struct.unpack('f', bytes)
Note that unpack always returns a tuple.
So if there is only one variable, packed with:
bytes = struct.pack('f', a)
then decode it like this:
a, = struct.unpack('f', bytes) or (a,) = struct.unpack('f', bytes)
If you write a = struct.unpack('f', bytes) directly, a is a one-element tuple, roughly (12.34,), instead of the original floating-point number. The value is only approximately 12.34 because 'f' stores a 4-byte single-precision float; use 'd' for a full 8-byte double.
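The round trip described above can be sketched as follows. This is a minimal illustration for Python 3; the names value, packed, restored, and restored_single are chosen for the example and do not come from the original text.

```python
import struct

value = 12.34

# 'd' packs a C double (8 bytes), which round-trips a Python float exactly.
packed = struct.pack('d', value)

# unpack always returns a tuple, so take the single element with a trailing comma.
(restored,) = struct.unpack('d', packed)

# 'f' packs a 4-byte C float, so some precision is lost on the way back.
(restored_single,) = struct.unpack('f', struct.pack('f', value))

print(len(packed), restored == value)
print(abs(restored_single - value) < 1e-6, restored_single == value)
```

The second print shows that the single-precision result is close to, but not exactly equal to, the original value.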
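If the data consists of multiple values, you can pack them together:
# the b prefix makes these byte strings, which the 's' format requires in Python 3
a = b'hello'
b = b'world!'
c = 2
d = 45.123
bytes = struct.pack('5s6sif', a, b, c, d)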
bytes now holds the binary form of the data, and you can write it directly to a file, for example binfile.write(bytes)
Later, when you need the data again, read it back with bytes = binfile.read()
Then decode it into Python variables with struct.unpack():
a, b, c, d = struct.unpack('5s6sif', bytes)
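Put together, the write-then-read cycle might look like the sketch below. It assumes Python 3 and uses a temporary file created with tempfile.mkstemp purely for illustration; the names fmt, record, and data are not from the original.

```python
import os
import struct
import tempfile

fmt = '5s6sif'
# In Python 3 the 's' format requires bytes, not str.
record = struct.pack(fmt, b'hello', b'world!', 2, 45.123)

# Write the packed record to a temporary file, then read it back.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, 'wb') as binfile:
        binfile.write(record)
    with open(path, 'rb') as binfile:
        data = binfile.read()
finally:
    os.remove(path)

a, b, c, d = struct.unpack(fmt, data)
print(a, b, c, round(d, 3))
```

The float comes back with single precision only, which is why the example rounds d before printing it.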
'5s6sif' is called fmt, the format string. It consists of optional counts followed by format characters: 5s represents a 5-byte string, 2i represents 2 integers, and so on. The available format characters are listed below; each one corresponds to a C type and to a Python type.
Format | C Type | Python | Standard size (bytes)
x | pad byte | no value | 1
c | char | string of length 1 | 1
b | signed char | integer | 1
B | unsigned char | integer | 1
? | _Bool | bool | 1
h | short | integer | 2
H | unsigned short | integer | 2
i | int | integer | 4
I | unsigned int | integer or long | 4
l | long | integer | 4
L | unsigned long | long | 4
q | long long | long | 8
Q | unsigned long long | long | 8
f | float | float | 4
d | double | float | 8
s | char[] | string | 1 per character
p | char[] | string | 1 per character
P | void * | long |
The format characters are case-sensitive: lowercase integer codes are signed and uppercase ones are unsigned. The Python column follows Python 2 naming; in Python 3 every integer or long entry is an int, and c, s, and p map to bytes.
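The last one, P, represents a pointer type. Its size depends on the platform (4 bytes on a 32-bit system, 8 bytes on a typical 64-bit system), and it is only available with native byte order.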
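The sizes in the table can be checked with struct.calcsize. The sketch below assumes Python 3; the names standard, pointer_size, and native_long are illustrative only.

```python
import struct

# '=' selects standard sizes, which are the same on every platform.
standard = {code: struct.calcsize('=' + code) for code in 'xcbB?hHiIlLqQfd'}

# Without a prefix, struct uses native sizes; 'P' (pointer) exists only in this mode.
pointer_size = struct.calcsize('P')
native_long = struct.calcsize('l')

print(standard)
print(pointer_size, native_long)
```

Native sizes can differ from the standard ones: for example, the native long is 8 bytes on many 64-bit Unix systems even though the standard size of 'l' is 4.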
To exchange data with structs in C, you also need to consider byte order and alignment: C and C++ compilers usually align struct members, typically to 4-byte boundaries on 32-bit systems. The struct module therefore provides the following prefix characters:
Character | Byte order | Size and alignment
@ | native | native, padded for alignment (for example to 4 bytes)
= | native | standard, original number of bytes, no padding
< | little-endian | standard, original number of bytes, no padding
> | big-endian | standard, original number of bytes, no padding
! | network (= big-endian) | standard, original number of bytes, no padding
To use one, put it at the first position of fmt, for example '@5s6sif'. If no prefix is given, '@' is the default.
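The effect of the prefix characters can be seen in a short sketch, assuming Python 3. The format 'ci' (a char followed by an int) is picked only to make the padding visible.

```python
import struct

# Same value, different byte order.
little = struct.pack('<I', 1)   # b'\x01\x00\x00\x00'
big = struct.pack('>I', 1)      # b'\x00\x00\x00\x01'
network = struct.pack('!I', 1)  # identical to big-endian

# '@' (native) may insert padding to align the int; '=' never pads.
native_size = struct.calcsize('@ci')
standard_size = struct.calcsize('=ci')

print(little, big, network == big)
print(native_size, standard_size)
```

The standard size of 'ci' is always 5 bytes, while the native size is usually larger because the compiler-style alignment pads the char before the int.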
-----problems encountered while processing binary files-----
When working with binary files, open them in one of the following ways:
binfile = open(filepath, 'rb') to read a binary file
or
binfile = open(filepath, 'wb') to write a binary file
So how does this differ from binfile = open(filepath, 'r')?
There are two differences:
First, in text mode ('r') on Windows, the byte 0x1A is treated as the end of the file (EOF), while 'rb' has no such problem. So if a file was written in binary mode and is then read in text mode, only part of it is read when it contains 0x1A. With 'rb', reading always continues to the real end of the file. This is the behavior of the Windows C runtime's text mode, which Python 2 uses.
Second, take the string x = 'abc\ndef'. len(x) gives 7; \n is the newline character, which is actually 0x0A. When the string is written in text mode with 'w' on Windows, each 0x0A is automatically expanded to the two bytes 0x0D 0x0A, so the file is actually 8 bytes long. When the file is read in text mode with 'r', the pair is automatically converted back to the original newline. With 'wb', the newline is written as a single byte, unchanged, and read back as is. So if you write in text mode and read in binary mode, account for the extra byte. 0x0D is also called carriage return.
On Linux nothing changes, because Linux uses only 0x0A to represent a line break.
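The newline translation can be reproduced on any platform with the sketch below, assuming Python 3. Passing newline='\r\n' to open() is used here only to force the Windows-style translation; the temporary file and the names text, raw, and decoded are illustrative.

```python
import os
import tempfile

text = 'abc\ndef'  # 7 characters

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # newline='\r\n' forces the Windows-style translation on any platform.
    with open(path, 'w', newline='\r\n') as f:
        f.write(text)
    # Binary mode shows the bytes that are really on disk.
    with open(path, 'rb') as f:
        raw = f.read()
    # Default text mode translates '\r\n' back to '\n' when reading.
    with open(path, 'r') as f:
        decoded = f.read()
finally:
    os.remove(path)

print(len(text), len(raw), raw)
print(decoded == text)
```

The binary read returns 8 bytes for the 7-character string, which is the extra byte to account for when mixing text-mode writes with binary-mode reads.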