Sometimes you need to process binary data in Python, for example when reading files or working with sockets. Python's struct module handles this: it converts between Python values and C structs represented as byte strings.
The three most important functions in the struct module are pack(), unpack(), and calcsize():
pack(fmt, v1, v2, ...) packs the values into a string according to the given format (actually a byte stream laid out like a C struct; a bytes object in Python 3).
unpack(fmt, string) parses the byte stream string according to the given format (fmt) and returns the parsed values as a tuple.
calcsize(fmt) returns the number of bytes occupied by data in the given format (fmt).
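As a quick illustration of how the three functions fit together, here is a minimal sketch. The format string "<HI" and the values 1 and 2 are chosen only for this example.

```python
import struct

# "<HI": little-endian, one unsigned short (2 bytes) and one unsigned int (4 bytes).
fmt = "<HI"

packed = struct.pack(fmt, 1, 2)
print(packed)                      # b'\x01\x00\x02\x00\x00\x00'
print(struct.calcsize(fmt))        # 6
print(struct.unpack(fmt, packed))  # (1, 2)
```

Note that unpack() always returns a tuple, even when the format describes a single value.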
The following table lists the formats supported by struct:
Format | C type | Python type
x | pad byte | no value
c | char | string of length 1 (bytes in Python 3)
b | signed char | integer
B | unsigned char | integer
h | short | integer
H | unsigned short | integer
i | int | integer
I | unsigned int | integer (long in Python 2)
l | long | integer
L | unsigned long | integer (long in Python 2)
q | long long | integer (long in Python 2)
Q | unsigned long long | integer (long in Python 2)
f | float | float
d | double | float
s | char[] | string (bytes in Python 3)
p | char[] | string (bytes in Python 3)
P | void * | integer
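Note 1. q and Q are only useful when the platform supports 64-bit integers; in native mode they require a C compiler that supports long long.
Note 2. A number may precede any format character as a repeat count; for example, 4h means the same as hhhh.
Note 3. The s format denotes a string of fixed length: 4s is a single 4-byte string, not four separate characters. The p format denotes a Pascal string, whose first byte stores the length.
Note 4. P converts a pointer. Its size depends on the machine's word length, and it is available only in native mode.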
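The notes above can be checked with a short sketch. The sample values are made up for illustration, and the native sizes on the last line are platform-dependent, so no output is shown for them.

```python
import struct

# Note 2: a repeat count. "3h" is shorthand for "hhh".
print(struct.calcsize("<3h"), struct.calcsize("<hhh"))   # 6 6

# Note 3: for "s" the number is a byte length, so "4s" is one 4-byte string.
fixed = struct.pack("4s", b"ab")    # short values are padded with null bytes
# "p" is a Pascal string: the first byte holds the length of the data.
pascal = struct.pack("4p", b"ab")

print(fixed)    # b'ab\x00\x00'
print(pascal)   # b'\x02ab\x00'

# Notes 1 and 4: native sizes of a pointer and a long long on this machine.
print(struct.calcsize("P"), struct.calcsize("q"))
```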
By default, struct converts data using the byte order, size, and alignment of the local machine. You can change this with the first character of the format string, as defined below:
Character | Byte order | Size and alignment
@ | native | native
= | native | standard
< | little-endian | standard
> | big-endian | standard
! | network (= big-endian) | standard
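To make the effect of the prefix character concrete, the following sketch packs the same integer with different prefixes. The value 0x01020304 is arbitrary and chosen so that the byte order is easy to see.

```python
import struct

value = 0x01020304

little = struct.pack("<I", value)
big = struct.pack(">I", value)
network = struct.pack("!I", value)

print(little)           # b'\x04\x03\x02\x01'
print(big)              # b'\x01\x02\x03\x04'
print(network == big)   # True

# Native mode ("@") may insert alignment padding; the standard modes never do.
print(struct.calcsize("@HI"))   # platform-dependent, commonly 8
print(struct.calcsize("=HI"))   # 6
```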
With struct, working with binary data becomes straightforward.
For example, consider the following C struct:
struct Header
{
    unsigned short id;
    char tag[4];
    unsigned int version;
    unsigned int count;
};
Suppose this struct is received through socket.recv and stored in the string s. To parse it, use the unpack() function:
import struct
id, tag, version, count = struct.unpack("!H4s2I", s)
In the format string above, ! means the data is interpreted in network byte order, because it was transmitted over the network. H denotes the unsigned short id, 4s denotes a 4-byte string, and 2I denotes two unsigned int values.
After a single unpack() call, the information is available in id, tag, version, and count.
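As a minimal sketch of the parsing step, the example below unpacks a hand-written 14-byte payload that stands in for what socket.recv might return. The payload contents and the names header_id, HEADER_FMT, and HEADER_SIZE are assumptions made for illustration; they are not part of the original example.

```python
import struct

HEADER_FMT = "!H4s2I"                      # illustrative constant name
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 2 + 4 + 4 + 4 = 14 bytes, no padding

# Hypothetical payload standing in for the bytes returned by socket.recv().
s = b"\x00\x01TAG0\x00\x00\x00\x02\x00\x00\x00\x03"

header_id, tag, version, count = struct.unpack(HEADER_FMT, s)
print(header_id, tag, version, count)      # 1 b'TAG0' 2 3

# unpack() needs exactly HEADER_SIZE bytes; a short read raises struct.error.
try:
    struct.unpack(HEADER_FMT, s[:10])
except struct.error as exc:
    print("incomplete header:", exc)
```

Because the ! prefix uses standard sizes without alignment padding, this only matches the sender's data if the fields were written back to back. A C compiler may insert padding after tag when it lays out struct Header in memory.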
Similarly, you can easily pack local data into the struct format.
ss = struct.pack("!H4s2I", id, tag, version, count)
The pack function converts id, tag, version, and count into a Header struct in the specified format. ss is now a string (actually a byte stream laid out like a C struct), and you can send it with socket.send(ss).
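The packing direction can be sketched the same way. The field values and the names header_id and HEADER_FMT below are hypothetical. Note that in Python 3 the value for the 4s field has to be a bytes object.

```python
import struct

HEADER_FMT = "!H4s2I"   # illustrative constant name

# Hypothetical field values; the "4s" field takes bytes in Python 3.
header_id, tag, version, count = 7, b"DATA", 1, 42

ss = struct.pack(HEADER_FMT, header_id, tag, version, count)
print(len(ss))    # 14
print(ss.hex())   # 000744415441000000010000002a

# Unpacking the packed bytes gives back the original values.
print(struct.unpack(HEADER_FMT, ss))   # (7, b'DATA', 1, 42)
```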