A tutorial on working with byte streams (binary streams) using the struct module in Python

Source: Internet
Author: User
Tags: unpack
While studying Python network programming and writing a simple socket communication program, I came across the struct module. At the time I was not sure what it was for, so I later looked up the relevant material. This article introduces how to use the struct module in Python and is offered as a reference.

Preface

I recently used Python to parse the MNIST dataset, which is stored in the IDX file format. This requires reading binary files, and I used the struct module for it. Many of the tutorials online are well written but not very friendly to beginners, so I reorganized my notes into a quick-start guide.

Note: the following four terms are used interchangeably in this tutorial: binary stream, binary array, byte stream, byte array.

Get started quickly

In the struct module, when you convert an integer, a floating-point number, or a character stream (an array of characters) to a byte stream (an array of bytes), you use a format string fmt to tell struct what type of object is being converted. For example, an integer is 'i', a floating-point number is 'f' (or 'd' for a double), and an ASCII character string is 's'.

import struct

def demo1():
    # Use bin_buf = struct.pack(fmt, buf) to convert buf into the binary array bin_buf
    # Use buf = struct.unpack(fmt, bin_buf) to convert the binary array bin_buf back into buf

    # integer <-> binary stream
    buf1 = 65536  # example value only; any integer that fits in 4 bytes works
    bin_buf1 = struct.pack('i', buf1)  # 'i' stands for 'integer'
    ret1 = struct.unpack('i', bin_buf1)
    print(bin_buf1, '<====>', ret1)

    # floating-point number <-> binary stream
    buf2 = 3.1415
    bin_buf2 = struct.pack('d', buf2)  # 'd' stands for 'double'
    ret2 = struct.unpack('d', bin_buf2)
    print(bin_buf2, '<====>', ret2)

    # string <-> binary stream
    buf3 = b'Hello world'  # in Python 3, 's' requires a bytes object
    bin_buf3 = struct.pack('11s', buf3)  # '11s' stands for a character array of length 11
    ret3 = struct.unpack('11s', bin_buf3)
    print(bin_buf3, '<====>', ret3)

    # struct <-> binary stream
    # Suppose there is a C struct:
    # struct header {
    #     int buf1;
    #     double buf2;
    #     char buf3[11];
    # };
    bin_buf_all = struct.pack('id11s', buf1, buf2, buf3)
    ret_all = struct.unpack('id11s', bin_buf_all)
    print(bin_buf_all, '<====>', ret_all)

The output is as follows:

[Figure: demo1 output results]
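One detail of the demo is easy to miss: unpack returns a tuple even when the format describes a single value, and the buffer length has to match the format. The following is a minimal sketch of both points, assuming Python 3. The variable names are only illustrative, and the explicit '<' prefix is used so the bytes are the same on every machine.

```python
import struct

# Pack one integer with an explicit little-endian layout.
packed = struct.pack('<i', 1)
print(packed)  # b'\x01\x00\x00\x00'

# unpack always returns a tuple, even for a single value.
result = struct.unpack('<i', packed)
value = result[0]  # equivalently: (value,) = struct.unpack('<i', packed)

# The buffer length must match the format; otherwise struct.error is raised.
try:
    struct.unpack('<i', packed[:3])
    size_error = None
except struct.error as exc:
    size_error = exc
print(value, size_error)
```

Indexing the tuple (or unpacking it with a trailing comma) is the usual way to get at a single field.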

The struct module in detail

Main functions

The three most important functions in the struct module are pack(), unpack(), and calcsize().

# Pack the values into a byte stream (similar to a C struct) according to the given format string
data = struct.pack(fmt, v1, v2, ...)
# Parse a byte stream according to the given format (fmt) and return the parsed values as a tuple
values = struct.unpack(fmt, data)
# Calculate how many bytes of memory the given format (fmt) occupies
size = struct.calcsize(fmt)
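As a rough illustration of how the three functions fit together, the sketch below packs two records back to back and then uses calcsize to step through the buffer one record at a time. The format '<hf4s' and the field values are made up for this example and do not come from any real file format.

```python
import struct

fmt = '<hf4s'  # little-endian: a short, a float, and a 4-byte string
size = struct.calcsize(fmt)  # 2 + 4 + 4 = 10 bytes, no padding with '<'

# Two records stored one after another in a single byte stream.
records = struct.pack(fmt, 7, 0.5, b'abcd') + struct.pack(fmt, 8, 1.5, b'efgh')

# Use the record size as the offset step and unpack each slice.
parsed = [
    struct.unpack(fmt, records[offset:offset + size])
    for offset in range(0, len(records), size)
]
print(size, parsed)
```

The values 0.5 and 1.5 were chosen because they are exactly representable as 4-byte floats, so they survive the round trip unchanged; a value such as 3.1415 packed with 'f' would come back slightly different.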

Format strings in struct

The format characters supported by struct are listed in the following table:


Format  C type              Python type         Number of bytes
x       pad byte            no value            1
c       char                bytes of length 1   1
b       signed char         integer             1
B       unsigned char       integer             1
?       _Bool               bool                1
h       short               integer             2
H       unsigned short      integer             2
i       int                 integer             4
I       unsigned int        integer             4
l       long                integer             4
L       unsigned long       integer             4
q       long long           integer             8
Q       unsigned long long  integer             8
f       float               float               4
d       double              float               8
s       char[]              bytes               1 per character
p       char[]              bytes               1 per character
P       void *              integer             depends on the platform

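Note 1: in native mode, q and Q are available only if the platform C compiler supports long long (64-bit integers).

Note 2: each format character can be preceded by a number that gives its repeat count; for example, '4h' means the same as 'hhhh'.

Note 3: the s format represents a string of a fixed length, so '4s' is a string of length 4 (not four separate strings), whereas p represents a Pascal string, whose first byte stores the length.

Note 4: P is used to convert a pointer, whose length depends on the machine word size.

Note 5: the last format, P, represents a pointer type; it occupies 4 bytes on a 32-bit platform and 8 bytes on a 64-bit platform, and it is available only with native byte order.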
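The notes above can be checked with a few short calls. This is a small sketch assuming Python 3; the values are arbitrary and the variable names are only illustrative.

```python
import struct

# Note 2: a leading number is a repeat count, so '4h' equals 'hhhh'.
same_size = struct.calcsize('<4h') == struct.calcsize('<hhhh')
four_shorts = struct.pack('<4h', 1, 2, 3, 4)

# Note 3: for 's' the number is a length, not a repeat count.
padded = struct.pack('4s', b'ab')         # short input is padded with null bytes
truncated = struct.pack('4s', b'abcdef')  # long input is cut to 4 bytes

# 'p' is a Pascal string: the first byte holds the length,
# and the count (5 here) includes that length byte.
pascal = struct.pack('5p', b'ab')
unpacked_pascal = struct.unpack('5p', pascal)

# Notes 4 and 5: the size of 'P' follows the platform's pointer size.
pointer_size = struct.calcsize('P')

print(four_shorts, padded, truncated, pascal, unpacked_pascal, pointer_size)
```

Note that unpacking 's' returns the padding bytes as well, while unpacking 'p' returns only the bytes counted by the length prefix.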

To exchange data with structs in C, you also need to consider that C and C++ compilers align struct members in memory, usually on 4-byte boundaries on 32-bit systems. By default, struct therefore converts data using the local machine's byte order, sizes, and alignment. You can change this behavior with the first character of the format string, defined as follows:


Character  Byte order              Size and alignment
@          native                  native; members are padded to the platform's alignment (for example, to 4 bytes)
=          native                  standard; original number of bytes, no padding
<          little-endian           standard; original number of bytes, no padding
>          big-endian              standard; original number of bytes, no padding
!          network (= big-endian)  standard; original number of bytes, no padding

To use one of these characters, place it at the very start of fmt, for example '@5s6sif'.
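The effect of the prefix character is easiest to see by packing the same data with different prefixes. The sketch below is an illustration under stated assumptions: the header is built in memory rather than read from a real file, and it follows the layout of an MNIST IDX image file as I understand it (a magic number of 2051 followed by the image count, rows, and columns, all big-endian unsigned 32-bit integers).

```python
import struct

# The same value packed with different byte orders.
little = struct.pack('<I', 1)   # b'\x01\x00\x00\x00'
big = struct.pack('>I', 1)      # b'\x00\x00\x00\x01'
network = struct.pack('!I', 1)  # identical to big-endian

# Standard modes use fixed sizes and no padding: 4 + 8 + 11 = 23 bytes.
standard_size = struct.calcsize('=id11s')
# Native mode may insert padding before the double; the result is platform-dependent.
native_size = struct.calcsize('@id11s')

# Hypothetical in-memory header in the assumed IDX layout (not read from a file).
header = struct.pack('>IIII', 2051, 60000, 28, 28)
magic, count, rows, cols = struct.unpack('>4I', header)

# Reading the same bytes with the wrong byte order gives a wrong magic number.
wrong_magic = struct.unpack('<I', header[:4])[0]

print(little, big, standard_size, native_size)
print(magic, count, rows, cols, wrong_magic)
```

When a file format or network protocol fixes the byte order, it is generally safer to spell out '<', '>', or '!' than to rely on the native default, since the native layout can differ between machines.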
