Tutorial on how to operate byte streams/binary streams using the struct module in Python

Source: Internet
Author: User
Recently, I was studying python network programming. when I was writing a simple socket communication code, I encountered the use of the struct module. at that time, I was not quite clear about the role of this module, later, I checked the relevant information and learned about it. This article mainly introduced the operations of the struct module on the binary stream of byte streams in Python. if you need it, you can refer to it for reference. Recently, I was studying python network programming. when I was writing a simple socket communication code, I encountered the use of the struct module. at that time, I was not quite clear about the role of this module, later, I checked the relevant information and got to know about it. This article mainly introduced the operations of the struct module on the byte stream/binary stream in Python. if you need it, you can refer to it for reference.

Preface

Recently, I used Python to parse the MNIST dataset in the IDX file format and needed to read the binary file. I used the struct module. I checked many tutorials on the Internet and wrote quite well, but it was not very friendly to new users. so I rearranged some notes for quick start.

Note:The following four terms are synonymous: binary stream, binary array, byte stream, and byte array.

Quick start

In the struct module, when an integer, floating point, or character stream (character array) is converted to a byte stream (byte array, you need to use the formatted string fmt to tell the struct module the type of the object to be converted. for example, the integer is 'I', the floating point number is 'f', and an ascii character is's '.

Def demo1 (): # use bin_buf = struct. pack (fmt, buf) sets buf as a binary array bin_buf # use buf = struct. unpack (fmt, bin_buf) returns the bin_buf binary array to the buf # integer-> binary stream buf1 = 256 bin_buf1 = struct. pack ('I', buf1) # 'I' indicates 'integer' ret1 = struct. unpack ('I', bin_buf1) print bin_buf1, '<===>', ret1 # floating point number-> binary stream buf2 = 3.1415 bin_buf2 = struct. pack ('D', buf2) # 'd' indicates 'double' ret2 = struct. unpack ('D', bin_buf2) print bin_buf2, '<===>', ret2 # string-> binary stream buf3 = 'Hello world' bin_buf3 = struct. pack ('11s', buf3) # '11s' indicates the 'string' character array ret3 = struct. unpack ('11s', bin_buf3) print bin_buf3, '<===>', ret3 # struct-> binary stream # assume there is a struct # struct header {# int buf1; # double buf2; # char buf3 [11]; #} bin_buf_all = struct. pack ('id11s', buf1, buf2, buf3) ret_all = struct. unpack ('id11s', bin_buf_all) print bin_buf_all, '<==>', ret_all

The output result is as follows:

Detailed description of struct module

Main functions

The three most important functions in the struct module are:pack(),unpack(),calcsize()

# Encapsulate data into a string (in fact, a byte stream similar to a c struct) based on the given formatted string. string = struct. pack (fmt, v1, v2 ,...) # parse the byte stream string according to the given format (fmt), and return the parsed tupletuple = unpack (fmt, string) # Calculate the given format (fmt) memory offset = calcsize (fmt)

Format string in struct

The following table lists the formats supported by struct:


Format C Type Python Bytes
X Pad byte No value 1
C Char String of length 1 1
B Signed char Integer 1
B Unsigned char Integer 1
? _ Bool Bool 1
H Short Integer 2
H Unsigned short Integer 2
I Int Integer 4
I Unsigned int Integer or lon 4
L Long Integer 4
L Unsigned long Long 4
Q Long Long 8
Q Unsigned long Long 8
F Float Float 4
D Double Float 8
S Char [] String 1
P Char [] String 1
P Void * Long  

Note 1: q and Q are only interesting when the machine supports 64-bit operations.

Note 2: There can be a number before each format, indicating the number

Note 3: The s format indicates a string of a certain length. 4s indicates a string of 4, but p indicates a pascal string.

Note 4: P is used to convert a pointer. Its length is related to the machine length.

Note 5: The last one can be used to indicate the pointer type, which occupies 4 bytes.

In order to exchange data with the struct in c, some c or c ++ compilers use byte alignment, which is usually a 32-bit system in 4 bytes, therefore, struct is converted in byte sequence based on the local machine. you can use the first character in the format to change the alignment. definition:


Character Byte order Size and alignment
@ Native Native makes up 4 bytes
= Native Standard is based on the number of original bytes
< Little-endian Standard is based on the number of original bytes
> Big-endian Standard is based on the number of original bytes
! Network (= big-endian) Standard is based on the number of original bytes

The usage is placed at the first position of fmt, like '@ 5s6sif'

For more information about how to use the struct module to operate byte streams and binary streams in Python, see PHP!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.