A simple way to read and write binary files using Python (recommended)


It can seem that Python itself has no support for binary data, but it provides a module that fills this gap: the struct module.

Python 2 has no dedicated binary type, but it can still hold binary data by storing it in an ordinary string, which works because each character of such a string occupies 1 byte. In Python 3 the same role is played by the bytes type, and struct.pack returns a bytes object.

import struct

a = 12.34

# Convert a to binary ('d' packs it as an 8-byte double)
bytes = struct.pack('d', a)

At this point bytes is a byte string (a bytes object in Python 3) whose contents are the same bytes as the binary representation of a.

The operation can also be reversed.

Given existing binary data in bytes, convert it back into a Python data type:

a, = struct.unpack('d', bytes)

Note that unpack returns a tuple.

So if only one variable was packed:

bytes = struct.pack('d', a)

then it has to be decoded like this:

a, = struct.unpack('d', bytes)    or    (a,) = struct.unpack('d', bytes)

If a = struct.unpack('d', bytes) is used directly, then a = (12.34,) is a tuple rather than the original floating-point number.
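The single-value round trip can be sketched as follows in Python 3. This is a minimal illustration, not the author's exact code: the format 'd' (an 8-byte double, which matches a Python float exactly) and the variable names are assumptions chosen for the example.

```python
import struct

value = 12.34

# Pack the float as an 8-byte C double ('d'); the result is a bytes object.
packed = struct.pack('d', value)

# unpack always returns a tuple, even when the format has a single field.
as_tuple = struct.unpack('d', packed)

# A trailing comma on the left-hand side unpacks the one-element tuple.
restored, = struct.unpack('d', packed)

print(type(packed).__name__, len(packed))  # bytes 8
print(as_tuple)                            # (12.34,)
print(restored)                            # 12.34
```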

If the data is made up of several values, you can do this:

a = b'hello'
b = b'world!'
c = 2
d = 45.123
bytes = struct.pack('5s6sif', a, b, c, d)

At this point bytes holds the data in binary form and can be written directly to a file, for example binfile.write(bytes).

Later, when the data is needed again, it can be read back with bytes = binfile.read()

and decoded into Python variables with struct.unpack():

a, b, c, d = struct.unpack('5s6sif', bytes)
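Putting the write and read steps together might look like the sketch below. It assumes Python 3, where the 's' format expects bytes objects, and it uses a temporary file as a stand-in for the article's binfile; the path and the variable names are illustrative only.

```python
import os
import struct
import tempfile

fmt = '5s6sif'  # 5-byte string, 6-byte string, int, float
record = (b'hello', b'world!', 2, 45.123)

# A temporary file stands in for the article's binfile (hypothetical path).
fd, path = tempfile.mkstemp(suffix='.bin')
os.close(fd)

try:
    # Write the packed record in binary mode.
    with open(path, 'wb') as binfile:
        binfile.write(struct.pack(fmt, *record))

    # Read the raw bytes back, again in binary mode.
    with open(path, 'rb') as binfile:
        raw = binfile.read()

    a, b, c, d = struct.unpack(fmt, raw)
finally:
    os.remove(path)

print(a, b, c)
# 'f' is single precision, so d is only approximately 45.123.
print(round(d, 3))
```

Because 'f' stores a 4-byte float, the value read back differs slightly from 45.123; using 'd' instead would preserve the Python float exactly at the cost of 4 extra bytes.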

The string '5s6sif' is called fmt, the format string. It is made up of format characters, each with an optional repeat count: 5s represents a 5-character string, 2i represents 2 integers, and so on. The available format characters are listed below; each C type corresponds to one Python type.

Format  C type              Python type        Number of bytes
x       pad byte            no value           1
c       char                bytes of length 1  1
b       signed char         integer            1
B       unsigned char       integer            1
?       _Bool               bool               1
h       short               integer            2
H       unsigned short      integer            2
i       int                 integer            4
I       unsigned int        integer            4
l       long                integer            4
L       unsigned long       integer            4
q       long long           integer            8
Q       unsigned long long  integer            8
f       float               float              4
d       double              float              8
s       char[]              bytes              1
p       char[]              bytes              1
P       void *              integer

The last one, P, represents a pointer type. It takes 4 bytes on a 32-bit system (8 bytes on a 64-bit system) and is only available in native mode.
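The sizes in the table can be checked with struct.calcsize. The sketch below uses the '<' prefix so that the standard sizes are reported regardless of platform; without a prefix, native sizes apply and some entries (for example l and L on 64-bit Linux) may be larger than the table shows.

```python
import struct

# '<' selects standard sizes, so the results do not depend on the platform.
for code in 'cbB?hHiIlLqQfd':
    print(code, struct.calcsize('<' + code))

# A repeat count multiplies the field: '2i' is two ints, '5s' is one 5-byte string.
two_ints = struct.calcsize('<2i')
five_byte_string = struct.calcsize('<5s')

# 'P' is only accepted in native mode; its size is the platform's pointer size.
pointer_size = struct.calcsize('P')

print(two_ints, five_byte_string, pointer_size)
```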

To exchange data with structs in C, struct also has to account for the byte alignment used by C and C++ compilers (typically 4-byte alignment on a 32-bit system). It therefore provides the following prefix characters:

Character  Byte order              Size and alignment
@          native                  native, padded up to a multiple of 4 bytes
=          native                  standard, original number of bytes
<          little-endian           standard, original number of bytes
>          big-endian              standard, original number of bytes
!          network (= big-endian)  standard, original number of bytes

To use one of these characters, put it in the first position of fmt, for example '@5s6sif'.
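The effect of the prefix character can be seen in a short sketch. The format strings 'I' and 'ci' are arbitrary choices for illustration; the native size printed for '@ci' depends on the platform and compiler, so only the standard size is fixed.

```python
import struct

# Byte order: the same 32-bit value laid out little-endian and big-endian.
little = struct.pack('<I', 1)   # b'\x01\x00\x00\x00'
big = struct.pack('>I', 1)      # b'\x00\x00\x00\x01'
network = struct.pack('!I', 1)  # network order is big-endian

# Alignment: native mode may insert padding between fields,
# standard mode never does.
native_size = struct.calcsize('@ci')    # typically 8: char + 3 pad bytes + int
standard_size = struct.calcsize('=ci')  # always 5: char + int

print(little, big, network)
print(native_size, standard_size)
```

When a file or network format is defined independently of the machine that produces it, an explicit '<', '>' or '!' prefix is usually the safer choice, since the layout then does not change between platforms.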

-----Binary file processing issues-----

When working with binary files, open them in one of the following ways:

binfile = open(filepath, 'rb')    to read a binary file

or

binfile = open(filepath, 'wb')    to write a binary file

So how does this differ from binfile = open(filepath, 'r')?

There are two differences:

First, in 'r' mode the byte 0x1A is treated as the end of the file (EOF). This is the behavior of text mode on Windows, where Python 2 relies on the C runtime; 'rb' has no such problem. In other words, if a file written in binary mode is read back in text mode and contains 0x1A, only part of the file is read. With 'rb', reading always continues to the real end of the file.

Second, for the string x = 'abc\ndef', len(x) gives a length of 7. The \n is the line feed character, which is actually 0x0A. When the string is written in 'w' (text) mode on Windows, 0x0A is automatically converted into two bytes, 0x0D 0x0A, so the file length becomes 8. When the file is read in 'r' (text) mode, the pair is automatically converted back to the original line feed. If the string is written in 'wb' (binary) mode, the single byte is kept unchanged and is read back as is. So if you write in text mode and read in binary mode, you have to account for this extra byte. 0x0D is also known as the carriage return character.
Linux makes no such change, because Linux uses only 0x0A to mark a line break.
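Both differences can be observed with a short Python 3 sketch. Two assumptions are worth stating: the newline='\r\n' argument is used here to force the Windows-style translation on any platform, and the temporary file path is purely illustrative. In Python 3, text mode also decodes the bytes into str, which is a further reason to use 'rb' and 'wb' for binary data.

```python
import os
import tempfile

text = 'abc\ndef'  # 7 characters; '\n' is the single byte 0x0A

fd, path = tempfile.mkstemp()
os.close(fd)

try:
    # newline='\r\n' forces the Windows-style translation on any platform.
    with open(path, 'w', newline='\r\n') as f:
        f.write(text)

    # Binary mode shows the bytes exactly as stored: 0x0D 0x0A, 8 bytes in total.
    with open(path, 'rb') as f:
        raw = f.read()

    # Text mode with default newline handling maps '\r\n' back to '\n'.
    with open(path, 'r') as f:
        round_tripped = f.read()

    # Binary mode passes 0x1A through untouched in both directions.
    with open(path, 'wb') as f:
        f.write(b'abc\x1adef')
    with open(path, 'rb') as f:
        with_ctrl_z = f.read()
finally:
    os.remove(path)

print(len(text), len(raw), raw)
print(round_tripped == text)
print(len(with_ctrl_z))
```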

