Python uses the base64 module for binary data encoding.

Source: Internet
Author: User

Python uses the base64 module for binary data encoding.

Preface

Yesterday, my team asked me about the POP3 protocol. So today I have studied the POP3 protocol format and the poplib in Python. Some of the data transmitted back by the POP server needs to be decoded using Base64, so I checked the base64 module in Python by the way.

This document describes the base64 module, which provides functions related to Base16, Base32, Base64, Base85, and Ascii85 encoding and decoding. The content of the poplib module will be published later. Well, I have dug another trap. I cannot fill it out in my life...

The following content is taken from example:

Due to historical reasons, some component systems on the Internet only support 7-bit character transfer, while the internal character transfer is 8-bit, when sending Chinese Characters in an email, it will change all 1 of the eighth dollar in the text into 0.
Take the Chinese character as an example. When the HEX is A4A4A4E5, when the highest bit is cleared, it becomes 24242465, that is, "$ e ". Telnet also has this issue.

In addition to the Chinese component, this issue also occurs when you use an electronic component to send audio clips, programs, and audio files. Therefore, in the case of electronic sub-devices, we usually solve this problem by using various batch copying methods. 8 bits will be processed according to certain rules, you can pass the 7-bit character verification system in good condition.

Normally, the firmware contains UU and MIME, while the MIME (Multipurpose Internet Mail Extentions) is generally converted to the multi-media delivery mode, it lists the cases that can be sent to multiple media, and can be sent together with various types of cases in a mail.

MIME defines two upload Methods: Base64 and QP (Quote-Printable, the QP rules mean that the 7bits in the data does not have to record the encode, And the 8bits is converted into 7 bits. QP was used in non-US-ASCII text content, such as our Chinese case, while Base64's simplified rules were to repeat the entire case, convert to 7 bits, which is used when sending binary data. The difference between the two modes affects the size of the case after the initial modification. Some of the cooperations will always use Base64 encoding.

Base64

The base64 module provides six functions for Base64 encoding and decoding, which can be divided into three groups.

Base64.b64encode (s, altchars = None)
Base64.b64decode (s, altchars = None, validate = False)

The parameter s represents the data to be encoded/decoded. The type of b64encode parameter s must be bytes ). The b64decode parameter s can be a byte packet (bytes) or a string (str ).

Base64 encoded data may contain '+' or '/'. If the encoded data is used in the url or file system path, a Bug may occur. Therefore, the base64 module provides a method to replace '+' and '/' in the encoded data.

The altchars parameter must be a 2-byte packet. These two symbols are used to replace '+' and '/' in the encoded data '/'. This parameter is set to None by default.

The default value of validate is False. If it is True, the base64 module first checks s for any non-base64 alphabet characters before decoding, and if so, throws the Error binascii. Error: Non-base64 digit found.

If the data length is Incorrect, an Error binascii. Error: Incorrect padding is thrown.

>>> import base64>>> x = base64.b64encode(b'test')>>> xb'dGVzdA=='>>> base64.b64decode(x)b'test'

Base64.standard _ b64encode (s)
Base64.standard _ b64decode (s)

This group of functions will directly pass the parameter s to the previous Group of functions.

Base64.urlsafe _ b64encode (s)
Base64.urlsafe _ b64decode (s)

These functions are also based on the first group of functions, but after encoding, they replace '+' and '/' in the output data with '-' and '_'. Before decoding, replace '-' and '_' in the data with '+' and '/'.

In addition, Base64 encoding also produces a symbol '=', which is used to fill the Data Length in multiples of 4.

Base32

Base64.b32encode (s)
Base64.b32decode (s, casefold = False, map01 = None)

The parameter s is consistent with Base64.

The Base32 encoded character range is [2-7a-z], which does not support lowercase letters. However, when the casefold parameter is True, Base32 decoding can accept lowercase letters. However, for the sake of security, the default value of this parameter is False.

Base32 decoding also allows you to replace 0 with the uppercase letter O and 1 with the uppercase letter I or L. The map01 parameter specifies the character to replace number 1 (the source code does not have to be one of the letters I or L). If this parameter is not set to None, the number 0 is always replaced with the letter O. For security reasons, this parameter is set to None by default.

Base16

Base64.b16encode (s)
Base64.b16decode (s, casefold = False)

The Base16 encoded character range is [0-9A-F].

Parameters s and casefold are used in the same way as Base32.

Base85

Base64.b85encode (B, pad = False)
Base64.b85decode (B)

Parameter B is the data used for encoding/decoding. The type must be the same as the parameter s of Base64.

When pad is set to True, B '\ 0' is used before encoding to fill the data with a multiple of 4. However, the filled data is not removed during decoding.

This group of functions is added after Python3.4.

Ascii85

Base64.a85encode (B, *, foldspaces = False, wrapcol = 0, pad = False, adobe = False)

Parameter B is the data used for encoding, And the type must be bytes.

When the foldspaces parameter is True, B 'y' is used to represent four consecutive spaces.

The wrapcol parameter is an integer. When wrapcol is not 0, this integer controls the number of characters in the output after encoding to add a linefeed B '\ n '.

When the pad parameter is set to True, data is prefixed with B '\ 0' to a multiple of 4. The filled data is not removed during decoding.

The parameter adobe specifies whether the data is in Adobe format. Adobe Ascii85 encoding data by <\~ And \ ~> If this parameter is set to True, the pair of symbols will be added to the returned data.

Base64.a85decode (B, *, foldspaces = False, adobe = False, ignorechars = B '\ t \ n \ r \ V ')

Parameter B is the data used for encoding. The type can be bytes or str.

When the foldspaces parameter is True, B 'y' is used to represent four consecutive spaces.

The parameter adobe specifies whether the data is in Adobe format. Adobe Ascii85 encoding data by <\~ And \ ~> If this parameter is set to True, base64 removes the pair before decoding.

The ignorechars parameter specifies the characters to be ignored during decoding. By default, all spaces in ASCII are included.

This group of functions is added after Python3.4.

As mentioned in the official documentation of the base64 module, Base85 and Ascii85 use 5 characters to encode 4 bytes, base64 uses 6 characters to encode 4 bytes (actually 4 characters to encode 3 bytes). When the space is not sufficient, the first two will be more efficient than Base64.

Old API

Base64 still retains some old APIs for some special purposes.

Base64.encode (input, output)
Base64.decode (input, output)

This set of functions uses a binary file as the data source and writes the encoded/decoded data to the binary file.

Base64.encodebytes (s)
Base64.decodebytes (s)

Both encodebytes and b64encode internally call the b2a_base64 of the binascii module, except that the newline parameter uses the default value True when encodebytes calls b2a_base64. That is to say, when the encodebytes outputs data, a linefeed B '\ n' is added for every 76 bytes '.

Decodebytes is basically the same as b64decode under the default parameter. Decodebytes only supports data of the bytes type.

Base64.encodestring (s)
Base64.decodestring (s)

This group of functions is discarded after Python3.1. Currently, the previous Group of functions will be called directly.

Summary

The base64 module provides an interface for encoding binary data, including the standard Base64, Base32, Base16, And the fact standard Ascii85 and Base85. By studying this module, I learned a lot about the details of binary data encoding. Sometimes we think that we know computers and the Internet. In fact, what everyone sees is nothing more than nothing. This field is still unknown to me, and I will not stop exploring it.

The above is all the details about how to use the base64 module for binary data encoding in Python. I hope it will be helpful to you. If you are interested, you can continue to refer to other related topics on this site. If you have any shortcomings, please leave a message. Thank you for your support!

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.