BASE64 Coding Principle Analysis

Source: Internet
Author: User

Base64 is one of the most common encodings for transmitting 8-bit byte data over a network. Before looking at how Base64 works, it helps to be clear about two basic concepts: bits and bytes.

Bit: a bit is the smallest unit of data in a computer. Each bit can only be 0 or 1.

Byte: 8 bits make up 1 byte, the basic unit for measuring storage space. 1 byte can store one English letter (in ASCII); in a double-byte encoding such as GBK, one Chinese character takes 2 bytes.

The role of BASE64 encoding

Some network transport channels do not support every byte value. Traditional email, for example, only supports the transmission of visible (printable) characters, so ASCII control characters cannot be sent by mail. This is a serious limitation, because an arbitrary binary stream contains bytes that are not visible characters. The practical solution is an extension scheme that supports binary data without changing the existing protocol: the invisible bytes are represented using visible characters. Base64 is such a scheme, representing binary data with 64 visible characters.

Extension: invisible characters are present in the data but are not displayed on the screen, for example line feed, carriage return, and backspace.

The principle of Base64 coding

Base64 can encode an ASCII string or arbitrary binary data so that the result contains only the characters A-Z, a-z, 0-9, + and / (26 uppercase letters, 26 lowercase letters, 10 digits, one + and one /, exactly 64 characters). 64 values can be fully represented with 6 bits, while a byte has 8 bits, so two bits are left over and are filled with 0. Each output character still occupies a full 8-bit byte, but the index it stands for uses only the 6 low-order bits; when that index is written as a byte, its 2 high-order bits are always 0.

The Base64 encoding rule is to regroup three 8-bit bytes (3 × 8 = 24 bits) into four 6-bit groups (4 × 6 = 24 bits), then prepend two 0 bits to each 6-bit group to form four 8-bit bytes. Each resulting byte therefore has a value in the range 0~63. Because 2 to the power of 6 equals 64, every 6 bits form one unit.
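The regrouping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `split_group` is hypothetical, and it handles exactly one full 3-byte group, with no padding logic.

```python
def split_group(three_bytes):
    """Split exactly 3 bytes (24 bits) into four 6-bit values (hypothetical helper)."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Pack the three bytes into one 24-bit integer, most significant byte first.
    n = int.from_bytes(three_bytes, "big")
    # Take 6 bits at a time, starting from the most significant end.
    return [(n >> shift) & 0b111111 for shift in (18, 12, 6, 0)]


values = split_group(b"SLF")
print(values)                               # [20, 52, 49, 6]
# Written as 8-bit bytes, each value has two leading 0 bits.
print([format(v, "08b") for v in values])
```

Masking with `0b111111` keeps only the low 6 bits, which is why every value falls in 0~63.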

Extension: 1. Why is the value range 0~63?

Recall how a binary number is converted to decimal:

Smallest value: binary 00000000 converts to decimal 0;

Largest value: binary 00111111 converts to decimal as follows:

0×2^7 + 0×2^6 + 1×2^5 + 1×2^4 + 1×2^3 + 1×2^2 + 1×2^1 + 1×2^0 = 63
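The same positional sum can be checked with a short Python sketch; the variable names here are illustrative only.

```python
bits = "00111111"

# Multiply each bit by its power of 2, counting positions from the right.
total = sum(int(b) * 2 ** power for power, b in enumerate(reversed(bits)))
print(total)          # 63

# The built-in conversion gives the same value.
print(int(bits, 2))   # 63
```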

Base64 converts every 3 bytes into 4 bytes, so the encoded data is about 1/3 larger than the original. If the input length is an exact multiple of 3, the output is exactly 1/3 larger. Otherwise, 1 or 2 bytes remain after the last full group of 3. The remaining bits are padded on the right with 0 bits until they fill a whole number of 6-bit groups (1 leftover byte, 8 bits, becomes 2 groups; 2 leftover bytes, 16 bits, become 3 groups), and two 0 bits are prepended to each group as before. The positions left empty in the final group of 4 are filled with "=" (two for 1 leftover byte, one for 2 leftover bytes), which ensures that the encoded length is always a multiple of 4.

2. Why must the encoded length be a multiple of 4?

Because Base64 encodes every 3 bytes as 4 characters and pads the final group with "=", the resulting length is always a multiple of 4.
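As a rough check of the size and padding rules, the sketch below compares two small formulas against Python's standard `base64` module. The function names `encoded_length` and `padding_count` are assumptions made for this illustration, not part of any library.

```python
import base64


def encoded_length(n_bytes):
    # Every group of up to 3 input bytes yields 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


def padding_count(n_bytes):
    # 0 leftover bytes -> no "=", 1 leftover -> "==", 2 leftover -> "=".
    return (3 - n_bytes % 3) % 3


for n in range(7):
    encoded = base64.b64encode(bytes(n))   # n zero bytes as sample input
    print(n, len(encoded), encoded_length(n), padding_count(n), encoded)
```

For every input length, the output length printed by the loop should be a multiple of 4.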

One of the main purposes of Base64 encoding is to represent arbitrary bytes using only visible characters. The input is first split into 6-bit groups (each with two 0 bits prepended), so every group has a value in the range 0~63. The Base64 encoding table is then used to map each value in 0~63 to a visible character. If no 0 bits were prepended (8-bit groups), or only one (7-bit groups), the value range would be 0~255 or 0~127, and the encoding table would have to be redefined with that many characters.

Extension: Why is the value range limited to 0~63 instead of 0~255 or 0~127?

Most likely because visible characters are limited: ASCII has only 95 printable characters, so a table of 128 or 256 visible characters is not possible, and 64 is the largest power of 2 that fits. Beyond that, it is simply the convention fixed by the Base64 rules.

In the Base64 encoding table, each value is the index of a character: indexes 0~25 are A-Z, 26~51 are a-z, 52~61 are 0-9, 62 is + and 63 is /. This is the standard Base64 table and is fixed.
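The standard index-to-character mapping can be written out in Python as a 64-character string, where a character's position is its index. The name `ALPHABET` is an illustrative choice.

```python
import string

# Index 0-25: A-Z, 26-51: a-z, 52-61: 0-9, 62: "+", 63: "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

print(len(ALPHABET))           # 64
print(ALPHABET[0])             # A
print(ALPHABET[26])            # a
print(ALPHABET[52])            # 0
print(ALPHABET[62], ALPHABET[63])   # + /
```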

Example:

Example 1:

Characters: SLF

Corresponding ASCII codes: S:83 L:76 F:70

Convert to the corresponding binary:

83:01010011, 76:01001100, 70:01000110

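To make the steps explicit:

Concatenate the 24 bits: 010100110100110001000110

Split into four 6-bit groups: 010100 110100 110001 000110

Prepend two 0 bits to each group: 00010100 00110100 00110001 00000110

Convert to decimal: 20 52 49 6

Look up the encoding table: U 0 x G

Conversion result: U0xG

An online Base64 encoder confirms that this result is correct.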

Example 2:

Character: M

Corresponding ASCII code: M:77

Convert to the corresponding binary:

77:01001101

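Conversion result: the 8 bits are padded with four 0 bits to give 12 bits, 010011 010000, that is 19 and 16, which map to T and Q. The two empty positions are filled with "=", giving TQ==.

An online Base64 encoder confirms that this result is correct.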
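Both examples can also be reproduced with a small hand-written encoder. This is a minimal sketch for illustration, assuming the standard alphabet; the name `b64encode_manual` is hypothetical, and in practice the standard `base64` module should be used instead.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64encode_manual(data):
    """Encode bytes to a Base64 string, 3 input bytes at a time (illustrative only)."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Right-pad a short final chunk with zero bytes so it fills 24 bits.
        n = int.from_bytes(chunk + bytes(3 - len(chunk)), "big")
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 meaningful characters; the rest become "=".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


print(b64encode_manual(b"SLF"))   # U0xG
print(b64encode_manual(b"M"))     # TQ==

# Cross-check against the standard library.
print(base64.b64encode(b"SLF").decode("ascii"))
```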

Summary: Base64 is not a real encryption method; it is only a conversion from binary data to characters. It is sometimes described as encryption simply because the encoded text is not readable at a glance.
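A short sketch with Python's standard `base64` module shows why: anyone can reverse the encoding, because no key is involved. The sample message is made up for the illustration.

```python
import base64

original = b"secret message"            # sample data, chosen only for illustration
encoded = base64.b64encode(original)
print(encoded)                           # looks unreadable at a glance

# Decoding needs no key or password, only the standard table.
decoded = base64.b64decode(encoded)
print(decoded)                           # b'secret message'
```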
