Base64 encoding
Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation.
Base64 uses 64 printable characters to represent arbitrary binary data. It is an encoding, not encryption: no key is involved and anyone can decode it, so it must not be relied on to keep data secret.
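To illustrate why Base64 offers no secrecy, here is a minimal sketch using Python's standard `base64` module. The sample message is an arbitrary placeholder chosen for this example.

```python
import base64

# Arbitrary sample payload (an assumption for illustration only).
message = b"attack at dawn"

# Encoding needs no key...
encoded = base64.b64encode(message)

# ...and neither does decoding, so anyone who sees the text can reverse it.
recovered = base64.b64decode(encoded)

print(encoded.decode("ascii"))
print(recovered == message)
```

Because the mapping is fixed and public, data that must stay confidential needs real encryption before (or instead of) Base64.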
Why use Base64 encoding?
To store non-character data in a text context, for example an image embedded in a data URL (data:image/png;base64,...). Likewise, when a system can only store ASCII characters, data that is not ASCII can be encoded from its binary form into Base64 before it is stored.
Byte values from 128 to 255 do not correspond to printable ASCII characters. When such bytes are sent over a plain-text protocol, they may be mishandled, for example treated as control characters, and the transmission can fail. Encoding all data as visible characters reduces the chance of such errors.
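As a rough sketch of the data URL case, the snippet below wraps a few raw bytes in a `data:image/png;base64,` prefix. The payload is only the 8-byte PNG file signature, used here as a stand-in for real image data, so the result is not a displayable image.

```python
import base64

# Stand-in payload: the 8-byte PNG file signature, not a complete image.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Base64 turns the binary bytes into ASCII text that can sit inside a URL.
data_url = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")

print(data_url)  # data:image/png;base64,iVBORw0KGgo=
```

Several of the raw bytes (0x89, 0x0D, 0x0A, 0x1A) are not printable ASCII, yet the resulting string contains only printable characters.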
Base64 encoding
Base64 encoding table
Code value | Character | Code value | Character | Code value | Character | Code value | Character
0 | A | 26 | a | 52 | 0 | 62 | +
... | ... | ... | ... | ... | ... | 63 | /
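The full 64-entry table can be generated rather than written out. The sketch below builds the standard alphabet from Python's `string` constants; the name `ALPHABET` is chosen for this example only.

```python
import string

# Standard Base64 alphabet: A-Z (0-25), a-z (26-51), 0-9 (52-61), then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Print the boundary entries of each range.
for index in (0, 25, 26, 51, 52, 61, 62, 63):
    print(index, ALPHABET[index])
```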
Encoding example:
Source ASCII (if < 128) | M | a | n
Source octets | 77 (0x4d) | 97 (0x61) | 110 (0x6e)
Bit pattern | 01001101 | 01100001 | 01101110
6-bit groups | 010011 | 010110 | 000101 | 101110
Index | 19 | 22 | 5 | 46
Base64-encoded | T | W | F | u
Encoded octets | 84 (0x54) | 87 (0x57) | 70 (0x46) | 117 (0x75)
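The table can be reproduced step by step. This is a minimal sketch for a single 3-byte group, assuming the input is the three octets 77, 97, 110 ("Man"); `ALPHABET`, `group`, and `indices` are names chosen for this example.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

octets = b"Man"  # 77, 97, 110

# Join the three octets into one 24-bit integer.
group = int.from_bytes(octets, "big")

# Cut the 24 bits into four 6-bit indices, most significant first.
indices = [(group >> shift) & 0x3F for shift in (18, 12, 6, 0)]

# Look each index up in the encoding table.
encoded = "".join(ALPHABET[i] for i in indices)

print(f"{group:024b}")  # 010011010110000101101110
print(indices)          # [19, 22, 5, 46]
print(encoded)          # TWFu

# Cross-check against the standard library.
assert encoded == base64.b64encode(octets).decode("ascii")
```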
If the length of the binary data is not a multiple of 3, one or two bytes are left over after grouping. Base64 pads the final group with \x00 bytes, and then places one or two equal signs at the end of the output to indicate how many bytes were added.
When one byte is left over: append two \x00 bytes to form a complete group, and convert it to four Base64 characters according to the rules above. Each 6-bit value is written as a byte whose two high bits are fixed to 0. The last two characters, which come entirely from the padding, are written as two "=" signs, indicating that two bytes were added.
For example, the letter "M" is a single byte. Its two data groups are 00010011 and 00010000 (indices 19 and 16), which correspond to the Base64 characters T and Q. Two "=" signs are then appended, so the Base64 encoding of "M" is TQ==.
Text content | M | (\x00 added to complete the group) | (\x00 added to complete the group)
ASCII | 77 (0x4d) | 0 (0x00) | 0 (0x00)
Bit pattern | 01001101 | 00000000 | 00000000
6-bit groups | 010011 | 010000 | 000000 | 000000
Index | 19 | 16 | 0 | 0
Base64-encoded | T | Q | = | =
When two bytes are left over: append one \x00 byte to form a complete group, convert it to four Base64 characters according to the rules above, and write the last character as a single "=".
For example, the string "Ma" is two bytes, which yield three data groups: 00010011, 00010110, and 00000100 (indices 19, 22, and 4). The corresponding Base64 characters are T, W, and E. One "=" is then appended, so the Base64 encoding of "Ma" is TWE=.
Text content | M | a | (\x00 added to complete the group)
ASCII | 77 (0x4d) | 97 (0x61) | 0 (0x00)
Bit pattern | 01001101 | 01100001 | 00000000
6-bit groups | 010011 | 010110 | 000100 | 000000
Index | 19 | 22 | 4 | 0
Base64-encoded | T | W | E | =
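Both padding cases can be folded into one small encoder. The following is an illustrative sketch of the padding rule, not a replacement for the standard library; `b64encode_sketch` and `ALPHABET` are hypothetical names introduced here.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64encode_sketch(data: bytes) -> str:
    """Encode bytes as Base64, padding the last group with \\x00 and '='."""
    out = []
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short of a full group
        # Fill the group with \x00 bytes so it is always 24 bits wide.
        group = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters produced only by padding bytes are written as "=".
        if missing:
            chars[-missing:] = "=" * missing
        out.extend(chars)
    return "".join(out)

print(b64encode_sketch(b"M"))    # TQ==
print(b64encode_sketch(b"Ma"))   # TWE=
print(b64encode_sketch(b"Man"))  # TWFu

# Cross-check against the standard library.
for sample in (b"", b"M", b"Ma", b"Man", bytes(range(256))):
    assert b64encode_sketch(sample) == base64.b64encode(sample).decode("ascii")
```

The number of "=" signs equals the number of \x00 bytes added, which lets a decoder discard exactly that many bytes from the final group.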
Base64 encoding of Chinese characters
First obtain the binary data for the text, and then convert it according to the preceding rules. Different Chinese character encodings produce different bytes for the same character, so the encoder and decoder must agree on one encoding; otherwise garbled characters may result.
For example, the UTF-8 encoding of the character "严" ("strict") is the three bytes E4 B8 A5, or 11100100 10111000 10100101 in binary, which converts to the Base64 value 5Lil.