Atitit. Base64 encoding principle and implementation design
1. Base64 encoding
1.1. Why use your own base64 encoding scheme
2. Base64 encoding origin
3. Base64 encoding principle
3.1. Specifically, the conversion method can be divided into four steps
3.2. Note
3.3. Padding
4. URL-safe Base64 encoding
1. Base64 encoding
1.1. Why use your own base64 encoding scheme?
To prevent JAR conflicts between the Apache codec library and the JDK.
2. Base64 encoding origin
Base64 was first used to solve a problem with email transmission.
The technical specification for traditional email was set in 1982; for details, see RFC 822. An important feature of this specification is that only printable ASCII characters can be used in email. As a result, non-English characters and binary files (such as pictures) cannot be transmitted by email directly.
Author: ★(Attilax), nickname: old wow's paw (full name: Attilax Akbar Al Rapanui Attila Akba Arla Panui), Chinese name: AI long, email: 1466519819@qq.com
If you reprint this article, please indicate the source: http://www.cnblogs.com/attilax/
3. Base64 encoding principle
To put it simply, Base64 encoding selects 64 characters from ASCII as its basic character set: the uppercase letters A-Z, the lowercase letters a-z, the digits 0-9, and the symbols "+" and "/" (counting the padding character "=", there are actually 65 characters). All other data is then converted into characters from this set.
3.1. Specifically, the conversion method can be divided into four steps:
1. Take every three bytes as one group of 24 binary bits: 3*8 = 24
2. Divide the 24 bits into four groups of 6 bits each: 24/4 = 6
3. Prepend two 0 bits to each group, so that the four groups together occupy 32 bits, that is, 4 bytes: 4*(6 + 2) = 32
4. Look up each expanded byte in the following encoding table; the resulting symbols are the Base64 encoded value.
Value Encoding   Value Encoding   Value Encoding   Value Encoding
 0    A          17    R          34    i          51    z
 1    B          18    S          35    j          52    0
 2    C          19    T          36    k          53    1
 3    D          20    U          37    l          54    2
 4    E          21    V          38    m          55    3
 5    F          22    W          39    n          56    4
 6    G          23    X          40    o          57    5
 7    H          24    Y          41    p          58    6
 8    I          25    Z          42    q          59    7
 9    J          26    a          43    r          60    8
10    K          27    b          44    s          61    9
11    L          28    c          45    t          62    +
12    M          29    d          46    u          63    /
13    N          30    e          47    v
14    O          31    f          48    w          (pad) =
15    P          32    g          49    x
16    Q          33    h          50    y
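The four steps and the table lookup can be sketched in Java for a single complete three-byte group. This is only an illustrative sketch, not the author's implementation: the class name `Base64GroupSketch`, the constant `TABLE`, and the method `encodeGroup` are hypothetical names introduced here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: encodes one full three-byte group by hand.
class Base64GroupSketch {
    // The 64-character encoding table: value 0 -> 'A', ..., value 63 -> '/'.
    static final String TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encodes exactly three bytes into four Base64 characters.
    static String encodeGroup(byte b0, byte b1, byte b2) {
        // Step 1: join the three bytes into one 24-bit value.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        StringBuilder out = new StringBuilder(4);
        // Steps 2 and 3: take four 6-bit groups, most significant first.
        // Masking with 0x3F keeps the two high bits of each value at 0.
        for (int shift = 18; shift >= 0; shift -= 6) {
            int value = (bits >> shift) & 0x3F;
            // Step 4: look the value up in the encoding table.
            out.append(TABLE.charAt(value));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] in = "Man".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encodeGroup(in[0], in[1], in[2])); // prints TWFu
    }
}
```

For the input "Man" the 24 bits split into the values 19, 22, 5 and 46, which the table maps to T, W, F and u.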
3.2. Note:
1. Because the two highest bits of each converted byte are 0, only 6 bits are actually significant, that is, 2^6 = 64 possible values, so the 64-character table covers every case.
2. If fewer than 3 bytes remain at the end, the missing bits are filled with 0 and "=" is written for each missing output character. Therefore, one or two "=" may appear at the end of the encoded text.
3. Because Base64 converts every three bytes into four bytes, Base64 encoded text is about 1/3 larger than the original.
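The size overhead in note 3 can be checked with a little arithmetic. The sketch below assumes padded output, where every started three-byte group produces four characters; the class `Base64LengthSketch` and the method `encodedLength` are hypothetical names used only for illustration.

```java
// Hypothetical sketch: length of padded Base64 output for n input bytes.
class Base64LengthSketch {
    // Every started group of three input bytes yields four output characters,
    // so the length is 4 * ceil(n / 3), written here with integer arithmetic.
    static long encodedLength(long n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        System.out.println(encodedLength(3));   // one full group
        System.out.println(encodedLength(4));   // one full group plus a padded tail
        System.out.println(encodedLength(300)); // 100 full groups
    }
}
```

For 300 input bytes this gives 400 characters, matching the "about 1/3 larger" estimate; for very short inputs the padding makes the relative overhead higher.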
3.3. Padding
Base64 encodes bytes in groups of three (24-bit blocks). If the number of bytes is not a multiple of three, the last group contains only one or two bytes and is processed according to the following rules:
1. In the case of one byte: split the eight bits of this byte into two groups, six bits in the first and two in the last. Besides the two leading zeros that every group gets, the last group also gets four trailing zeros. This yields a two-character Base64 encoding, and two "=" signs are then added at the end.
2. In the case of two bytes: split the 16 bits of the two bytes into three groups of six, six and four bits. Besides the two leading zeros, the last group also gets two trailing zeros. This yields a three-character Base64 encoding, and one "=" sign is added at the end.
[Figure: Base64 architecture]
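The two padding rules can be sketched as follows. This is a minimal illustration under the rules described above, not a complete encoder; `Base64TailSketch`, `TABLE` and `encodeTail` are hypothetical names.

```java
// Hypothetical sketch: encodes a final group of one or two bytes with '=' padding.
class Base64TailSketch {
    static final String TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static String encodeTail(byte[] tail) {
        if (tail.length == 1) {
            int b = tail[0] & 0xFF;
            // 8 bits -> groups of 6 and 2 bits; the 2 leftover bits are shifted
            // left by 4, which appends four trailing zeros.
            return "" + TABLE.charAt(b >> 2) + TABLE.charAt((b & 0x03) << 4) + "==";
        }
        if (tail.length == 2) {
            int bits = ((tail[0] & 0xFF) << 8) | (tail[1] & 0xFF);
            // 16 bits -> groups of 6, 6 and 4 bits; the 4 leftover bits are
            // shifted left by 2, which appends two trailing zeros.
            return "" + TABLE.charAt(bits >> 10)
                      + TABLE.charAt((bits >> 4) & 0x3F)
                      + TABLE.charAt((bits & 0x0F) << 2) + "=";
        }
        throw new IllegalArgumentException("tail must hold one or two bytes");
    }

    public static void main(String[] args) {
        System.out.println(encodeTail(new byte[] {'M'}));      // prints TQ==
        System.out.println(encodeTail(new byte[] {'M', 'a'})); // prints TWE=
    }
}
```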
4. URL-safe Base64 encoding
Because the '+' and '/' characters have special meanings in a URL, Base64 encoded data must be escaped with URL encoding before it is transmitted in a URL. However, this makes the URL harder to read and adds an extra URL encode/decode step. To avoid this problem, there is a Base64 variant designed for URLs: it simply changes '+' and '/' in standard Base64 to '-' and '_' respectively. As for the padding character '=', some variants remove it entirely and some replace it with '.'.
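Converting between the standard and URL-safe forms is a plain character substitution. The sketch below assumes the variant that drops the '=' padding entirely, which is only one of the variants mentioned above; `UrlSafeBase64Sketch`, `toUrlSafe` and `fromUrlSafe` are hypothetical names.

```java
// Hypothetical sketch: converts between standard and URL-safe Base64 text,
// assuming the variant that removes '=' padding.
class UrlSafeBase64Sketch {
    // Standard -> URL-safe: '+' becomes '-', '/' becomes '_', trailing '=' is dropped.
    static String toUrlSafe(String standard) {
        String s = standard.replace('+', '-').replace('/', '_');
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == '=') {
            end--;
        }
        return s.substring(0, end);
    }

    // URL-safe -> standard: reverse the substitution and restore '=' padding
    // until the length is a multiple of four. Input is assumed to be well formed.
    static String fromUrlSafe(String urlSafe) {
        StringBuilder sb = new StringBuilder(urlSafe.replace('-', '+').replace('_', '/'));
        while (sb.length() % 4 != 0) {
            sb.append('=');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUrlSafe("+/8="));  // prints -_8
        System.out.println(fromUrlSafe("-_8")); // prints +/8=
    }
}
```

Dropping the padding is safe here because the original length can be recovered from the text length modulo four; a variant that replaces '=' with '.' would need a different substitution.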