Initial Exploration of base64

Source: Internet
Author: User

What is base64?

RFC 2045 defines base64 as the Base64 Content-Transfer-Encoding. It is designed to represent arbitrary sequences of 8-bit bytes in a form that need not be humanly readable.

 

Why use base64?

I think the designers of this encoding had three main questions in mind:
1. Should the content be encrypted?
2. How complex and how efficient should the algorithm be?
3. How should transmission be handled?

Some concealment is certainly intended, but the goal is not to let users send truly secure e-mail. As the saying goes, this method "stops the honest but not the determined": the content simply should not be readable at a glance. Base64 uses no key and anyone can reverse it, so it is obfuscation rather than real encryption.
For that purpose the algorithm should be neither complex nor slow. As above, MIME and the other protocols used for e-mail solve the problem of how to send and receive mail, not how to do so securely. The algorithm should therefore be simple and efficient; otherwise sending mail would consume a great deal of resources, which would defeat the purpose.

Algorithm details

Base64 encoding converts three 8-bit bytes (3 * 8 = 24 bits) into four 6-bit groups (4 * 6 = 24 bits), then prepends two zero bits to each group so that each becomes a full 8-bit byte.
The conversion looks like this:
String "张3" (bytes D5 C5 33 in GBK)
Before:  11010101 11000101 00110011

After:   00110101 00011100 00010100 00110011
Table 1

Think of it this way: concatenate the three 8-bit bytes into one 24-bit string, 110101011100010100110011.
Take the first six bits and prepend two zeros to form a new byte. Then take the next six bits, prepend two zeros, and so on until all 24 bits have been used.
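The regrouping can be sketched with shifts and masks. This is only an illustration under stated assumptions: the function name split_sextets and its signature are invented for this example and do not come from the article.

```c
#include <stdint.h>

/* Split three 8-bit bytes into four 6-bit values. Each result is stored
   in its own byte, so its two high bits are always zero. */
void split_sextets(const uint8_t in[3], uint8_t out[4])
{
    out[0] = (uint8_t)(in[0] >> 2);                           /* bits 1-6   */
    out[1] = (uint8_t)(((in[0] & 0x03) << 4) | (in[1] >> 4)); /* bits 7-12  */
    out[2] = (uint8_t)(((in[1] & 0x0F) << 2) | (in[2] >> 6)); /* bits 13-18 */
    out[3] = (uint8_t)(in[2] & 0x3F);                         /* bits 19-24 */
}
```

For the bytes D5 C5 33 this produces the values 53, 28, 20 and 51, that is, the bit patterns 00110101, 00011100, 00010100 and 00110011.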

Let's take a look at the actual results:

String "张3" (GBK)
Input bytes:   11010101 (hex D5)   11000101 (hex C5)   00110011 (hex 33)

New bytes:     00110101   00011100   00010100   00110011
As character:  '5'        '^\'       '^T'       '3'
Decimal:       53         28         20         51
Table 2

So does base64 represent the string "张3" as "5^\^T3"? No!
Base64 does not output the converted bytes directly. The character '^\' is a control character that cannot be displayed, and in some contexts it cannot be used at all. Base64 instead has its own encoding table:

The base64 alphabet (Table 1 of RFC 2045)
Value Encoding   Value Encoding   Value Encoding   Value Encoding
    0 A             17 R             34 i             51 z
    1 B             18 S             35 j             52 0
    2 C             19 T             36 k             53 1
    3 D             20 U             37 l             54 2
    4 E             21 V             38 m             55 3
    5 F             22 W             39 n             56 4
    6 G             23 X             40 o             57 5
    7 H             24 Y             41 p             58 6
    8 I             25 Z             42 q             59 7
    9 J             26 a             43 r             60 8
   10 K             27 b             44 s             61 9
   11 L             28 c             45 t             62 +
   12 M             29 d             46 u             63 /
   13 N             30 e             47 v
   14 O             31 f             48 w          (pad) =
   15 P             32 g             49 x
   16 Q             33 h             50 y
Table 3

The 64 entries in this table are where the name base64 comes from. The encoded output is not the new byte itself, with its two high bits set to 0 and its low six bits carrying data. Instead, the decimal value of each new byte is looked up in the "Value" column of the table, and the character in the "Encoding" column is written out; value 0, for example, becomes the letter "A". We can therefore obtain the base64 encoding from Table 2:

String "张3" (GBK)
Input bytes:   11010101 (hex D5)   11000101 (hex C5)   00110011 (hex 33)

New bytes:     00110101   00011100   00010100   00110011
As character:  '5'        '^\'       '^T'       '3'
Decimal:       53         28         20         51
Base64:        '1'        'c'        'U'        'z'
Table 4

In this way, the string "张3" is encoded as the string "1cUz".
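The table lookup itself is a plain array index. The sketch below assumes the four 6-bit values have already been computed; the names BASE64_CHARS and sextets_to_chars are made up for illustration.

```c
#include <stdint.h>

/* The 64-character alphabet from RFC 2045; the array index is the
   6-bit value being encoded. */
static const char BASE64_CHARS[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/";

/* Map four 6-bit values (0..63) to their base64 characters.
   out receives four characters plus a terminating NUL. */
void sextets_to_chars(const uint8_t values[4], char out[5])
{
    int i;
    for (i = 0; i < 4; i++)
        out[i] = BASE64_CHARS[values[i] & 0x3F]; /* mask keeps the index in range */
    out[4] = '\0';
}
```

Passing the values 53, 28, 20 and 51 gives the characters '1', 'c', 'U' and 'z'.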
Base64 turns every three bytes into four, so the encoded data (measured in bytes, here and below) is about 1/3 larger than the original. It is "about" because the increase is exactly 1/3 only when the input length is a multiple of three. What if it is not?
Careful readers may have noticed the (pad) = entry at the end of the base64 alphabet. That character exists to solve this problem.
When the input length is not a multiple of 3, the remainder of length / 3 is 1 or 2. During conversion, a final group that has fewer than six bits is filled out with zero bits on the right, and two zeros are then prepended as usual. The output positions that are still empty are filled with "=". For example, if the two bytes of "张" are all that is left:

String "张" (GBK)
Input bytes:   11010101 (hex D5)   11000101 (hex C5)

New bytes:     00110101   00011100   00010100
Decimal:       53         28         20         (pad)
Base64:        '1'        'c'        'U'        '='
Table 6

In this way, the last two bytes are encoded as "1cU=".
Similarly, if only one byte is left over, two "=" are appended. These are the only two cases, so a base64 encoding ends with at most two "=".
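The size arithmetic described here fits in two one-line functions. Both names are hypothetical helpers introduced for this sketch; they count characters of base64 output only and ignore the line breaks that MIME inserts into long encoded text.

```c
#include <stddef.h>

/* Number of base64 characters produced for n input bytes: every started
   group of three bytes yields four characters, padding included. */
size_t base64_encoded_length(size_t n)
{
    return 4 * ((n + 2) / 3);
}

/* Number of '=' characters at the end of the encoding of n bytes:
   0 when n is a multiple of 3, 1 when two bytes are left over,
   2 when one byte is left over. */
size_t base64_pad_count(size_t n)
{
    return (3 - n % 3) % 3;
}
```

For example, 3 input bytes give 4 characters with no padding, 2 bytes give 4 characters ending in one "=", and 1 byte gives 4 characters ending in two "=".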
Decoding base64 is simply the inverse of encoding, and you can work through it yourself. The decoding algorithm is given at the end of the article.

 

Algorithm Implementation
With the details above, the algorithm is essentially clear. Leaving boundary handling aside, a program can encode in the following steps:
Read 3 bytes of data. Use AND to take the first 6 bits of the first byte and shift them right by two bits into a new variable, so that its two high bits are 0. Then use AND to take the last 2 bits of the first byte and the first 4 bits of the second byte, and shift them into another new variable whose two high bits are again 0. Continue in the same way for the remaining bits.
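The steps above, together with the padding rule, can be sketched as a complete encoder. This is a minimal illustration and not the author's code: the function name base64_encode, its signature, and the ENCODE_TABLE array are assumptions made for this example, and the caller is assumed to supply an output buffer of at least 4 * ((len + 2) / 3) + 1 characters.

```c
#include <stddef.h>
#include <stdint.h>

static const char ENCODE_TABLE[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/";

/* Encode len bytes from data into out as NUL-terminated base64 text.
   Returns the number of characters written, not counting the NUL. */
size_t base64_encode(const uint8_t *data, size_t len, char *out)
{
    size_t i = 0, o = 0;

    /* Full groups: three input bytes become four characters. */
    while (len - i >= 3) {
        out[o++] = ENCODE_TABLE[data[i] >> 2];
        out[o++] = ENCODE_TABLE[((data[i] & 0x03) << 4) | (data[i + 1] >> 4)];
        out[o++] = ENCODE_TABLE[((data[i + 1] & 0x0F) << 2) | (data[i + 2] >> 6)];
        out[o++] = ENCODE_TABLE[data[i + 2] & 0x3F];
        i += 3;
    }

    /* Leftover bytes: missing bits are zero, empty positions become '='. */
    if (len - i == 1) {
        out[o++] = ENCODE_TABLE[data[i] >> 2];
        out[o++] = ENCODE_TABLE[(data[i] & 0x03) << 4];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        out[o++] = ENCODE_TABLE[data[i] >> 2];
        out[o++] = ENCODE_TABLE[((data[i] & 0x03) << 4) | (data[i + 1] >> 4)];
        out[o++] = ENCODE_TABLE[(data[i + 1] & 0x0F) << 2];
        out[o++] = '=';
    }

    out[o] = '\0';
    return o;
}
```

Calling it on the bytes D5 C5 33 writes "1cUz"; on the first two bytes only it writes "1cU=", and on the first byte only it writes "1Q==".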
The decoding algorithm, implemented in C:

    #include <stdlib.h>

    typedef unsigned char BYTE;
    typedef unsigned long DWORD;

    /* Number of '=' padding characters seen by the last base64decode() call. */
    int m_padnum = 0;

    /* Returns base << (movenum - 1); with base == 2 this is 2 to the power movenum. */
    BYTE lmovebit(int base, int movenum)
    {
        if (movenum == 0) return 1;
        return (BYTE)(base << (movenum - 1));
    }

    /* Index 0..63 is the 6-bit value of each character; index 64 is the pad. */
    char base64_alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789+/=";

    /* Decodes base64length characters (assumed to be a multiple of 4) into a
       malloc'ed buffer of base64length * 3 / 4 bytes, of which the last
       m_padnum bytes are unused. The caller frees the buffer.
       Returns NULL if the allocation fails. */
    BYTE *base64decode(char *base64code, DWORD base64length)
    {
        char buf[4];
        int i, j, k;
        BYTE temp1[4], temp2;
        BYTE *buffer = malloc(base64length * 3 / 4);
        DWORD base64b;

        if (buffer == NULL) return NULL;
        m_padnum = 0;
        for (base64b = 0; base64b < base64length / 4; base64b++)
        {
            /* Map each of the four characters to its index in the alphabet. */
            for (i = 0; i < 4; i++)
            {
                buf[i] = *(base64code + (base64b * 4) + i);
                temp1[i] = 0; /* characters outside the alphabet decode as 0 */
                for (j = 0; j < 65; j++)
                {
                    if (buf[i] == base64_alphabet[j])
                    {
                        temp1[i] = j;
                        break;
                    }
                }
            }
            i--; /* i == 3: index of the last character in the group */
            /* Rebuild the three bytes, from the last one to the first. */
            for (k = 1; k < 4; k++)
            {
                if (temp1[i - (k - 1)] == 64) { m_padnum++; continue; }
                /* Drop the bits already used by the following byte. */
                temp1[i - (k - 1)] = temp1[i - (k - 1)] / lmovebit(2, (k - 1) * 2);
                /* Take the low 2 * k bits of the previous value ... */
                temp2 = temp1[i - k];
                temp2 = temp2 & (lmovebit(2, k * 2) - 1);
                /* ... and move them to the top of the byte. */
                temp2 *= lmovebit(2, 8 - (2 * k));
                temp1[i - (k - 1)] = temp1[i - (k - 1)] + temp2;
                buffer[base64b * 3 + (3 - k)] = temp1[i - (k - 1)];
            }
        }
        return buffer;
    }
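The decoder above returns a buffer but does not say how many of its bytes are valid. A caller could work that out from the encoded text alone, as in the small sketch below. base64_decoded_length is a hypothetical helper, not part of the original code, and it assumes well-formed input whose length is a multiple of 4.

```c
#include <stddef.h>

/* Number of bytes that len characters of base64 text decode to.
   Returns 0 when len is zero or not a multiple of 4. */
size_t base64_decoded_length(const char *code, size_t len)
{
    size_t pad = 0;

    if (len == 0 || len % 4 != 0)
        return 0;
    /* Each trailing '=' stands for one byte that was absent from the input. */
    if (code[len - 1] == '=') pad++;
    if (code[len - 2] == '=') pad++;
    return len / 4 * 3 - pad;
}
```

For "1cUz" it returns 3, for "1cU=" it returns 2, and for "1Q==" it returns 1.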

Using this algorithm, the e-mail content given at the beginning of the article decodes to:
Hello, snaix

This is a base64 test email!

Java Algorithm:

    package javamail;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.Base64;

    public class Base64Util {
        public static void main(String[] args) throws IOException {
            // java.util.Base64 (Java 8 and later) stands in here for
            // sun.misc.BASE64Encoder, an internal, unsupported class that
            // is absent from newer JDKs.
            Base64.Encoder encoder = Base64.getEncoder();
            // One reader serves both prompts, so buffered input is not lost.
            BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
            System.out.println("Please input Username:");
            String username = reader.readLine();
            System.out.println(encoder.encodeToString(username.getBytes()));
            System.out.println("Please input password:");
            String passwd = reader.readLine();
            System.out.println(encoder.encodeToString(passwd.getBytes()));
        }
    }
