About Base64 encoding

Author: Tang Feng

Base64 is an old encoding method that is widely used in communication and is easy to implement.

What?

"Base64 is a representation of binary data based on 64 printable characters" (from Wikipedia). I did not understand this sentence at first; here is how I now explain it. Communication data streams can be divided into two types: "binary streams" and "text streams". (Note that the definitions that follow are not rigorous.) A text stream is a string of data consisting of human-readable characters. Values such as 0x00, 0x0a, and 0x0d in such a stream are generally special control data (end of text, line feed, carriage return, and so on) rather than content. A binary stream is any string of data in which each byte can take any value from 0x00 to 0xff, not only character values. Base64 is an encoding that uses 64 printable characters (A-Z, a-z, 0-9, "+", and "/") to represent arbitrary binary data. That is, any binary stream encoded with Base64 becomes a text stream made up only of those 64 visible characters, plus "=" used as padding at the end.
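To make the "only 64 visible characters" property concrete, here is a minimal sketch that checks whether a string could be Base64 text. The names kBase64Alphabet and LooksLikeBase64Text are made up for this illustration and are not part of any library; the check only looks at the character set and does not validate length or padding position.

```cpp
#include <string>

// The 64 characters of the standard Base64 alphabet, in index order.
// (kBase64Alphabet is a hypothetical name used only in this sketch.)
static const std::string kBase64Alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Returns true when every character of `text` is one of the 64 alphabet
// characters or the '=' padding character. This is a character-set check
// only; it does not verify the length or where the padding appears.
bool LooksLikeBase64Text(const std::string& text) {
    for (char c : text) {
        if (c != '=' && kBase64Alphabet.find(c) == std::string::npos) {
            return false;
        }
    }
    return true;
}
```

A string that contains a control character such as a line feed (0x0a) fails this check, which is exactly the kind of byte that Base64 output never contains.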

For the complete Base64 definition, see RFC 1421 and RFC 2045.

Why?

What is Base64 used for? A Base64-encoded data stream is longer than the original (about 4/3 of the original length; see How? for the reason). So why use this encoding at all? (This is what I did not understand at first.)
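The 4/3 growth can be written as a small formula. The sketch below assumes padded output, where every group of up to three input bytes produces exactly four characters; Base64EncodedLength is a name chosen for this example only.

```cpp
#include <cstddef>

// Number of Base64 characters produced for `input_size` bytes, assuming
// the output is padded with '=' to a multiple of four characters.
// (Base64EncodedLength is a hypothetical helper for illustration.)
std::size_t Base64EncodedLength(std::size_t input_size) {
    // Round the byte count up to whole 3-byte groups, then 4 chars per group.
    return (input_size + 2) / 3 * 4;
}
```

For example, 300 bytes become 400 characters, while a single byte still costs a full group of 4 characters.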

The reason is that communication devices are designed in many different ways. For example, some devices can only process or transmit 7-bit data, and some communication modules can only handle text streams. To make sure that arbitrary binary data (8-bit bytes) can still pass between such devices, the data is re-encoded into a text stream consisting only of printable characters. The printable characters chosen for Base64 are supported on almost every device (they come from the US-ASCII character set). Another common scenario is data that is entered or displayed through a console (or another input/output device). Because these devices only support text, binary data must be converted to and from a printable character set.

How?

Base64 encoding is actually very simple. The binary data stream is split into groups of three bytes (24 bits). Each 24-bit group is divided into four 6-bit values, and two leading zero bits are added to each 6-bit value, which gives four 8-bit values (four bytes), each between 0 and 63. These values are then converted to printable characters using a predefined conversion table. If the last group has fewer than three bytes, it is filled with zero bytes, and the output characters that carry no data are replaced with "=".
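As a worked example of the grouping step, the sketch below splits one 3-byte group into its four 6-bit values. SplitGroup is a hypothetical helper written for this illustration. For the ASCII text "Man" (bytes 0x4d, 0x61, 0x6e) the 24 bits are 010011 010110 000101 101110, that is, the values 19, 22, 5, and 46, which the conversion table maps to "TWFu".

```cpp
#include <array>
#include <cstdint>

// Splits one 3-byte group (24 bits) into four 6-bit values, each 0..63.
// (SplitGroup is a hypothetical name used only in this sketch.)
std::array<std::uint8_t, 4> SplitGroup(std::uint8_t b0, std::uint8_t b1,
                                       std::uint8_t b2) {
    return {
        // Top 6 bits of the first byte.
        static_cast<std::uint8_t>(b0 >> 2),
        // Low 2 bits of the first byte, then top 4 bits of the second.
        static_cast<std::uint8_t>(((b0 & 0x03) << 4) | (b1 >> 4)),
        // Low 4 bits of the second byte, then top 2 bits of the third.
        static_cast<std::uint8_t>(((b1 & 0x0f) << 2) | (b2 >> 6)),
        // Low 6 bits of the third byte.
        static_cast<std::uint8_t>(b2 & 0x3f),
    };
}
```

Each returned value is an index into the 64-character conversion table, so no further arithmetic is needed to produce the output characters.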

The Wikipedia article referenced earlier also explains this clearly.

Reference implementation:

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <unordered_map>
#include <vector>

std::vector<char> Bin2Base64(std::vector<uint8_t> const& _bin) {
    // Conversion table: the index is the 6-bit value, the entry is its character.
    static char const convert_table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::vector<char> output;
    // Every group of up to 3 input bytes produces 4 output characters.
    output.reserve((_bin.size() + 2) / 3 * 4);

    // Split 3 bytes (24 bits) into four 6-bit values and look each one up.
    auto con_3bytes_to_4bytes = [&](uint8_t const* _3bytes_bin_data) {
        output.push_back(convert_table[_3bytes_bin_data[0] >> 2]);
        output.push_back(convert_table[((_3bytes_bin_data[0] & 0x03) << 4) | ((_3bytes_bin_data[1] >> 4) & 0x0f)]);
        output.push_back(convert_table[((_3bytes_bin_data[1] << 2) & 0x3c) | (_3bytes_bin_data[2] >> 6)]);
        output.push_back(convert_table[_3bytes_bin_data[2] & 0x3f]);
    };

    // Encode all complete 3-byte groups.
    std::size_t i = 0;
    for (; (i + 3) <= _bin.size(); i += 3) {
        con_3bytes_to_4bytes(&_bin[i]);
    }

    // Encode the remaining 1 or 2 bytes: fill the group with zero bytes,
    // then overwrite the characters that carry no data with '='.
    if (i != _bin.size()) {
        uint8_t left_data[3] = {0};
        std::copy(std::begin(_bin) + i, std::end(_bin), left_data);
        con_3bytes_to_4bytes(left_data);
        std::fill(std::rbegin(output), std::rbegin(output) + (3 - (_bin.size() % 3)), '=');
    }

    return output;
}

std::vector<uint8_t> Base64ToBin(std::vector<char> const& _base64stream) {
    std::vector<uint8_t> output;
    // Valid padded Base64 text is a non-empty multiple of 4 characters;
    // anything else is rejected by returning an empty result.
    if (_base64stream.empty() || _base64stream.size() % 4 != 0) {
        return output;
    }
    output.reserve(_base64stream.size() * 3 / 4);

    // Reverse conversion table: character to 6-bit value. '=' decodes to 0.
    static std::unordered_map<char, uint8_t> const convert_table = {
        {'A', 0}, {'B', 1}, {'C', 2}, {'D', 3}, {'E', 4}, {'F', 5}, {'G', 6}, {'H', 7},
        {'I', 8}, {'J', 9}, {'K', 10}, {'L', 11}, {'M', 12}, {'N', 13}, {'O', 14}, {'P', 15},
        {'Q', 16}, {'R', 17}, {'S', 18}, {'T', 19}, {'U', 20}, {'V', 21}, {'W', 22}, {'X', 23},
        {'Y', 24}, {'Z', 25}, {'a', 26}, {'b', 27}, {'c', 28}, {'d', 29}, {'e', 30}, {'f', 31},
        {'g', 32}, {'h', 33}, {'i', 34}, {'j', 35}, {'k', 36}, {'l', 37}, {'m', 38}, {'n', 39},
        {'o', 40}, {'p', 41}, {'q', 42}, {'r', 43}, {'s', 44}, {'t', 45}, {'u', 46}, {'v', 47},
        {'w', 48}, {'x', 49}, {'y', 50}, {'z', 51}, {'0', 52}, {'1', 53}, {'2', 54}, {'3', 55},
        {'4', 56}, {'5', 57}, {'6', 58}, {'7', 59}, {'8', 60}, {'9', 61}, {'+', 62}, {'/', 63},
        {'=', 0},
    };

    for (std::size_t i = 0; i < _base64stream.size(); i += 4) {
        // at() throws std::out_of_range for characters outside the table.
        auto const byte1 = convert_table.at(_base64stream[i]);
        auto const byte2 = convert_table.at(_base64stream[i + 1]);
        auto const byte3 = convert_table.at(_base64stream[i + 2]);
        auto const byte4 = convert_table.at(_base64stream[i + 3]);

        // Rejoin four 6-bit values into three 8-bit bytes.
        output.push_back(static_cast<uint8_t>((byte1 << 2) | (byte2 >> 4)));
        output.push_back(static_cast<uint8_t>(((byte2 & 0x0f) << 4) | (byte3 >> 2)));
        output.push_back(static_cast<uint8_t>(((byte3 & 0x03) << 6) | byte4));
    }

    // Each '=' in the final group stands for one byte that was never in the input.
    output.erase(std::end(output) - std::count(std::rbegin(_base64stream), std::rbegin(_base64stream) + 4, '='),
                 std::end(output));

    return output;
}
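Both functions above depend on the "=" padding at the end of the text: the encoder writes it for an incomplete final group, and the decoder counts it to know how many trailing bytes to drop. As a minimal sketch of that rule, the hypothetical helper Base64PaddingCount below returns how many "=" characters the padded encoding of a given number of bytes ends with.

```cpp
#include <cstddef>

// Number of '=' characters at the end of the padded Base64 encoding of
// `input_size` bytes: 0 when the size is a multiple of 3, 2 when one byte
// is left over, and 1 when two bytes are left over.
// (Base64PaddingCount is a hypothetical name used only in this sketch.)
std::size_t Base64PaddingCount(std::size_t input_size) {
    return (3 - input_size % 3) % 3;
}
```

This is why padded Base64 text never ends with more than two "=" characters.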
