This article explains how Base64 works and implements it in PHP; it may be a useful reference if you are interested in PHP tutorials. Most developers are familiar with Base64 encoding, but not all of them understand it clearly. In fact, Base64 is about as simple as an encoding can get, so there is no reason to leave your understanding of it vague. Let's walk through the essentials; it takes only a few minutes to understand them fully. The article ends with a PHP implementation of a Base64 encoder that you can experiment with while reading.
I. Base64 encoding origin
Why does Base64 encoding exist? Because some network transmission channels do not support every byte value. Traditional email, for example, supports only visible (printable) characters, so bytes such as ASCII control characters cannot be sent through it. This is a serious limitation: the bytes of a binary stream cannot all be printable characters, so binary data cannot be transmitted directly. The best solution is an extension scheme that supports transferring binary files without changing the traditional protocol, by representing non-printable bytes with printable characters. This is how Base64 encoding came into being. Base64 is a way of representing binary data using 64 printable characters.
II. Base64 encoding principles
Let's look at the Base64 index table, which contains 64 printable characters: A-Z, a-z, 0-9, "+", and "/". Each character has an index: A-Z are 0-25, a-z are 26-51, 0-9 are 52-61, "+" is 62, and "/" is 63. This mapping is specified by the Base64 standard and cannot be changed in standard Base64. 64 values can be expressed in 6 bits, but a byte has 8 bits, so the remaining two bits are wasted and we have to sacrifice some space. The point to understand here is that each Base64 character occupies 8 bits but carries only 6 bits of information: in its index value, only the 6 bits on the right are significant, and the 2 bits on the left are always 0.
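The index table can be sketched in PHP as a 64-character string in which each character's position is its index. The helper names `b64Char` and `b64Index` are hypothetical and exist only for this illustration.

```php
<?php
// Standard Base64 alphabet; the position of each character is its index (0-63).
$alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: map a 6-bit index to its Base64 character.
function b64Char(int $index, string $alphabet): string
{
    if ($index < 0 || $index > 63) {
        throw new InvalidArgumentException('index must be in 0..63');
    }
    return $alphabet[$index];
}

// Hypothetical helper: map a Base64 character back to its index.
function b64Index(string $char, string $alphabet): int
{
    if (strlen($char) !== 1) {
        throw new InvalidArgumentException('expected a single character');
    }
    $pos = strpos($alphabet, $char);
    if ($pos === false) {
        throw new InvalidArgumentException('not a Base64 character');
    }
    return $pos;
}

echo strlen($alphabet), "\n";         // 64
echo b64Char(0, $alphabet), "\n";     // A
echo b64Char(26, $alphabet), "\n";    // a
echo b64Char(63, $alphabet), "\n";    // /
echo b64Index('T', $alphabet), "\n";  // 19
```

Every index fits in 6 bits (0 to 63), which is why 64 characters are enough.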
How can characters with six significant bits represent ordinary 8-bit bytes? The least common multiple of 8 and 6 is 24, so three ordinary bytes can be represented by exactly four Base64 characters with the same number of significant bits (3 × 8 = 4 × 6 = 24). The output is therefore one third larger than the input, which is the price of having only 6 significant bits per character. You could also represent one ordinary byte with two Base64 characters, but grouping by the least common multiple wastes the least space. An example makes this easier to follow. "Man" is a three-character string with 24 bits in total: 01001101 01100001 01101110. Four Base64 characters are needed to hold these 24 bits, so the bits are regrouped into four 6-bit values: 010011 010110 000101 101110, that is, 19, 22, 5, and 46. Looking these indexes up in the Base64 index table gives T, W, F, and u, so the Base64 encoding of "Man" is "TWFu". You may have noticed a rule here: the smallest unit of conversion is three bytes. A string is converted three bytes at a time, and each group of three bytes produces four Base64 characters. It really is that simple.
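The three-bytes-to-four-characters step for "Man" can be sketched with bit shifts. This is a minimal illustration, not the article's implementation; `encodeTriple` is a hypothetical helper name.

```php
<?php
$alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: encode exactly three bytes into four Base64 characters.
function encodeTriple(string $three, string $alphabet): string
{
    if (strlen($three) !== 3) {
        throw new InvalidArgumentException('expected exactly 3 bytes');
    }
    // Pack the three bytes into one 24-bit integer.
    $n = (ord($three[0]) << 16) | (ord($three[1]) << 8) | ord($three[2]);
    // Slice the 24 bits into four 6-bit indexes, most significant first.
    $out = '';
    for ($shift = 18; $shift >= 0; $shift -= 6) {
        $out .= $alphabet[($n >> $shift) & 0x3F];
    }
    return $out;
}

echo encodeTriple('Man', $alphabet), "\n"; // TWFu
echo base64_encode('Man'), "\n";           // TWFu (PHP's built-in, for comparison)
```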
But what if fewer than three bytes are left at the end? This is where the earlier observation becomes useful: two Base64 characters can represent one byte, and three Base64 characters can represent two bytes. Take "A" (01000001) as an example. The first Base64 character takes six bits (010000), which leaves only two bits (01) for the second one, so four zeros are appended (010000). Both values are 16, so "A" is encoded as "QQ". As mentioned above, Base64 output is organized in groups of four characters, so when there are only two characters, two "=" are appended, giving "QQ==". In fact, decoding works without the "="; the likely reason for it is to prevent confusion when several separately encoded Base64 strings are joined together. It follows that "=" can only appear at the end of a Base64 string, once or twice, and never in the middle. The two-character string "BC" is encoded in the same way: its 16 bits fill three Base64 characters with two zero bits appended, and one "=" completes the group, giving "QkM=".
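The handling of a short final group can be sketched as follows: zero-fill the bits up to a multiple of 6, then pad the output to four characters with "=". The helper name `encodeTail` is hypothetical.

```php
<?php
$alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: encode a final group of 1 or 2 bytes, with "=" padding.
function encodeTail(string $tail, string $alphabet): string
{
    $len = strlen($tail);
    if ($len < 1 || $len > 2) {
        throw new InvalidArgumentException('expected 1 or 2 bytes');
    }
    // Turn the bytes into a bit string, e.g. "A" => "01000001".
    $bits = '';
    for ($i = 0; $i < $len; $i++) {
        $bits .= str_pad(decbin(ord($tail[$i])), 8, '0', STR_PAD_LEFT);
    }
    // Zero-fill on the right up to a multiple of 6 bits (8 -> 12, 16 -> 18).
    $bits = str_pad($bits, (int) ceil(strlen($bits) / 6) * 6, '0');
    // Map each 6-bit group to its Base64 character.
    $out = '';
    foreach (str_split($bits, 6) as $group) {
        $out .= $alphabet[bindec($group)];
    }
    // Complete the group of four characters with "=".
    return str_pad($out, 4, '=');
}

echo encodeTail('A', $alphabet), "\n";  // QQ==
echo encodeTail('BC', $alphabet), "\n"; // QkM=
```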
III. Summary
Calling Base64 an "encoding" may seem a little odd, because with most encodings, converting characters to binary is called encoding and converting binary back to characters is called decoding. Base64 reverses this: converting binary to characters is called encoding, and converting characters back to binary is called decoding.
Base64 encoding is mainly used for transmission, storage, and the textual representation of binary data. It can also be used as a very simple form of encryption, although this is really only obfuscation: the output is unreadable at a glance, but nothing more. You can also use a custom order for the Base64 characters for this purpose.
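A custom character order can be sketched by translating the output of the built-in `base64_encode()` with `strtr()`. The reversed alphabet and the names `customEncode` and `customDecode` are assumptions made for this illustration; any permutation of the 64 characters would work, and none of them provides real security.

```php
<?php
$standard = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
// Hypothetical custom order: the standard alphabet reversed.
$custom = strrev($standard);

// Encode normally, then swap each standard character for its custom counterpart.
// The "=" padding is not part of either alphabet, so it is left untouched.
function customEncode(string $data, string $standard, string $custom): string
{
    return strtr(base64_encode($data), $standard, $custom);
}

// Swap the characters back, then decode normally.
function customDecode(string $text, string $standard, string $custom): string
{
    return base64_decode(strtr($text, $custom, $standard));
}

$secret = customEncode('Man', $standard, $custom);
echo $secret, "\n";                                    // differs from the standard "TWFu"
echo customDecode($secret, $standard, $custom), "\n";  // Man
```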
Base64 encoding starts from binary data, not from characters. When Chinese text is converted to binary using different character encodings, the resulting bytes differ, so the resulting Base64 strings differ too. For example, the Base64 encoding of "上网" ("surfing the Internet") is "5LiK572R" in UTF-8 and "yc/N+A==" in GB2312.
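The effect of the character encoding can be checked with PHP's built-in `base64_encode()`. In this sketch the bytes of "上网" are written as hex escapes, so the result does not depend on how the source file is saved; the variable names are illustrative.

```php
<?php
// The same two characters, "上网", as raw bytes in two character encodings.
$utf8   = "\xE4\xB8\x8A\xE7\xBD\x91"; // UTF-8: three bytes per character
$gb2312 = "\xC9\xCF\xCD\xF8";         // GB2312: two bytes per character

echo base64_encode($utf8), "\n";   // 5LiK572R
echo base64_encode($gb2312), "\n"; // yc/N+A==
```

Six input bytes split evenly into two groups of three, so the UTF-8 result needs no padding; four bytes leave a final group of one byte, hence the "==".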
The following PHP function implements the encoding process described above. The last two lines compare its output with PHP's built-in base64_encode():
function base64encode($str)
{
    // Standard Base64 alphabet; the position of each character is its index (0-63).
    $base64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    // Padding for a final group of 1, 2, or 3 bytes.
    $pad = ["==", "=", ""];
    $len = strlen($str);
    $rstr = "";
    // Process the input three bytes at a time.
    for ($j = 0; $j * 3 < $len; $j++) {
        $item = substr($str, $j * 3, 3);
        $itemlen = strlen($item);
        // Concatenate the bytes of this group as 8-bit binary strings.
        $eightbit = "";
        for ($i = 0; $i < $itemlen; $i++) {
            $eightbit .= str_pad(decbin(ord($item[$i])), 8, "0", STR_PAD_LEFT);
        }
        // Zero-fill on the right so a short final group still splits into 6-bit pieces.
        $eightbit = str_pad($eightbit, ($itemlen + 1) * 6, "0");
        // n bytes produce n + 1 Base64 characters.
        for ($i = 0; $i <= $itemlen; $i++) {
            $sixbit = substr($eightbit, $i * 6, 6);
            $rstr .= $base64[bindec($sixbit)];
        }
        $rstr .= $pad[$itemlen - 1];
    }
    return $rstr;
}

echo base64_encode("Maxwelldu"), "\n"; // built-in: TWF4d2VsbGR1
$r = base64encode("Maxwelldu");
echo $r, "\n";                         // this implementation: TWF4d2VsbGR1
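The reverse direction can be sketched in the same bit-string style as the encoder above. The function `base64decode` below is a hypothetical counterpart written for illustration; it is not PHP's built-in `base64_decode()` and does no validation beyond rejecting characters outside the alphabet.

```php
<?php
// Hypothetical counterpart to the encoder: Base64 text back to raw bytes.
function base64decode(string $str): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    // Drop the trailing "=" padding; it carries no data.
    $str = rtrim($str, '=');
    // Turn each character into its 6-bit index, written as a bit string.
    $bits = '';
    for ($i = 0, $len = strlen($str); $i < $len; $i++) {
        $index = strpos($alphabet, $str[$i]);
        if ($index === false) {
            throw new InvalidArgumentException("invalid Base64 character at offset $i");
        }
        $bits .= str_pad(decbin($index), 6, '0', STR_PAD_LEFT);
    }
    // Regroup into whole bytes; leftover bits (fewer than 8) are the zero fill.
    $out = '';
    for ($i = 0; $i + 8 <= strlen($bits); $i += 8) {
        $out .= chr(bindec(substr($bits, $i, 8)));
    }
    return $out;
}

echo base64decode('TWFu'), "\n";         // Man
echo base64decode('QQ=='), "\n";         // A
echo base64decode('QkM='), "\n";         // BC
echo base64decode('TWF4d2VsbGR1'), "\n"; // Maxwelldu
```

Because the padding is simply stripped, this sketch also decodes "QQ" to "A", which illustrates that the "=" is not needed to recover the data.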
This article has introduced the principles of Base64 and implemented the encoding in PHP. I hope it is helpful to anyone interested in PHP tutorials.