MIME mail encoding methods

Source: Internet
Author: User
[Reprint] From: http://book.csdn.net/bookfiles/402/10040214756.shtml

Each ASCII character occupies one byte (8 bits) whose most significant bit is always 0, so the actual information is carried in the low seven bits. The traditional SMTP protocol was designed around ASCII characters, and some SMTP servers built on it therefore process only the low 7 bits of each byte of mail content and ignore the highest bit. Such a server can seriously corrupt mail content that contains non-ASCII characters. As a result, only ASCII (English) characters could safely appear in mail; Chinese characters and binary data were not allowed.
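The effect of such a 7-bit server can be illustrated with a short Python sketch. The function name `strip_high_bit` and the sample strings are assumptions made for illustration; the code only simulates clearing the highest bit of each byte and is not taken from any real SMTP server.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Simulate a 7-bit transport: clear the most significant bit of every byte."""
    return bytes(b & 0x7F for b in data)


ascii_text = "Hello".encode("ascii")
chinese_text = "中国".encode("gb2312")  # the four bytes D6 D0 B9 FA

# Pure ASCII passes through unchanged because its highest bit is already 0.
survives = strip_high_bit(ascii_text) == ascii_text

# Non-ASCII bytes lose their highest bit and turn into unrelated ASCII characters.
corrupted = strip_high_bit(chinese_text)

print(survives)   # True
print(corrupted)  # b'VP9z'
```

The damage cannot be undone by the receiver, because nothing in the result records which bytes originally had the highest bit set.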

To include non-ASCII content such as Chinese characters, images, or sound in a mail message, the sender converts the data into printable ASCII characters with some encoding method before sending, and the mail reader restores the original data using the corresponding decoding method. The two commonly used mail encoding methods are base64 and quoted-printable. Later, the extended SMTP protocol (ESMTP) allowed non-ASCII (8-bit) data to be transmitted directly in the mail without being encoded. Data carried this way is said to use the 8bit encoding; to distinguish it, mail that consists purely of ASCII characters and undergoes no encoding is said to use the 7bit encoding. The encoding of a MIME message body is specified by the Content-Transfer-Encoding header field in the MIME message header. Each encoding method is described below:

-7bit

The message body consists entirely of unencoded ASCII characters.

-8bit

The message body contains non-ASCII characters as unencoded raw data. Most mail servers now support 8bit encoding, and using a server that supports it simplifies mail processing.

-Base64

Base64 is the most common encoding method for converting binary data into printable ASCII characters. Its basic principle is to split a run of consecutive bytes into 6-bit groups and represent each group with one ASCII character. Six bits can represent 2^6 = 64 values, so 64 ASCII characters are used to correspond to these 64 values. The 64 ASCII characters are:

"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

The value of each character is its index in the sequence above, starting from 0. Assume that memory contains the following three consecutive bytes:

[0110,0001] [0110,0010] [0110,0011]

The 6-bit grouping format is as follows:

[0110,00] [01,0110] [0010,01] [10,0011]

Grouping yields four 6-bit groups whose decimal values are 24, 22, 9, and 35, which correspond to the characters Y, W, J, and j respectively. Therefore, the base64 encoding of the three bytes [0110,0001] [0110,0010] [0110,0011] is "YWJj".
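The grouping step can be reproduced with a minimal Python sketch. The names `ALPHABET` and `encode_triplet` are illustrative assumptions; the result is compared against Python's standard `base64` module as a cross-check.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_triplet(three_bytes: bytes) -> str:
    """Encode exactly three bytes as four base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    # Pack the three bytes into one 24-bit integer.
    value = int.from_bytes(three_bytes, "big")
    # Cut the 24 bits into four 6-bit groups, most significant group first.
    indexes = [(value >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Each 6-bit value is an index into the 64-character alphabet.
    return "".join(ALPHABET[i] for i in indexes)


data = bytes([0b01100001, 0b01100010, 0b01100011])  # the ASCII bytes of "abc"
encoded = encode_triplet(data)

print(encoded)                                            # YWJj
print(encoded == base64.b64encode(data).decode("ascii"))  # True
```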

Base64 encoding converts every three 8-bit bytes (24 bits) into four 6-bit groups (also 24 bits). If the number of 8-bit bytes is not divisible by 3, one or two bytes remain at the end. How are these remaining bytes processed? They are still grouped by 6 bits, and 0 bits are appended after the last group to fill it up to six bits. For example, if the single remaining byte is:

[0110,0001]

The result of grouping is as follows:

[0110,00] [01,0000]

The four trailing 0 bits in the second group are the padding bits. The base64 encoding of the remaining byte is therefore "YQ". Base64 also requires that if the number of characters in the encoded text is not a multiple of 4, "=" characters be appended to make it one. Two "=" characters are therefore added to the result, giving "YQ==". Likewise, if two 8-bit bytes remain, they are encoded into three characters and one "=" character is appended. When encoding a large block of data with base64, line breaks (carriage return plus line feed) can be inserted at suitable positions in the encoded result. The MIME standard limits each line of base64 output to at most 76 characters.
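The padding and line-wrapping rules can be put together in a small Python sketch. The functions `b64_encode` and `wrap` are hypothetical helpers written for this illustration, not a library API; in practice the standard `base64` module does the same work.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_encode(data: bytes) -> str:
    """Base64-encode data of any length, padding the result with '='."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1 or 2 bytes short of a full group
        # Fill the chunk with zero bytes so it can be cut into 6-bit groups.
        value = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(value >> s) & 0x3F] for s in (18, 12, 6, 0)]
        # Characters that came only from filler bytes are replaced by '='.
        if missing:
            chars[-missing:] = ["="] * missing
        out.append("".join(chars))
    return "".join(out)


def wrap(text: str, width: int = 76) -> str:
    """Insert CRLF so that no line is longer than `width` characters."""
    return "\r\n".join(text[i:i + width] for i in range(0, len(text), width))


print(b64_encode(b"a"))    # YQ==
print(b64_encode(b"ab"))   # YWI=
print(b64_encode(b"abc"))  # YWJj

sample = bytes(range(256))
wrapped = wrap(b64_encode(sample))
```

Decoders ignore the inserted line breaks, so `wrapped` still decodes to the original 256 bytes.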

-Quoted-printable

Quoted-printable is also a way to convert binary data into printable ASCII characters. It does not convert ASCII characters, but only encodes non-ASCII characters. Each byte of non-ASCII characters is converted into a "=" followed by the hexadecimal data of this byte. For example, the Quoted-printable encoding result of "AB China" is "AB = d6 = d0 = b9 = fa ". Obviously, because "=" has special significance in Quoted-printable encoding, the "=" character in the original data also needs to be converted, it is represented by "= 3d.

When quoted-printable encoding is applied to a large block of data, a line break (carriage return plus line feed) can be inserted at a suitable position in the encoded result, preceded by an extra "=" character. The "=" indicates that the line break that follows is a soft line break introduced by the encoding rather than a line break in the original data. For example, consider the following quoted-printable data:

=D5=E2=CA=C7=CD=A8=D0=C5=B5=C4=B3=CC=D0=
=F2,=C7=EB=D6=B8=BD=CC!

The "=" character at the end of the first line and the line feed are generated after encoding.
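How a mail reader uses the Content-Transfer-Encoding header field to undo this encoding can be sketched with Python's standard `email` package. The message below is hypothetical: its headers, and in particular the `charset=gb2312` parameter, are assumptions added for illustration, and only the two body lines come from the example above.

```python
import email

# A hypothetical message whose body is the two-line example above.
raw = (
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=gb2312\r\n"
    "Content-Transfer-Encoding: quoted-printable\r\n"
    "\r\n"
    "=D5=E2=CA=C7=CD=A8=D0=C5=B5=C4=B3=CC=D0=\r\n"
    "=F2,=C7=EB=D6=B8=BD=CC!\r\n"
)

msg = email.message_from_string(raw)
encoding = msg["Content-Transfer-Encoding"]  # names the decoding to apply

# decode=True applies the decoding named by the header field.
decoded = msg.get_payload(decode=True).rstrip(b"\r\n")

# The soft line break is gone: the bytes D0 and F2 are adjacent again.
joined = b"\xd0\xf2" in decoded

# Interpreting the bytes as text relies on the assumed charset parameter.
text = decoded.decode(msg.get_content_charset())
print(encoding, joined)
```

The decoded bytes contain no line break at all, which confirms that the "=" at the end of the first line and the line break after it were not part of the original data.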
