Python uses the email module to encode and decode emails.

Source: Internet
Author: User

Decode emails
The email module that ships with Python can both encode and decode emails, which makes it very useful for processing mail.
Processing mail is meticulous work, especially decoding, because mail formats vary so much. Let's take a look at the source of a mail:

Received: from 192.168.208.56 ( 192.168.208.56 [192.168.208.56] ) by ajax-webmail-wmsvr37 (Coremail) ; Thu, 12 Apr 2007 12:07:48 +0800 (CST)
Date: Thu, 12 Apr 2007 12:07:48 +0800 (CST)
From: user1 <xxxxxxxx@163.com>
To: zhaowei <zhaoweikid@163.com>
Message-ID: <31571419.200911176350868321.JavaMail.root@bj163app37.163.com>
Subject: =?gbk?B?u+nJtA==?=
MIME-Version: 1.0
Content-Type: multipart/Alternative; boundary="----=_Part_21696_28113972.1176350868319"

------=_Part_21696_28113972.1176350868319
Content-Type: text/plain; charset=gbk
Content-Transfer-Encoding: base64

ztLS0b+qyrzS1M6qysfSu7j20MfG2ru70ru0zqOs1K3AtMrH0ru49tTCtffSu7TOztLDx8/W1NrTprjDysew67XjssXE3MjI1ebC6bezICAg
------=_Part_21696_28113972.1176350868319
Content-Type: text/html; charset=gbk
Content-Transfer-Encoding: quoted-printable

<DIV>=CE=D2=D2=D1=BF=AA=CA=BC=D2=D4=CE=AA=CA=C7=D2=BB=B8=F6=D0=C7=C6=DA=BB=
=BB=D2=BB=B4=CE=A3=AC=D4=AD=C0=B4=CA=C7=D2=BB=B8=F6=D4=C2=B5=F7=D2=BB=B4=CE=
</DIV>
<DIV>=CE=D2=C3=C7=CF=D6=D4=DA=D3=A6=B8=C3=CA=C7=B0=EB=B5=E3=B2=C5=C4=DC=C8=
=C8</DIV>
<DIV>=D5=E6=C2=E9=B7=B3</DIV>
------=_Part_21696_28113972.1176350868319--

The above is the source of the mail. Everything from the first line to the first blank line is the mail header; what follows is the mail body. Copy the text above and save it to a file named xxx.eml. Double-click the file to view the content. What you see there has, of course, already been decoded by Outlook.
Let's see how the email module handles this mail. Assume that it has been saved as xxx.eml.

# -*- coding: gb2312 -*-
import email
import email.header
import email.utils

# Open in binary mode so the parser sees the raw bytes of the mail
with open("xxx.eml", "rb") as fp:
    msg = email.message_from_binary_file(fp)  # create the message object directly from the file

subject = msg.get("subject")  # take the Subject field from the message header

# The following lines are only needed to decode subjects such as =?gbk?Q?=CF=E0=C6=AC?=
raw, charset = email.header.decode_header(subject)[0]
if isinstance(raw, bytes):
    subject = raw.decode(charset or "ascii", errors="replace")
else:
    subject = raw

print("subject:", subject)
print("from:", email.utils.parseaddr(msg.get("from"))[1])  # get the sender address
print("to:", email.utils.parseaddr(msg.get("to"))[1])  # get the recipient address

This code parses the subject, sender, and recipient of an email. email.utils.parseaddr is used to parse the mail address, because in the raw mail an address is often written like this: user1 <xxxxxxxx@163.com>. email.utils.parseaddr splits it into a two-item tuple: the first item is user1 and the second is xxxxxxxx@163.com. Only the second item is printed here.
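As a small self-contained illustration of the two header helpers used above, the sketch below applies email.utils.parseaddr and email.header.decode_header to the From and Subject values of the sample mail. It assumes Python 3, and the helper name decode_words is made up for this example.

```python
import email.header
import email.utils


def decode_words(value):
    """Decode a header value that may contain =?charset?B/Q?...?= words."""
    pieces = []
    for raw, charset in email.header.decode_header(value):
        if isinstance(raw, bytes):
            # Encoded words come back as bytes plus the charset they declare
            pieces.append(raw.decode(charset or "ascii", errors="replace"))
        else:
            # Plain, unencoded text comes back as str
            pieces.append(raw)
    return "".join(pieces)


# Split "display name <address>" into its two halves
name, address = email.utils.parseaddr("user1 <xxxxxxxx@163.com>")
print(name)
print(address)

# Look at what decode_header returns for the sample Subject
raw, charset = email.header.decode_header("=?gbk?B?u+nJtA==?=")[0]
print(charset, raw)

# Full decoding to a str (not printed, to stay independent of the console encoding)
subject = decode_words("=?gbk?B?u+nJtA==?=")
```

decode_header only splits and transfer-decodes the header; turning the bytes into text with the declared charset is left to the caller, which is what decode_words does.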
The code above only parses the mail header. Next we parse the mail body. The body may contain a plain text part and an HTML part, and possibly attachments. Some MIME knowledge is needed here; it is easy to find on the Internet, so I won't go into it. Let's take a look at how to parse the body:

# -*- coding: gb2312 -*-
import email
import email.header

with open("xxx.eml", "rb") as fp:
    msg = email.message_from_binary_file(fp)

# Loop over every MIME part in the mail
for par in msg.walk():
    # A multipart part is only a container, so its own payload holds no useful data
    if not par.is_multipart():
        name = par.get_param("name")  # if this is an attachment, its file name is retrieved here
        if name:
            # There is an attachment.
            # The following lines are only needed to decode file names such as =?gbk?Q?=CF=E0=C6=AC.rar?=
            raw, charset = email.header.decode_header(name)[0]
            if isinstance(raw, bytes):
                fname = raw.decode(charset or "ascii", errors="replace")
            else:
                fname = raw
            print("attachment name:", fname)
            data = par.get_payload(decode=True)  # decode the attachment data, then store it in a file
            try:
                # Be sure to open the file with "wb", because attachments are generally binary
                f = open(fname, "wb")
            except OSError:
                print("the attachment name contains invalid characters, saving it as aaa")
                f = open("aaa", "wb")
            f.write(data)
            f.close()
        else:
            # Not an attachment: this is text content, so decode it and print it directly
            text = par.get_payload(decode=True)
            print(text.decode(par.get_content_charset() or "ascii", errors="replace"))
        print("+" * 60)  # separates the output of each part

In short, it takes very little code to implement the complex task of parsing an email!
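To try the walk loop without a saved .eml file, here is a minimal sketch that parses a small multipart/alternative mail held in a string. The message text, the example.com addresses, and the function name text_parts are assumptions made for this illustration; Python 3 is assumed.

```python
import email

# A tiny two-part mail: one base64 part and one quoted-printable part
RAW = (
    "From: user1 <user1@example.com>\n"
    "To: user2 <user2@example.com>\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
    "--BOUNDARY\n"
    "Content-Type: text/html; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "<b>hello=21</b>\n"
    "--BOUNDARY--\n"
)


def text_parts(raw):
    """Return (content type, decoded text) for every non-container part."""
    msg = email.message_from_string(raw)
    found = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no data of their own
        payload = part.get_payload(decode=True)  # undo base64 / quoted-printable
        charset = part.get_content_charset() or "ascii"
        found.append((part.get_content_type(), payload.decode(charset, errors="replace")))
    return found


parts = text_parts(RAW)
for content_type, text in parts:
    print(content_type, repr(text))
```

get_payload(decode=True) only reverses the transfer encoding and returns bytes; the charset from the part's Content-Type is still needed to obtain text.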

Encode emails
Generating emails with the email module is also easy, but it requires some basic MIME knowledge. Let's go over the MIME basics first.
A MIME message consists of a message header and a message body, that is, the mail header and the mail body. The header and body are separated by a blank line. You can use a text editor (such as Notepad) to view the source of an email. Outlook and Foxmail can also show the source of a message.
The mail header contains important information such as the sender, recipient, subject, time, MIME version, and content type. Each piece of information is called a field, and consists of the field name, a ":", and the field content. A field can be short or long, and may occupy one line or several. The first line of a field must start at the very beginning of the line, with no whitespace (spaces or tabs) to its left. A continuation line must start with whitespace, and that first whitespace character is not part of the information itself.
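The header rules above (blank line between header and body, continuation lines starting with whitespace) can be checked with a short sketch. The message text and the example.com address are made up for this illustration, and the code assumes Python 3, where passing email.policy.default makes the parser unfold continuation lines when a header is read.

```python
import email
import email.policy

RAW = (
    "From: user1 <user1@example.com>\n"
    "Subject: a long subject\n"
    " that continues on a second line\n"  # leading space marks a continuation line
    "\n"                                  # blank line: the header ends here
    "This is the body.\n"
)

msg = email.message_from_string(RAW, policy=email.policy.default)

subject = str(msg["Subject"])  # the two physical lines are joined into one value
body = msg.get_payload()       # everything after the blank line

print(msg.keys())
print(subject)
print(repr(body))
```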
The body contains the content of the email. Its type is indicated by the Content-Type field in the mail header. The most common types are text/plain (plain text) and text/html (hypertext). When the type is multipart, the body is divided into multiple parts, each of which again consists of a header and a body, separated by a blank line. There are three common multipart types: multipart/mixed, multipart/related, and multipart/alternative. Their names hint at what each one is for.
To add attachments to an email, you must define a multipart/mixed part. If there are embedded resources, you need at least a multipart/related part. If plain text and hypertext coexist, you need at least a multipart/alternative part. To generate an email, you have to generate these MIME parts, and the email module wraps these operations. Let's take a look at how to do it:
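As an illustration of how these multipart types nest, the sketch below puts a multipart/alternative part (plain text plus HTML) and one attachment inside a multipart/mixed container. It assumes Python 3; the function name build_structure, the utf-8 charset, and the sample bytes and file name are choices made for this example only.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_structure(plain, html, attachment_bytes, filename):
    # Plain text and HTML versions of the same content go into multipart/alternative
    body = MIMEMultipart("alternative")
    body.attach(MIMEText(plain, "plain", "utf-8"))
    body.attach(MIMEText(html, "html", "utf-8"))

    # Attachments need an outer multipart/mixed container
    outer = MIMEMultipart("mixed")
    outer.attach(body)

    # MIMEApplication base64-encodes the data by default
    part = MIMEApplication(attachment_bytes, "octet-stream")
    part.add_header("Content-Disposition", "attachment", filename=filename)
    outer.attach(part)
    return outer


mail = build_structure("hello", "<b>hello</b>", b"\x00\x01\x02", "demo.bin")
content_types = [part.get_content_type() for part in mail.walk()]
print(content_types)
```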

# -*- coding: gb2312 -*-
import os
import sys
import email.header
import email.message
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


class MailCreator:
    def __init__(self):
        # Create the message object of the email
        self.msg = email.message.Message()
        self.mail = ""

    def create(self, mailheader, maildata, mailattachlist=()):
        # mailheader is a dict. maildata is a list: its first item is the
        # plain text and its second item is the html.
        # mailattachlist is a list of attachment file names.
        if not mailheader or not maildata:
            return
        for k in mailheader.keys():
            # Subject needs special handling, because Chinese text must be converted.
            # For example, "One of my test emails" (in Chinese) must be converted
            # to =?gb2312?B?ztK1xNK7uPay4srU08q8/g==?=
            if k.lower() == 'subject':
                self.msg[k] = email.header.Header(mailheader[k], 'gb2312')
            else:
                self.msg[k] = mailheader[k]
        # Create the plain text part
        body_plain = MIMEText(maildata[0], _subtype='plain', _charset='gb2312')
        body_html = None
        # Create the html part. This one is optional
        if len(maildata) > 1 and maildata[1]:
            body_html = MIMEText(maildata[1], _subtype='html', _charset='gb2312')
        # Create a multipart and attach the text and html parts to it.
        # The MIME background explains why this container is needed
        attach = MIMEMultipart()
        attach.attach(body_plain)
        if body_html:
            attach.attach(body_html)
        # Process each attachment
        for fname in mailattachlist:
            basename = os.path.basename(fname)
            # The type is always set to application here. It could just as
            # well be image or audio, whatever the attachment is
            attachment = MIMEBase('application', 'octet-stream', name=basename)
            with open(fname, 'rb') as f:
                attachment.set_payload(f.read())
            # Binary attachments must use the base64 transfer encoding
            encoders.encode_base64(attachment)
            attachment.add_header('Content-Disposition', 'attachment', filename=basename)
            attach.attach(attachment)
        # Generate the final mail: the header fields followed by the multipart
        self.mail = self.msg.as_string()[:-1] + attach.as_string()
        return self.mail


if __name__ == '__main__':
    mc = MailCreator()
    header = {'from': 'zhaowei@163.com', 'to': 'weizhao@163.com', 'subject': 'One of my test mails'}
    data = ['plain text information', '<font color="red">html text information</font>']
    if sys.platform == 'win32':
        attach = ['C:/windows/clock.avi']
    else:
        attach = ['/bin/cp']
    mail = mc.create(header, data, attach)
    with open("test.eml", "w") as f:
        f.write(mail)

Here I wrapped the logic in a class. The general process is:
1. Create a message object: email.message.Message()
2. Create a MIMEMultipart object: email.mime.multipart.MIMEMultipart()
3. Create the MIMEText objects and attach them to the MIMEMultipart. The MIME classes are not limited to text; there are also classes for image, application, and audio content.
4. Generate the final email.
