Introduction to Base64 encoding principles and use of the Python base64 module

Source: Internet
Author: User
Tags mail account

Base64 encoding. First, why is there a 64 in the name? Because the encoding uses 64 plain-text characters to represent arbitrary binary data: only A-Z, a-z, 0-9, + and /. People with "a little understanding" will say there is also an "=" in it. True, but the equals sign is not an encoding character; it is a padding character.

Why was such an encoding invented at all? The encoding principle is very simple, and "cracking" it is just as easy. When email first appeared, only English characters were transmitted, which caused no problems. Later, Chinese and Japanese users started sending email too, and that was a problem, because their characters could be processed as commands by a mail server or gateway. So the message had to be encoded in a way that kept the older servers out of trouble. (Newer servers can already handle these messy cases, but because a set of specifications had formed, such mail still has to be Base64 encoded before it is transmitted.) The encoding also had to be simple and reversible (so it is not really encryption at all :-)), so that client programs could encode and decode quickly, and its output had to be plain ASCII text. That is how Base64 was born. At the beginning, the designers mainly considered two issues:

1. Complexity and efficiency of the encoding algorithm
2. How to handle transmission

Base64 basically meets these requirements. An encoding that drove the mail program to 100% CPU or used up its memory would be completely unnecessary; it is enough that, after encoding, ordinary people cannot read the content at a glance.

Next, the Base64 encoding principle. According to RFC 2045, the Base64 Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. (While looking up Base64 information online, I also found a good online Base64 encoding converter.) Base64 encoding converts three 8-bit bytes (3*8 = 24) into four 6-bit units (4*6 = 24), then adds two zero bits in front of each 6-bit unit to form an ordinary 8-bit byte. Six bits can represent 2^6 = 64 values, so each unit is a number from 0 to 63, and each number maps to a character in the following table:

The Base64 alphabet

Value Encoding   Value Encoding   Value Encoding   Value Encoding
    0 A             17 R             34 i             51 z
    1 B             18 S             35 j             52 0
    2 C             19 T             36 k             53 1
    3 D             20 U             37 l             54 2
    4 E             21 V             38 m             55 3
    5 F             22 W             39 n             56 4
    6 G             23 X             40 o             57 5
    7 H             24 Y             41 p             58 6
    8 I             25 Z             42 q             59 7
    9 J             26 a             43 r             60 8
   10 K             27 b             44 s             61 9
   11 L             28 c             45 t             62 +
   12 M             29 d             46 u             63 /
   13 N             30 e             47 v
   14 O             31 f             48 w          (pad) =
   15 P             32 g             49 x
   16 Q             33 h             50 y
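
The table does not have to be memorised, because it follows a simple order: uppercase letters, lowercase letters, digits, then "+" and "/". A minimal sketch that rebuilds it in Python; the names ALPHABET, PAD, value_to_char and char_to_value are made up for illustration and are not part of any library:

```python
import string

# Standard Base64 alphabet in table order: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
PAD = "="  # padding character, not one of the 64 encoding characters


def value_to_char(value):
    """Map a 6-bit value (0..63) to its Base64 character."""
    return ALPHABET[value]


def char_to_value(char):
    """Map a Base64 character back to its 6-bit value."""
    return ALPHABET.index(char)


# Print the first and last entry of each character range.
for value in (0, 25, 26, 51, 52, 61, 62, 63):
    print(value, value_to_char(value))
```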

Encoding principle: convert three bytes into four (3x8 = 24 = 4x6). First read three bytes into a buffer, shifting the buffer 8 bits to the left before adding each byte. Then shift right four times, taking 6 bits each time. This yields four values, each stored in one byte.
Decoding principle: convert four bytes back into three. First read the four 6-bit values, shifting the buffer 6 bits to the left each time and combining each value in with OR. Then shift right three times, taking 8 bits each time. This restores the original bytes.
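
The two principles can be sketched with plain shifts and masks. This is only an illustration for one complete 3-byte group (no padding); encode_group and decode_group are hypothetical helper names, and the result is compared against the standard base64 module as a sanity check:

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_group(three_bytes):
    """Encode exactly three bytes into four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    buf = 0
    for byte in three_bytes:
        buf = (buf << 8) | byte  # shift left 8 bits, then add the next byte
    # Take 6 bits at a time, from the most significant group down.
    return "".join(ALPHABET[(buf >> shift) & 0x3F] for shift in (18, 12, 6, 0))


def decode_group(four_chars):
    """Decode exactly four Base64 characters (no padding) into three bytes."""
    if len(four_chars) != 4:
        raise ValueError("expected exactly 4 characters")
    buf = 0
    for char in four_chars:
        buf = (buf << 6) | ALPHABET.index(char)  # shift left 6 bits, OR in the value
    # Take 8 bits at a time to recover the three original bytes.
    return bytes((buf >> shift) & 0xFF for shift in (16, 8, 0))


print(encode_group(b"Man"))   # TWFu
print(decode_group("TWFu"))   # b'Man'
print(encode_group(b"Man") == base64.b64encode(b"Man").decode("ascii"))
```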

Base64 converts three bytes into four, so the encoded size (in bytes, the same below) is about 1/3 larger than the original. If the original size is an exact multiple of 3, the increase is exactly 1/3. But what if it is not? This is where "=" comes in. When the size is not a multiple of 3, the remainder of size/3 is either 1 or 2. During conversion, a final group with fewer than six bits is filled with 0 bits on the right, and two zero bits are then added in front of the six bits as usual. The output positions that are left empty are filled with "=". In short, the number of encoded bytes is always a multiple of 4.
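
The padding rule is easy to see with inputs of 1, 2 and 3 bytes. A small sketch using the standard base64 module; encoded_length is a hypothetical helper that only restates the "multiple of 4" rule as arithmetic:

```python
import base64


def encoded_length(n):
    """Padded Base64 length for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)


for data in (b"M", b"Ma", b"Man"):
    encoded = base64.b64encode(data).decode("ascii")
    print(len(data), encoded, len(encoded))

# Output:
# 1 TQ== 4
# 2 TWE= 4
# 3 TWFu 4
```

One leftover byte produces two "=" characters, two leftover bytes produce one, and a complete group produces none.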

With the principle and the table above, you can try encoding something by hand ......

Now for Python's support for Base64. There is a dedicated base64 module for this. Let's just paste the example code directly.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Utility @ Python
# Function: Base64 codec module
# Created by magictong on 2008-07-16

import base64
import os

# base64 functions: encode, decode, encodebytes, decodebytes (named
# encodestring and decodestring in Python 2), b64encode, b64decode,
# urlsafe_b64encode, urlsafe_b64decode


class KBase64:
    """Base64 codec module: a simple wrapper around base64 for files and strings."""

    def encodefile(self, strfilename, strdecname):
        """Encode the content of a file as Base64."""
        if not os.path.exists(strfilename):
            return False
        try:
            # base64.encode() reads and writes binary file objects
            with open(strfilename, "rb") as f1, open(strdecname, "wb") as f2:
                base64.encode(f1, f2)
        except Exception as e:
            print(e)
            return False
        return True

    def decodefile(self, strfilename, strdecname):
        """Decode the content of a Base64 file."""
        if not os.path.exists(strfilename):
            return False
        try:
            with open(strfilename, "rb") as f1, open(strdecname, "wb") as f2:
                base64.decode(f1, f2)
        except Exception as e:
            print(e)
            return False
        return True

    def encodesting(self, strsrc):
        """Base64-encode a byte string; returns (result, success)."""
        try:
            strdec = base64.encodebytes(strsrc)
        except Exception as e:
            print(e)
            return "", False
        return strdec, True

    def decodesting(self, strsrc):
        """Decode a Base64 byte string back to the source; returns (result, success)."""
        try:
            strdec = base64.decodebytes(strsrc)
        except Exception as e:
            print(e)
            return "", False
        return strdec, True


if __name__ == "__main__":
    baseobj = KBase64()
    print(baseobj.encodesting(b""))  # pass any byte string here
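
The code comment lists the URL-safe functions but the example never calls them. A short sketch of how they differ from the standard ones, assuming Python 3; the three input bytes are chosen only so that the output contains "+" and "/":

```python
import base64

# Three bytes chosen so that every 6-bit group is 62 or 63,
# which makes the difference between the two alphabets visible.
raw = bytes([0xFB, 0xFF, 0xBE])

standard = base64.b64encode(raw)          # uses '+' and '/'
url_safe = base64.urlsafe_b64encode(raw)  # uses '-' and '_' instead

print(standard)   # b'+/++'
print(url_safe)   # b'-_--'

# Each decoder returns the original bytes.
print(base64.b64decode(standard) == raw)
print(base64.urlsafe_b64decode(url_safe) == raw)

# encodebytes() is the MIME-style variant: it appends a newline and
# wraps long output into lines of at most 76 characters.
print(base64.encodebytes(b"hello"))   # b'aGVsbG8=\n'
```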

Python also has an email module dedicated to email encoding and decoding, which is more convenient. Let's take a look at the source file of an email:

Received: from dbmail.kingsoft.com (192.168.8.252) by mail.kingsoft.cn
    (192.168.13.1) with Microsoft SMTP Server id 8.1.240.5; Mon, 3 Mar 2008
    10:52:20 +0800
Received: from unknown (HELO linyehui) (linyehui@219.131.196.66) by localhost
    with SMTP; 3 Mar 2008 03:08:58 -0000
From: =?gb2312?B?Wdbstrvu?= <Linyehui@kingsoft.com>
To: =?gb2312?B?Vg9uz0xlasbbza/a2l0=?= <TongLei@kingsoft.com>,
    =?gb2312?B?Thvib25newfuzybbwr266dh0xq=?= <LuHongyang@kingsoft.com>,
    =?gb2312?B?Thvzaxdhbmcgw8ks0ubn+l0=?= <LuYiwang@kingsoft.com>,
    =?gb2312?B?Tgluwwvodwkgw8hw0ra71f0=?= <LinYehui@kingsoft.com>,
    =?gb2312?B?Tglkawfuifva7r2jxq=?= <LiJian@kingsoft.com>,
    =?gb2312?B?Tglezwhvbmcgw8dutck66l0=?= <LiDehong@kingsoft.com>,
    =?gb2312?B?Smlhbmdxyw5nc2hlbmcgw72qzfrj+l0=?= <JiangWangsheng@kingsoft.com>,
    =?gb2312?B?R2vuz1poyw9ozsbbualv17ryxq=?= <GengZhaohe@kingsoft.com>,
    =?gb2312?B?Rgvuz1blbmcgw7xlxfrd?= <DengPeng@kingsoft.com>,
    =?gb2312?B?Q2hlbmdidwkgw7pmu9rd?= <Chenghui@kingsoft.com>,
    =?gb2312?B?Q2hlblpoaxfpyw5nifuzwta+x79d?= <ChenZhiqiang@kingsoft.com>
Date: Mon, 3 Mar 2008 10:50:54 +0800
Subject: =?gb2312?B?W7vh0um8x8k8xtiwmdgwmzaz1ty1/lt6u+ff0n7v/bdmsb4=?=
Thread-topic: =?gb2312?B?W7vh0um8x8k8xtiwmdgwmzaz1ty1/lt6u+ff0n7v/bdmsb4=?=
Thread-index: ach82zpibkzrq2yfql2va1ymdd4t2g=
Message-ID: <200803031050475158814@kingsoft.com>
Accept-language: ZH-CN, en-US
Content-language: ZH-CN
X-MS-exchange-organization-authas: Anonymous
X-MS-exchange-organization-authsource: mail.kingsoft.cn
X-MS-has-Attach:
X-MS-exchange-organization-senderidresult: permerror
X-MS-exchange-organization-SCL: 1
X-MS-exchange-organization-PCL: 2
X-MS-exchange-organization-PRD: kingsoft.com
X-MS-TNEF-correlator:
X-scanvirus: By eqavse Antivirus Engine
X-scanresult: clean
X-mailfrom: <linyehui@kingsoft.com>
X-rcp.pdf: <tonglei@kingsoft.com>
X-fromip: 219.131.196.66
X-eqmanager-scaned: 1
X-eqauthuser: linyehui
X-forwarded ed: Unknown, 219.131.196.66, 20080303110858
Received-SPF: permerror (mail.kingsoft.cn: domain of linyehui@kingsoft.com
    used an invalid SPF mechanism)
Content-Type: text/plain; charset="gb2312"
Content-transfer-encoding: base64
MIME-Version: 1.0

YC/w3lmk1/logs
Ls0tls0tls0tls0tlq0kwdbstrvudqqjsagi1ti5ubxezsi2qlmk1/cncqoyoald3lgjobdr3txv
W8whsbcyyku1urztx7 + 5 pstcsoaxvs/gudgncqozoalhqnlgtprc6w0ko7shote8sbi907cy17dg
97w9slliq9bq0msjqmq1z9zqcm94eaopdqqzwta + x78ncqoxoak/tlt6wusncqoyoali7bz + 1f3u
2ttl0ndeo7/pdqqjs6gisllxsntl0nc1ymsjv + m1xlliytqncqo0oak199pdza/a2rxexko/6cqx
O6zo9rm5yrgz9rttdqqjtagivma7rqo6yrxp1qghtfftw21zabxeyo28/tc21ngncsduvamncqox
Oalktc/wzfjvvsjp1qtr3cq + o8kjykppdqqjsqgita +/8s7kzokjqllpyv2 + 3cyrvspby7vhta + J
Bytes
Tudk/ghost
16 qzybf + zvencqoyoaln6sn1_k3s6crvykffweuncqozoalx1mb0tq + zzndyy9hl9w0ko7shorzg
U66jurdrtku13bxedmvjdg9ysvc31rp3167vt0ld817c5/bxeyv2 + 3dtztku13agjdqra7r78dqqj
Bytes
Bytes
Xko/6b3tyosncqoyoali7bz + udza7bj3u/mxvrmmxnzn6rpjdqqjs6gisulk1l27upizwta + x78n
Cqo0oak/qrv6xvs2r727upjnr8dadqqjtagisb7w3lmk1/E7 + bg + zeqzyaosu7nt0ly4upzcvucn
Cg0kdqoncrg + collate
Bytes
+ 9kqus3n + npous/x97xeu7c7udkqsnhw2lm5soaxvrxe19s2r9eisug3/s7xtci808npdqqjsqgi
X7bn + nkztcrfweuncqozoa1+ nkzo6iworbfo6kncqo0oalx9tdfz6loxlz + tcs5pl7fdqoncg0k
Mjawoc0wmy0wmw0kwdbstrvudqo =

The decoding code is as follows:

# coding=utf-8
import email
import email.header
import email.utils


def decode_header_text(value):
    # Only needed to decode values such as =?gbk?Q?=CF=E0=C6=AC?=;
    # returns the first decoded part as text.
    text, charset = email.header.decode_header(value)[0]
    if isinstance(text, bytes):
        text = text.decode(charset or "ascii", errors="replace")
    return text


with open("magictong.eml", "rb") as fp:
    # Create a message object directly from the file; initial parsing happens here.
    msg = email.message_from_binary_file(fp)

subject = decode_header_text(msg.get("subject"))  # the subject in the message header
print("Subject:", subject)
print("From:", email.utils.parseaddr(msg.get("from"))[1])  # get from
print("To:", email.utils.parseaddr(msg.get("to"))[1])  # get to
print("Date:", email.utils.parsedate(msg.get("date")))

# Loop over each MIME data block in the message
for par in msg.walk():
    if not par.is_multipart():  # a multipart container holds no useful data itself; read up on MIME for why
        name = par.get_param("name")  # if this is an attachment, its file name is retrieved here
        if name:
            # There is an attachment; decode file names such as =?gbk?Q?=CF=E0=C6=AC.rar?=
            fname = decode_header_text(name)
            print("Attachment name:", fname)
            data = par.get_payload(decode=True)  # decode the attachment data, then store it in a file
            try:
                f = open(fname, "wb")  # be sure to open with "wb": attachments are generally binary files
            except OSError:
                print("The attachment name contains invalid characters; using another name")
                f = open("aaa", "wb")
            f.write(data)
            f.close()
        else:
            # Not an attachment but text content: decode it and print it directly
            charset = par.get_content_charset() or "utf-8"
            print(par.get_payload(decode=True).decode(charset, errors="replace"))
        print("+" * 60)  # separates the output of each part
(I don't know who originally wrote this code. I copied it from the Internet a long time ago, and it works.)
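
The From, To and Subject headers of the sample message use the =?charset?B?...?= form, which is simply Base64 wrapped in a charset label. A minimal sketch of building one by hand and decoding it with the standard email.header module; make_encoded_word is a hypothetical helper, the sample text is made up for illustration, and real mailers also split long values into several encoded words:

```python
import base64
from email.header import decode_header


def make_encoded_word(text, charset="gb2312"):
    """Build a '=?charset?B?...?=' encoded word by hand ('B' means Base64)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?B?%s?=" % (charset, payload)


word = make_encoded_word("会议记录")  # sample text, not taken from the message
print(word)

# decode_header() returns a list of (bytes, charset) pairs for encoded words.
raw, charset = decode_header(word)[0]
print(charset)
print(raw.decode(charset))
```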

Assume that the email above is saved as magictong.eml. The decoded content is as follows:

Subject: [meeting records] 20080303-week iteration _ revised version
From: linyehui@kingsoft.com
To: TongLei@kingsoft.com
Date: (2008, 3, 3, 10, 50, 54, 0, 1, -1)
Last week's work summary:

------------------------------------------------------

Lin yehui

1. reconstruction and stability

2. Security Protection Enhanced Function version related to the security Island

3. Migration Code

4. Prepare to connect the installer to the security center (implement proxy)

Chen Zhiqiang

1. Check the code

2. Software running Module

3. Install and run modules

4. An error occurred while calling the module of Tong lei.

5. Plan: uninstall the software that calls MSI

Li Jian

1. Implement website authentication demonstration BHO

2. Pop-up box problem (it may take too long to query data)

3. hao123 and roaming data (10 thousand duplicates are removed)

4. More than 40 thousand of Macf data is not found, and no preparations are found. The automatic reporting function is added.

Tong lei

1. Convert EXE to service

2. Improved the Protocol to receive exe

3. Self-Start Program search

4. Plan: Split the transmitted vector into unpackaged data for re-transmission.

Li Jun

1. Baidu Security Center released

2. Travel from Guangzhou

3. Plan: connect to infoc (query and report)

4. Plan: the security center is compatible with Vista

Cheng hui

1. tuotu download Module Access

2. Completion of basic software management functions

3. Submit the test to Chen Zhiqiang.

4. Start the instance and submit it to Tong lei.

5. This week's work is basically completed, and there are several bugs

 

 

 

Plan for this week:

------------------------------------------------------

1. The agent between the lyh Security Center and the installer. If you want to cooperate with online games, add the restructured version of the automatic registration service, etc.

2. Embedded Web page exe

3. webpage (Addu)

4. tools for creating information files

 

 

2008-03-03

Lin yehui

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

O(∩_∩)O... haha, it's very convenient.

 
