Learning Python: handling Python encoding problems

Source: Internet
Author: User

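In summary, Python has been able to process Unicode characters since version 1.6.
I. Several common encoding formats.
1.1. ASCII: each character takes 1 byte.
1.2. UTF-8: each character takes 1 to 4 bytes (1 to 3 bytes for characters in the Basic Multilingual Plane, which covers common Chinese and Japanese characters). ASCII characters take only 1 byte, so ASCII is a subset of UTF-8.
1.3. UTF-16: each character takes 2 bytes (4 bytes for characters outside the Basic Multilingual Plane). In Python 2 narrow builds (the default on Windows), a unicode object is stored as UTF-16, so in this article "unicode" means UTF-16.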
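The byte counts above can be checked directly by encoding a few characters. The following is a minimal sketch; the sample characters and the helper name encoded_lengths are chosen for illustration only, and the code is written so that it should run on Python 2.7 as well as Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample characters: an ASCII letter, an accented Latin letter,
# a CJK ideograph, and a character outside the Basic Multilingual Plane.
samples = [
    ("U+0041", u"A"),
    ("U+00E9", u"\u00e9"),
    ("U+4E2D", u"\u4e2d"),
    ("U+1F600", u"\U0001F600"),
]


def encoded_lengths(text):
    """Return (utf8_bytes, utf16_bytes) for a Unicode string."""
    # "utf-16-le" is used so that no BOM is counted in the length.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for label, ch in samples:
    utf8_len, utf16_len = encoded_lengths(ch)
    print("%s: %d byte(s) in UTF-8, %d byte(s) in UTF-16" % (label, utf8_len, utf16_len))

# ASCII text has identical bytes in ASCII and in UTF-8.
same_bytes = u"A".encode("ascii") == u"A".encode("utf-8")
```

The ASCII letter takes 1 byte in UTF-8, the CJK ideograph takes 3, and the character outside the Basic Multilingual Plane takes 4 bytes in both UTF-8 and UTF-16.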
II. Encoding and decoding of Python source files.
A Python program we write goes through the following stages from creation to execution:
Editor ----> source code ----> interpreter ----> output result
2.1. The editor determines the encoding of the source code (set in the editor).
2.2. The interpreter must also know the encoding of the source code (unfortunately, it is difficult to infer the encoding of a source file from its bytes alone).
2.3. Supplement: on Windows, when UltraEdit saves source code as UTF-8, it writes a BOM (byte order mark) at the start of the file (not covered here), so the ActivePython interpreter automatically recognizes the file as UTF-8. If you edit the source file in Eclipse, however, the file is saved as UTF-8 without a BOM, so you must add # coding=UTF-8 (with no space before the "=") on the first or second line of the source file. Interestingly, a comment is used to tell the interpreter the encoding of the source file.
2.4. Example: suppose we want to output "I am Chinese" to the terminal. The code is as follows (I use Eclipse + PyDev):

    # coding=UTF-8
    # The line above tells the Python interpreter that this file is UTF-8 encoded.
    print "I am Chinese"  # the source file itself must also be saved as UTF-8
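The two ways of announcing a source encoding described above (a UTF-8 BOM, or a coding comment on the first or second line) can be sketched as a small detection routine. This is a simplified illustration, not the interpreter's actual implementation: the function name guess_source_encoding is hypothetical, the pattern is adapted from PEP 263, and the "ascii" fallback reflects the Python 2 default. It should run on Python 2.7 as well as Python 3.

```python
import codecs
import re

# Pattern adapted from PEP 263: "coding" must be followed directly by ":" or "=".
CODING_RE = re.compile(br"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def guess_source_encoding(data, default="ascii"):
    """Guess the encoding of Python source bytes from a BOM or a coding comment."""
    # A UTF-8 BOM at the very start of the file marks the file as UTF-8.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # Otherwise only the first two lines may carry a coding comment.
    for line in data.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii").lower()
    return default


with_bom = guess_source_encoding(codecs.BOM_UTF8 + b"print 1\n")
with_comment = guess_source_encoding(b"# coding=UTF-8\nprint 1\n")
emacs_style = guess_source_encoding(b"#!/usr/bin/python\n# -*- coding: gb2312 -*-\n")
undeclared = guess_source_encoding(b"print 1\n")
```

Under this pattern a comment written as "# coding = UTF-8", with a space before the "=", is not recognized, so the declaration is best written without that space.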

III. Converting between encodings.
A conversion between two encodings must go through Unicode (a Python unicode object, UTF-16 in this article's terms) as the intermediate form.
For example, suppose there is a Japanese file, jap.txt, whose content is "は 中 す." and whose encoding is the Japanese encoding SHIFT_JIS.
There is also a file chn.txt whose content is "People's Republic of China", in the Chinese encoding GB2312.
How can we merge the contents of the two files and store them in utf.txt without producing garbled characters? We can convert the contents of both files to UTF-8, because UTF-8 can represent both Chinese and Japanese text. The code is as follows:

    # coding=UTF-8
    try:
        jap = open("e:/jap.txt", "r")
        chn = open("e:/chn.txt", "r")
        utf = open("e:/utf.txt", "w")
        jap_text = jap.readline()
        chn_text = chn.readline()
        # Decode to unicode first, then encode as UTF-8.
        jap_text_utf8 = jap_text.decode("SHIFT_JIS").encode("UTF-8")
        # Codec names are not case-sensitive; "gb2312" and "utf-8" work as well.
        chn_text_utf8 = chn_text.decode("GB2312").encode("UTF-8")
        utf.write(jap_text_utf8)
        utf.write(chn_text_utf8)
        # Close the files so that the output is flushed to disk.
        jap.close()
        chn.close()
        utf.close()
    except IOError as e:
        print "open file error", e
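The same decode-then-encode idea can also be expressed with io.open, which performs the decoding and encoding while reading and writing. The following is a minimal sketch under stated assumptions: the function name merge_to_utf8, the temporary directory, and the sample strings are hypothetical stand-ins for the e:/ files above, and the code should run on Python 2.7 as well as Python 3.

```python
import io
import os
import tempfile


def merge_to_utf8(sources, dest_path):
    """Decode each (path, encoding) source and write all text to dest_path as UTF-8."""
    with io.open(dest_path, "w", encoding="utf-8") as out:
        for path, encoding in sources:
            # io.open decodes the bytes to Unicode text while reading ...
            with io.open(path, "r", encoding=encoding) as src:
                # ... and encodes the text as UTF-8 while writing.
                out.write(src.read())


# Build two small sample files in a temporary directory (hypothetical content).
tmp_dir = tempfile.mkdtemp()
jap_path = os.path.join(tmp_dir, "jap.txt")
chn_path = os.path.join(tmp_dir, "chn.txt")
utf_path = os.path.join(tmp_dir, "utf.txt")

with io.open(jap_path, "w", encoding="shift_jis") as f:
    f.write(u"\u65e5\u672c\u8a9e\n")
with io.open(chn_path, "w", encoding="gb2312") as f:
    f.write(u"\u4e2d\u6587\n")

merge_to_utf8([(jap_path, "shift_jis"), (chn_path, "gb2312")], utf_path)

with io.open(utf_path, "r", encoding="utf-8") as f:
    merged = f.read()
```

Using with blocks closes every file even if decoding fails partway through, which the explicit open/close version has to handle by hand.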

IV. The Tk library supports ASCII, UTF-16 and UTF-8.
The code is as follows:

    # coding=UTF-8
    from Tkinter import *

    try:
        jap = open("e:/jap.txt", "r")
        str1 = jap.readline()
        jap.close()
    except IOError as e:
        print "open file error", e
    else:
        root = Tk()
        # Without the decode, garbled characters are displayed.
        label1 = Label(root, text=str1.decode("SHIFT_JIS"))
        label1.grid()
        root.mainloop()
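The garbled output mentioned in the comment can be reproduced without a GUI by decoding the same bytes with the right and the wrong codec. This is a sketch only: the sample string is hypothetical, and latin-1 is used merely as a stand-in for "some other encoding", since it accepts any byte sequence. The code should run on Python 2.7 as well as Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text, encoded the way the Japanese file is assumed to be.
sample = u"\u65e5\u672c\u8a9e"
raw = sample.encode("shift_jis")

# Decoding with the matching codec recovers the original text.
right = raw.decode("shift_jis")

# Decoding with a different codec succeeds here but yields unrelated characters.
wrong = raw.decode("latin-1")

print("bytes on disk: %d" % len(raw))
print("correct decode matches original: %s" % (right == sample))
print("wrong decode matches original: %s" % (wrong == sample))
```

Each of the three characters occupies two bytes in SHIFT_JIS, so the wrong decode produces six unrelated characters instead of three.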

That covers the basics of handling encodings in Python. I hope it is helpful to you.
