Learn how Python handles character encodings

Last Update:2013-12-17 Source: Internet

Author: User

Summary: Python has been able to process Unicode characters since version 1.6.

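I. Several common encoding formats.

1.1 ASCII: each character is expressed in 1 byte.

1.2 UTF-8: each character is expressed in 1 to 4 bytes (1 to 3 bytes for characters in the Basic Multilingual Plane). ASCII characters occupy only 1 byte, so ASCII is a subset of UTF-8.

1.3 UTF-16: each character is expressed in 2 bytes (4 bytes for characters outside the Basic Multilingual Plane). In this article, "UTF-16" refers to Python's unicode type, which Python 2 stores internally as 2-byte or 4-byte units depending on how the interpreter was built.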
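As a rough illustration of the sizes listed above, the sketch below encodes a few sample characters and counts the resulting bytes. It is written for Python 3, where str is Unicode text and encode() returns bytes; the sample characters and the helper name byte_lengths are illustrative choices, not part of the original article.

```python
def byte_lengths(ch):
    """Return the encoded size in bytes of one character in UTF-8 and UTF-16."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # "utf-16-le" is used so the count excludes the 2-byte BOM
        # that the plain "utf-16" codec prepends.
        "utf-16": len(ch.encode("utf-16-le")),
    }

# One ASCII letter, one accented Latin letter, one CJK character,
# and one character outside the Basic Multilingual Plane.
for ch in ["A", "é", "中", "😀"]:
    print("U+%04X" % ord(ch), byte_lengths(ch))

# ASCII text produces identical bytes under both codecs.
print("ABC".encode("ascii") == "ABC".encode("utf-8"))
```

The last line shows why ASCII can be treated as a subset of UTF-8: text limited to ASCII characters is byte-for-byte the same in both encodings.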

II. Encoding and decoding of Python source files. A Python program goes through the following stages between being written and being executed:

Editor ----> source code ----> interpreter ----> output result

2.1 The editor determines the encoding of the source file (this is configured in the editor).

2.2 The interpreter also needs to know the encoding of the source file. Unfortunately, it is difficult to determine the encoding from the encoded bytes alone.

2.3 Supplement: on Windows, when UltraEdit saves source code as UTF-8, it records a BOM (byte order mark) in the file (its details do not need to be studied here), so the ActivePython interpreter automatically recognizes the file as UTF-8. Eclipse, however, does not write a BOM even when the file is set to UTF-8 in the editor, so you must add # coding=utf-8 on the first or second line of the source file. Interestingly, a comment is what tells the interpreter the encoding of the source file.
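To see both mechanisms from 2.3 at work, the following sketch asks the standard library which encoding it would assume for a piece of source code. It is written for Python 3 and uses tokenize.detect_encoding, which applies the interpreter's own rules (BOM first, then a coding declaration on the first two lines). The helper name source_encoding and the two sample byte strings are assumptions made for illustration.

```python
import io
import tokenize

def source_encoding(source_bytes):
    """Return the encoding Python 3 would assume for this source code."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _first_lines = tokenize.detect_encoding(readline)
    return encoding

# Saved with a UTF-8 BOM and no declaration (the UltraEdit case).
with_bom = b"\xef\xbb\xbfprint('hi')\n"

# Saved without a BOM but with a declaration comment (the Eclipse case).
with_declaration = b"# coding=gb2312\nprint('hi')\n"

print(source_encoding(with_bom))
print(source_encoding(with_declaration))
```

A file that starts with a BOM is reported as utf-8-sig, while the second sample is reported under the name given in its declaration. Note that the declaration is only recognized when "coding" is followed immediately by "=" or ":".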

2.4 Example: suppose we want to print the Chinese sentence "我是中国人" ("I am Chinese") to the terminal.

  # coding=utf-8
  # The line above tells the Python interpreter that this file is UTF-8 encoded.
  # I use Eclipse + PyDev.

  print "我是中国人"  # "I am Chinese"; the source file itself must also be saved as UTF-8

III. Converting between encodings. A conversion between two encodings must go through Unicode (a Python unicode object) as the intermediate step.

For example, suppose there is a Japanese file jap.txt whose content is "は中す. " and whose encoding is the Japanese encoding SHIFT_JIS.

There is also a file chn.txt whose content is the Chinese text for "People's Republic of China" and whose encoding is the Chinese encoding GB2312.

How can we merge the content of the two files and store it in utf.txt without producing garbled characters? We can convert the content of both files to UTF-8, because UTF-8 can represent both Chinese and Japanese characters.


  # coding=utf-8

  try:
      jap = open("e:/jap.txt", "r")
      chn = open("e:/chn.txt", "r")
      utf = open("e:/utf.txt", "w")

      jap_text = jap.readline()
      chn_text = chn.readline()

      # Decode the bytes into a unicode object, then encode it as UTF-8
      jap_text_utf8 = jap_text.decode("SHIFT_JIS").encode("UTF-8")
      # Codec names are not case-sensitive, so "utf-8" works as well
      chn_text_utf8 = chn_text.decode("GB2312").encode("UTF-8")

      utf.write(jap_text_utf8)
      utf.write(chn_text_utf8)

      # Close the files so the output is flushed to disk
      jap.close()
      chn.close()
      utf.close()

  except IOError as e:
      print "open file error", e
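The decode-then-encode step can also be sketched in Python 3, where the intermediate value is a str instead of a Python 2 unicode object. The helper name transcode is hypothetical, and the two byte strings stand in for the contents of jap.txt and chn.txt so that the example runs without any files; the Japanese sample text is an assumption.

```python
def transcode(data, source_encoding, target_encoding="utf-8"):
    """Re-encode bytes by decoding to str (Unicode) and encoding again."""
    return data.decode(source_encoding).encode(target_encoding)

# In-memory stand-ins for the raw bytes read from the two files.
jap_bytes = "日本語".encode("shift_jis")
chn_bytes = "中华人民共和国".encode("gb2312")

# Both pieces are converted to UTF-8 before being joined, so the
# merged result is valid UTF-8 from start to finish.
merged = transcode(jap_bytes, "shift_jis") + transcode(chn_bytes, "gb2312")
print(merged)
```

Concatenating jap_bytes and chn_bytes directly would give a byte string that no single codec decodes correctly, which is why each piece is converted first.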

IV. The Tk library supports ASCII, Unicode (Python unicode objects), and UTF-8.


  # coding=utf-8

  from Tkinter import *

  try:
      jap = open("e:/jap.txt", "r")
      str1 = jap.readline()
      jap.close()

  except IOError as e:
      print "open file error", e
      raise SystemExit(1)  # nothing to display without the file

  root = Tk()

  # Without decode(), garbled characters are displayed
  label1 = Label(root, text=str1.decode("SHIFT_JIS"))
  label1.grid()

  root.mainloop()
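Why undecoded bytes show up as garbled characters can be sketched without a GUI. The Python 3 example below decodes the same Shift_JIS bytes once with the wrong codec and once with the right one. The sample text is an assumption, and Latin-1 is used only to illustrate a mismatched codec; it is not a claim about which fallback Tk itself applies.

```python
text = "日本語"                    # sample text, chosen for illustration
raw = text.encode("shift_jis")    # the bytes a Shift_JIS file would contain

# Wrong codec: every byte is mapped to some unrelated character.
garbled = raw.decode("latin-1")

# Right codec: the original text is recovered.
correct = raw.decode("shift_jis")

print("decoded with latin-1 matches:", garbled == text)
print("decoded with shift_jis matches:", correct == text)
```

The wrong codec does not raise an error here; it silently yields one character per byte, which is what makes this kind of bug easy to miss.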

That covers the basics of how Python handles character encodings. I hope it is helpful to you.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
