A Detailed Explanation of Chinese Encoding Problems in Python

Last Update: 2013-12-17 Source: Internet

Author: User

Before getting into Python's Chinese-text problems, a word about Python itself: I have long had a strong interest in the language. Who would have thought that this old friend would once again trip over Chinese text in unexpected ways? In everyday code, Chinese handling in Python keeps getting in our way.

Blame it on the fact that it was not the Chinese who invented computers. Otherwise every computer in the world would support, and would have to support, GBK, and this article would be written not by me but by a blond programmer on the other side of the ocean, under the title "Studying the English problem in Python".

Let's face the real problem. Compared with Java, Chinese-text problems in Python show up more fiercely. "Fiercely" does not mean they are more serious or harder to solve. It only means that Python handles decode and encode errors in strict mode by default, that is, it raises an error immediately, whereas Java handles them in replace mode, which is why Java output so often contains runs of "??".

In addition, Python's default encoding is ASCII (this holds for the Python 2 series discussed here), whereas Java's default encoding follows the operating system's encoding. On this point I think Java is the more reasonable of the two: it is friendlier to programmers and spares beginners some early frustration, which helps a language spread.
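To check which defaults apply on a particular machine, something like the following sketch can be used. The helper name encoding_defaults is hypothetical. On Python 2 the interpreter default is 'ascii'; on Python 3 it is 'utf-8', so the values printed depend on the version and platform in use.

```python
# Sketch: report the encodings the interpreter and the platform default to.
# The helper name encoding_defaults is made up for this example.
import locale
import sys

def encoding_defaults():
    """Collect the default encodings relevant to text conversion."""
    return {
        # Interpreter-wide default codec: 'ascii' on Python 2,
        # 'utf-8' on Python 3.
        "interpreter": sys.getdefaultencoding(),
        # What the operating system locale prefers, for example
        # cp936 (GBK) on a Chinese-language Windows installation.
        "locale": locale.getpreferredencoding(False),
        # The encoding of the console stream, if it reports one.
        "stdout": getattr(sys.stdout, "encoding", None),
    }

for name, value in encoding_defaults().items():
    print(name, value)
```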

However, Python has its reasons too. After all, ASCII is the only character set supported on every platform in the world, and since the problem will surface sooner or later, it is better to face it early than to dodge it. Now let's look at the symptoms of the Chinese problem in Python. Before that, we need to understand that Python has two kinds of strings: ordinary strings, in which each character is represented by 8 bits, and Unicode strings, in which each character is represented by one or more bytes.
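The practical effect of having two string types is that one counts bytes and the other counts characters. A minimal sketch, assuming Python 3, where bytes stands in for Python 2's ordinary string and str stands in for Python 2's Unicode string (the sample text is illustrative):

```python
# Sketch (Python 3): bytes stands in for Python 2's ordinary str,
# and str stands in for Python 2's unicode. The sample text is illustrative.

text = "中文a"                  # character string: 3 characters
raw = text.encode("gb2312")    # byte string: 2 + 2 + 1 = 5 bytes

print(len(text))               # 3
print(len(raw))                # 5

# Decoding with the same codec restores the original characters.
round_trip = raw.decode("gb2312")
print(round_trip == text)      # True
```

In GB2312 each Chinese character occupies two bytes and each ASCII character one, which is why the two lengths differ.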

The two types can be converted into each other. More thorough descriptions exist elsewhere, so I will not go into detail here. Let's look at the following code:

 
 
  
  # -*- coding: gb2312 -*-  # must be on the first or second line
  print "------------- code 1 ----------------"
  a = "中文a我爱你"
  print a
  print a.find("我")                 # position counted in bytes
  b = a.replace("爱", "喜欢")
  print b
  print "-------------- code 2 ----------------"
  x = "中文a我爱你"
  y = unicode(x, "gb2312")           # decode the byte string into a Unicode string
  print y.encode("gb2312")
  print y.find(u"我")                # position counted in characters
  z = y.replace(u"爱", u"喜欢")
  print z.encode("gb2312")
  print "--------------- code 3 ----------------"
  print y

Because the source file contains non-ASCII characters, its first line carries an encoding declaration; see PEP 263 (PEP stands for Python Enhancement Proposal). The PEP makes it clear that Python recognized the internationalization problem and proposed a solution. With the declaration in place as the PEP requires, running the code produces the following output:

  ------------- code 1 ----------------
  中文a我爱你
  5
  中文a我喜欢你
  -------------- code 2 ----------------
  中文a我爱你
  3
  中文a我喜欢你
  --------------- code 3 ----------------
  Traceback (most recent call last):
    File "G:\Downloads\eclipse\workspace\p\src\hello.py", line 16, in <module>
      print y
  UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

The output shows that once the encoding declaration is in place, Chinese text can be used normally, and in code 1 and code 2 the console prints the Chinese correctly. However, the code above also exposes several problems:
1. Code 1 and code 2 print in different ways: code 1 prints the string directly, whereas code 2 encodes it before printing.
2. Code 1 and code 2 search for the same character in the same string, yet the results differ (5 and 3).
3. Printing the Unicode string y directly in code 3 raises an error, which is why code 2 has to encode before printing.
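These three observations can be reproduced in a short sketch. It is written for Python 3 as an assumption (the listing above is Python 2), with bytes playing the role of the ordinary string and str the role of the Unicode string. The sample string is also an assumption, chosen so that the searched character is preceded by two double-byte characters and one single-byte character.

```python
# Sketch (Python 3): bytes plays the role of Python 2's str, and str the
# role of Python 2's unicode. The sample string is an assumption.

text = "中文a我爱你"               # like the Unicode string y
raw = text.encode("gb2312")       # like the ordinary strings a and x

# Observation 2: the same search yields different positions.
byte_index = raw.find("我".encode("gb2312"))  # counts bytes
char_index = text.find("我")                  # counts characters
print(byte_index, char_index)                 # 5 3

# Observation 3: as the traceback indicates, printing y went through the
# ASCII codec, which cannot represent the Chinese characters.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error_message = str(exc)
    print(error_message)

# Observation 1: encoding explicitly with a codec that covers the text
# succeeds, which is what code 2 does before printing.
explicit = text.encode("gb2312")
print(len(explicit))                          # 11
```

The byte-based search returns 5 because the two Chinese characters before "a" take two bytes each in GB2312, whereas the character-based search counts each of them once.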

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.