Python Chinese Encoding (Part 1)

Source: Internet
Author: User

While learning Python, the second problem I ran into was garbled Chinese text. I have only just gotten past the beginner stage, so here I share my experience, which may also serve as a guide for other newcomers.

In this article I focus on one idea: whatever comes in must also go out. Where does the data come from, and where does it go?

1. Chinese in the Windows cmd terminal

    C:\Documents and Settings\admin>python
    Python 2.7.7 (default, Jun  1 2014, 14:17:13) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> s = '我是中文'
    >>> ss = u'我真的是中文'
    >>> s
    '\xce\xd2\xca\xc7\xd6\xd0\xce\xc4'
    >>> ss
    u'\u6211\u771f\u7684\u662f\u4e2d\u6587'
    >>> print s
    我是中文
    >>> print ss
    我真的是中文
    >>>

In this case neither the input nor the output is garbled, whether or not the string carries the u prefix.

1) Where does the input come from? The terminal.
2) What is the input encoding? For one string we do not know; the other is Unicode.
3) What is the output encoding? We do not know.

2. Executing a .py file in Windows cmd

Let's look at the code in test.py:

    #coding:utf-8
    s = 'abc我是中文字符串'
    ss = u'我也是中文字符串'
    print s
    print repr(s)
    print ss
    print repr(ss)

The file is saved as UTF-8 without a BOM (file encodings are discussed later). Run it in the cmd terminal:

    D:\code>python test.py
    abc鎴戞槸涓枃瀛楃涓
    'abc\xe6\x88\x91\xe6\x98\xaf\xe4\xb8\xad\xe6\x96\x87\xe5\xad\x97\xe7\xac\xa6\xe4\xb8\xb2'
    我也是中文字符串
    u'\u6211\u4e5f\u662f\u4e2d\u6587\u5b57\u7b26\u4e32'
    D:\code>

Garbled text? How can that be? It is enough to drive you crazy. But hold on, let's analyze it step by step:

1) Where does the input come from? Obviously, from the file.
2) What is the input encoding? One string is UTF-8, the other is Unicode.
3) What is the output encoding? We do not know, but apparently it is not UTF-8.
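To make the mismatch concrete, here is a minimal sketch of what probably happens to the first string. It is written for Python 3, where bytes plays the role of Python 2's str and str plays the role of unicode. It assumes the cmd console interprets raw bytes as GBK (cp936), which the article does not state explicitly; the variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
# Python 3 sketch: bytes ~ Python 2 str, str ~ Python 2 unicode.
text = 'abc我是中文字符串'

# The bytes that a UTF-8 source file holds for this literal.
utf8_bytes = text.encode('utf-8')

# Assumption: the console reads raw bytes as GBK (cp936).
# errors='replace' is needed because some of these bytes are not valid GBK.
garbled = utf8_bytes.decode('gbk', errors='replace')

# Decoding with the encoding the bytes were written in recovers the text.
recovered = utf8_bytes.decode('utf-8')

# ascii() prints escapes, much like repr() does in the Python 2 sessions.
print(ascii(utf8_bytes))
print(ascii(garbled))
print(ascii(recovered))
```

The ASCII prefix survives because both encodings agree on ASCII; only the multi-byte Chinese characters are misread.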
Do you see the pattern?

    UTF-8   ------> output encoding ------> garbled
    Unicode ------> output encoding ------> not garbled

So if we convert the text to Unicode before printing it, will the garbling go away? Let's give it a try.

    #coding:utf-8
    s = 'abc我是中文字符串'
    ss = u'我也是中文字符串'
    print s
    print repr(s)
    # decode the byte string to unicode
    uu = s.decode('utf-8')
    print uu
    print repr(uu)
    print ss
    print repr(ss)

Take a look at the results:

    D:\code>python test.py
    abc鎴戞槸涓枃瀛楃涓
    'abc\xe6\x88\x91\xe6\x98\xaf\xe4\xb8\xad\xe6\x96\x87\xe5\xad\x97\xe7\xac\xa6\xe4\xb8\xb2'
    abc我是中文字符串
    u'abc\u6211\u662f\u4e2d\u6587\u5b57\u7b26\u4e32'
    我也是中文字符串
    u'\u6211\u4e5f\u662f\u4e2d\u6587\u5b57\u7b26\u4e32'
    D:\code>

Sure enough, the decoded string is no longer garbled. One problem is solved, but that is not enough, because others remain.

3. Interacting with users in Windows cmd

Our code has to cope with a variety of environments. The same script may be executed in cmd, in IDLE, or under Linux, and we want the program to behave the way we intend in all of them. The first requirement is that nothing gets garbled.

Suppose we have a script that asks the user for input and we run it in cmd. We need to know exactly which encoding the input arrives in. Only when we know the input encoding can we decode it to Unicode and avoid garbled characters.
So, in cmd, what is the input encoding?
Before answering that, let's look at decode and encode.
1) decode: given a byte string whose encoding is known, convert it to Unicode. For example, s.decode('utf-8') returns a unicode object.
2) encode: given a Unicode string, convert it to another encoding. For example, u.encode('utf-8') returns a UTF-8 byte string.
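The two operations can be sketched as a round trip through Unicode. This is a Python 3 sketch (bytes in place of Python 2's str, str in place of unicode); the helper name transcode is hypothetical and not part of the original article.

```python
# -*- coding: utf-8 -*-
# Python 3 sketch: bytes ~ Python 2 str, str ~ Python 2 unicode.
text = '我是中文'

# encode: Unicode -> bytes in a specific encoding
gbk_bytes = text.encode('gbk')
utf8_bytes = text.encode('utf-8')

# decode: bytes in a known encoding -> Unicode
from_gbk = gbk_bytes.decode('gbk')
from_utf8 = utf8_bytes.decode('utf-8')


def transcode(data, source_encoding, target_encoding):
    """Convert bytes between encodings by going through Unicode."""
    return data.decode(source_encoding).encode(target_encoding)


# ascii() shows the same kind of escapes as repr() in the Python 2 sessions.
print(ascii(gbk_bytes))
print(ascii(utf8_bytes))
print(ascii(transcode(gbk_bytes, 'gbk', 'utf-8')))
```

The same text yields different bytes in each encoding, which is why a byte string is only meaningful together with the name of its encoding.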
With that covered, the answer takes only a word: sys.stdin.encoding. Its counterpart for output is, of course, sys.stdout.encoding. Let's look at the code:

    #coding:utf-8
    import sys
    s = raw_input()
    print s
    print repr(s)
    u = s.decode(sys.stdin.encoding)
    print u
    print repr(u)
    o = u.encode(sys.stdout.encoding)
    print o
    print repr(o)

Run in cmd:

    D:\code>python test.py
    我是中文
    我是中文
    '\xce\xd2\xca\xc7\xd6\xd0\xce\xc4'
    我是中文
    u'\u6211\u662f\u4e2d\u6587'
    我是中文
    '\xce\xd2\xca\xc7\xd6\xd0\xce\xc4'
    D:\code>

Run in IDLE:

    Python 2.7.7 (default, Jun  1 2014, 14:17:13) [MSC v.1500 32 bit (Intel)] on win32
    Type "copyright", "credits" or "license()" for more information.
    >>> ================================ RESTART ================================
    >>>
    我是中文
    我是中文
    '\xce\xd2\xca\xc7\xd6\xd0\xce\xc4'
    我是中文
    u'\u6211\u662f\u4e2d\u6587'
    我是中文
    '\xce\xd2\xca\xc7\xd6\xd0\xce\xc4'
    >>>

Run in Linux:

    [email protected]:~/Desktop# python test.py
    我是中文
    我是中文
    '\xe6\x88\x91\xe6\x98\xaf\xe4\xb8\xad\xe6\x96\x87'
    我是中文
    u'\u6211\u662f\u4e2d\u6587'
    我是中文
    '\xe6\x88\x91\xe6\x98\xaf\xe4\xb8\xad\xe6\x96\x87'

Summary: if you know where the data comes from and where it is going, you can be sure it will get there intact.
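The "decode on the way in, encode on the way out" pattern could be wrapped in small helpers along these lines. This is a Python 3 sketch under stated assumptions: the function names are hypothetical, and the fallback to UTF-8 when a stream reports no encoding is my own choice, not something the article prescribes.

```python
import sys


def stream_encoding(stream, default='utf-8'):
    """Return the stream's declared encoding, or `default` if it has none."""
    # Assumption: UTF-8 is an acceptable fallback (e.g. for piped streams).
    return getattr(stream, 'encoding', None) or default


def decode_input(raw, encoding):
    """Turn incoming bytes into Unicode; pass already-decoded text through."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


def encode_output(text, encoding):
    """Turn Unicode into bytes for the destination, replacing unencodable characters."""
    return text.encode(encoding, errors='replace')


in_enc = stream_encoding(sys.stdin)
out_enc = stream_encoding(sys.stdout)
print('stdin encoding: ', in_enc)
print('stdout encoding:', out_enc)
```

Inside the program everything stays Unicode; the encodings only matter at the two boundaries, which is why the same script gives GBK bytes in cmd and UTF-8 bytes under Linux.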
