"Turn" python Learning (-python) character encoding

Source: Internet
Author: User
Tags: chr

Reposted from http://www.cnblogs.com/BeginMan/p/3166363.html

One, the difference between ASCII, Unicode, and UTF-8 in character encoding

Further reading: http://www.cnblogs.com/kingstarspe/p/ASCII.html

Another related blog post: http://www.cnblogs.com/huxi/archive/2010/12/05/1897271.html
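As a rough illustration of the difference: Unicode assigns each character a number (its code point), ASCII covers only the first 128 of those numbers, and UTF-8 is one way of turning code points into bytes. The sketch below is not taken from the linked posts; the helper name describe is made up for this example, and the code is written so that it should run on both Python 2 and Python 3.

```python
def describe(ch):
    """Return (code point, number of UTF-8 bytes) for a single character."""
    return ord(ch), len(ch.encode('utf-8'))

# 'A' is part of ASCII: code point 65, a single byte in UTF-8.
ascii_info = describe(u'A')

# U+4E2D (a Chinese character) is outside ASCII: three bytes in UTF-8.
cjk_info = describe(u'\u4e2d')

print(ascii_info)
print(cjk_info)
```

The same character therefore has one Unicode code point but may occupy a different number of bytes depending on the encoding chosen.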

Two, Unicode and ASCII

Python can handle both Unicode and ASCII strings. To make the two look as similar as possible, Python strings changed from a simple type into real objects. In Python 2, an ASCII string is of type StringType and a Unicode string is of type UnicodeType. They are written as follows:

"Hello World"    #ASCII string'Hello World' >>> u'Hello World# Unicode stringu'Hello              World 

1, str() and chr() handle only ASCII strings; chr() accepts only an integer from 0 to 255 as its argument. If a Unicode string is passed to str(), it is automatically converted to ASCII first.

Reason: Unicode supports far more characters than ASCII, so an exception is raised if the value passed to str() or chr() contains characters that do not exist in ASCII.

2, unicode() and unichr() can be regarded as the Unicode versions of str() and chr().

>>> unicode('Hello World')
u'Hello World'
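The exception mentioned in point 1 can be reproduced with an explicit ASCII encode. In Python 2, calling str() on a Unicode string performs this conversion implicitly; the sketch below makes it explicit with encode('ascii') so that it should behave the same way on Python 2 and Python 3. The helper name to_ascii_or_none is hypothetical and exists only for this example.

```python
def to_ascii_or_none(text):
    """Encode text as ASCII; return None if it contains non-ASCII characters."""
    try:
        return text.encode('ascii')
    except UnicodeEncodeError:
        # Raised when a character has no representation in ASCII.
        return None

plain = to_ascii_or_none(u'Hello World')    # every character is ASCII
failed = to_ascii_or_none(u'caf\u00e9')     # U+00E9 does not exist in ASCII

print(plain is not None)
print(failed is None)
```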

Three, encoding and decoding

Encoding (encode()) and decoding (decode()) solve the problem of converting between Unicode strings and bytes without producing garbled text.

A codec (coder/decoder) specifies the encoding scheme to use, for example UTF-8.
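Why the choice of codec matters can be seen in a small sketch: the same text produces different bytes under different codecs, and decoding with a codec other than the one used for encoding yields garbled text. The sample string and the use of Latin-1 as the "wrong" codec are assumptions chosen for illustration.

```python
text = u'caf\u00e9'

utf8_bytes = text.encode('utf-8')        # U+00E9 becomes the two bytes C3 A9
latin1_bytes = text.encode('latin-1')    # U+00E9 becomes the single byte E9

# Decoding with the codec that was used for encoding restores the text.
round_trip = utf8_bytes.decode('utf-8')

# Decoding UTF-8 bytes as Latin-1 raises no error but gives garbled text.
garbled = utf8_bytes.decode('latin-1')

print(round_trip == text)
print(garbled == text)
print(len(text), len(garbled))
```

The garbled result is one character longer than the original, because each of the two UTF-8 bytes was interpreted as a separate Latin-1 character.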

"""Writes a Unicode string to a disk file, and then reads it out and displays it;Write the time with UTF-8, read also use UTF-8."""CODEC =‘Utf-8‘FILE =‘Demo.txt‘Strin = u‘Beginman'll be a great coder'Byte_strin = Strin.  Encode(CODEC) # encoded with uft-8 f = open (file,'w') f.write (Byte_strin) f.close () F = open (file, 'r') str = F.read () f.close () str_out = str.decode(CODEC) # decode with Utf-8 print str_out # output: Beginman'll be a great coder         

Notes:

1, string literals that appear in the program must be prefixed with u

'博客园cnblog'        # Don't write it like this; it is easily garbled
s = u'博客园cnblog'   # Correct

2, avoid the str() function; use unicode() instead where possible

3, do not use the outdated string module

4, there is no need to encode or decode Unicode strings inside the program; encoding and decoding are generally needed only when working with files, databases, networks, and so on.

5, string formatting

>>> '%s %s' % ('Begin', 'Mans')
'Begin Mans'
# Recall from the previous blog post about strings: "combining an ordinary string with a Unicode string gives a Unicode string"
>>> u'%s %s' % (u'Begin', u'Mans')
u'Begin Mans'
>>> u'%s %s' % ('Begin', 'Mans')
u'Begin Mans'
>>> '%s %s' % (u'Begin', 'Man')
u'Begin Man'
>>> '%s %s' % ('Begin', u'Man')
u'Begin Man'
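Note 4 in the list above is often summarized as "decode on input, work with Unicode, encode on output". A minimal sketch of that pattern follows; the function name shout and the sample data are hypothetical, and the incoming bytes simply stand in for data read from a file, database, or network connection.

```python
def shout(raw_bytes, codec='utf-8'):
    """Decode incoming bytes, process the text as Unicode, encode the result."""
    text = raw_bytes.decode(codec)       # decode once, at the input boundary
    processed = text.upper() + u'!'      # all processing happens on Unicode
    return processed.encode(codec)       # encode once, at the output boundary

incoming = u'na\u00efve'.encode('utf-8')   # stands in for bytes from outside
outgoing = shout(incoming)

print(len(incoming), len(outgoing))
```

Because the uppercase conversion runs on the decoded text, the non-ASCII letter is handled as one character rather than as two unrelated bytes.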
