Here we use chardet, a character encoding detection module, which can be installed directly with easy_install.
Environment: Windows 7; IDE: Wing IDE
1. Use the default environment encoding
    # View system encoding
    import sys
    print 'System encoding: ', sys.getdefaultencoding()

    # View string encoding
    import chardet
    s = 'Hello s'
    print s
    print chardet.detect(s)
The output is:
    System encoding:  ascii
    Hello s
    {'confidence': 0.99, 'encoding': 'gb2312'}
Here we can see that the system default encoding is ASCII, while the IDE's encoding is GB2312, so the string is displayed correctly.
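To illustrate why the encoding used for the source file matters, here is a minimal sketch in Python 3 syntax (the article's own code is Python 2), using only the standard library. The sample text is an assumption chosen for illustration. The same characters map to different byte sequences under GB2312 and UTF-8, and that difference is what a detector such as chardet relies on.

```python
# Assumed sample text for illustration: two Chinese characters plus an ASCII letter.
text = "\u4f60\u597ds"  # "你好s"

gb_bytes = text.encode("gb2312")   # 2 bytes per Chinese character
utf8_bytes = text.encode("utf-8")  # 3 bytes per Chinese character

print(gb_bytes)
print(utf8_bytes)
print(len(gb_bytes), len(utf8_bytes))
```

The byte strings differ even though the text is identical, so a program has to know (or guess) which encoding produced the bytes before it can display them.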
2. Add the encoding declaration # coding=utf8 at the top of the file
    # coding=utf8
    # View system encoding
    import sys
    print 'System encoding: ', sys.getdefaultencoding()

    # View string encoding
    import chardet
    s = 'Hello s'
    print s
    print chardet.detect(s)
Output:
    System encoding:  ascii
    audio codecs
    {'confidence': 0.7525, 'encoding': 'utf-8'}
This time the printed string is garbled: chardet confirms that the characters are now UTF-8 encoded, yet they do not display correctly. I suspect the IDE's output encoding is the problem, but I cannot find an option to change it.
Running the file from the command line produces the same output, so the output there appears to be written using the ASCII default encoding as well.
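One plausible explanation for the garbled text, offered as an assumption rather than a confirmed diagnosis: the program writes UTF-8 bytes, while the output window or console interprets them with a GBK-family code page (a common default on a Chinese-language Windows installation). The sketch below, in Python 3 syntax with an assumed sample string, reproduces that kind of mismatch using only the standard library.

```python
text = "\u4f60\u597ds"             # assumed sample text: "你好s"
utf8_bytes = text.encode("utf-8")  # what the program would write

# Interpreting UTF-8 bytes with the wrong codec yields different characters.
garbled = utf8_bytes.decode("gbk", errors="replace")
print(ascii(garbled))              # ascii() keeps the output printable on any console
print(garbled == text)

# Decoding with the matching codec recovers the original text.
restored = utf8_bytes.decode("utf-8")
print(restored == text)
```

If this is what happens, the bytes themselves are fine and only the interpretation on the display side is wrong.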
3. Decode s before printing it
    # coding=utf8
    # View system encoding
    import sys
    print 'System encoding: ', sys.getdefaultencoding()

    # View string encoding
    import chardet
    s = 'Hello s'
    print s
    print chardet.detect(s)
    ss = s.decode('utf-8')  # decode the UTF-8 bytes into a unicode object
    print ss
    print chardet.detect(ss)
Output:
    System encoding:  ascii
    audio codecs
    {'confidence': 0.7525, 'encoding': 'utf-8'}
    Hello s
    D:\devsofts\python2.7\Lib\site-packages\chardet-2.1.1-py2.7.egg\chardet\universaldetector.py:90: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
    {'confidence': 1.0, 'encoding': 'ascii'}
Strangely, the string displays correctly after being decoded from UTF-8, but the detected encoding then changes to ASCII. This leaves a lot of open questions.
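A likely explanation, stated here as an assumption about chardet's behavior rather than something the experiment above confirms: detect() is meant for byte strings, and the result of decode() is a unicode object, which is already decoded text and has no byte encoding of its own. The 'ascii' result (and the UnicodeWarning) would then be a side effect of passing the wrong type, not a property of the string. The sketch below, in Python 3 syntax with a hypothetical helper named describe, shows the distinction between encoded bytes and decoded text.

```python
def describe(value):
    """Hypothetical helper: report whether a value is encoded bytes or decoded text."""
    if isinstance(value, bytes):
        return "bytes: has an encoding, detection makes sense"
    if isinstance(value, str):
        return "text: already decoded, no encoding to detect"
    raise TypeError("expected bytes or str")

raw = "\u4f60\u597ds".encode("utf-8")  # assumed sample: "你好s" as UTF-8 bytes
text = raw.decode("utf-8")             # decoded text (a unicode object in Python 2)

print(describe(raw))
print(describe(text))
print(len(raw), len(text))             # byte count vs. character count
```

Under this reading, encoding detection should be applied to the original byte string s, not to the decoded value ss.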