1: Python and Unicode
To handle multilingual text correctly, Python introduced a Unicode string type (unicode) starting with version 2.0, alongside the traditional byte string (str).
2: print in Python
Although Python converts text to Unicode internally for processing, what the terminal displays is a traditional Python byte string (in fact, the print statement cannot output raw Unicode code points at all; they have to be encoded into bytes first).
print automatically encodes a Unicode string before output (byte strings in any other, non-Unicode encoding are output as is), whereas a file object's write method does not. As a result, the same string can look different when it is shown with print than when it is written to a file with write.
On Linux, the conversion is based on the locale environment variables, which you can inspect with the locale command. The print statement hands its output to the operating system, and the operating system interprets the incoming byte stream according to the system's encoding.
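The locale settings mentioned above can also be inspected from inside Python. The following is a minimal sketch, assuming only the standard locale and sys modules; the helper name describe_encodings is hypothetical, and the values it reports depend entirely on the machine it runs on. It is written so that it runs on Python 2.7 as well as Python 3.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings Python derives from the environment (hypothetical helper)."""
    return {
        # Encoding implied by the locale settings (LANG, LC_ALL, LC_CTYPE).
        'preferred': locale.getpreferredencoding(),
        # Encoding print uses for this stream; can be None in Python 2 when output is piped.
        'stdout': getattr(sys.stdout, 'encoding', None),
        # Codec used for implicit conversions between byte strings and Unicode strings.
        'default': sys.getdefaultencoding(),
    }


for name, value in sorted(describe_encodings().items()):
    print('%s = %s' % (name, value))
```

If the stdout value differs from the encoding of the bytes being printed, garbled output on the terminal is the likely result.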
>>> str = '学习python'
>>> str
'\xe5\xad\xa6\xe4\xb9\xa0python'   # byte string (UTF-8 bytes, not ASCII)
>>> print str
学习python
>>> str = u'学习python'
>>> str   # Unicode string
u'\u5b66\u4e60python'
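Because write does not encode for you the way print does, it is usually safer to choose the encoding explicitly when writing Unicode text to a file. The following is a minimal sketch, assuming the standard io module (available since Python 2.6) and UTF-8 as the target encoding; the helper names write_text and read_bytes and the file name demo.txt are hypothetical. It runs on Python 2.7 and Python 3.

```python
import io
import os
import tempfile

# The two Chinese characters from the session above, followed by "python".
text = u'\u5b66\u4e60python'


def write_text(path, value, encoding='utf-8'):
    """Write a Unicode string to path using an explicit encoding."""
    # io.open encodes on write, so the bytes on disk do not depend on the locale.
    with io.open(path, 'w', encoding=encoding) as handle:
        handle.write(value)


def read_bytes(path):
    """Return the raw bytes stored in path."""
    with io.open(path, 'rb') as handle:
        return handle.read()


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name
write_text(path, text)
raw = read_bytes(path)
print(repr(raw))

# Clean up the temporary file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```

The bytes read back are the same UTF-8 bytes that the first REPL example shows for the byte string, whatever the terminal's locale happens to be.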
3: decode in Python
decode converts a byte string in another character set to Unicode (only non-ASCII text, such as Chinese characters, actually needs converting).
>>> str = '学习'
>>> ustr = str.decode('utf-8')
>>> ustr
u'\u5b66\u4e60'
Once the Chinese text has been decoded this way, Python can process it further. (If it is not converted explicitly, Python falls back to a default encoding derived from the machine's environment, which may produce garbled text.)
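The effect of choosing the right or wrong codec when decoding can be sketched as follows. This is an illustration only, assuming the bytes are the UTF-8 form of the two characters used above; the helper name try_decode is hypothetical. It runs on Python 2.7 and Python 3.

```python
# UTF-8 bytes for the two Chinese characters used in the session above.
raw = b'\xe5\xad\xa6\xe4\xb9\xa0'


def try_decode(data, encoding):
    """Decode data with the given codec, returning None if the bytes do not fit it."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


good = try_decode(raw, 'utf-8')       # correct codec: two characters
garbled = try_decode(raw, 'latin-1')  # wrong codec: every byte becomes its own character
failed = try_decode(raw, 'ascii')     # ASCII cannot represent bytes above 0x7f

print(repr(good))
print(len(garbled))
print(failed)
```

Decoding with latin-1 does not raise an error, it silently yields six unrelated characters, which is how garbled text typically arises.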
4: encode in Python
encode converts a Unicode string to a byte string in another character set.
>>> str = '学习'
>>> ustr = str.decode('utf-8')
>>> ustr
u'\u5b66\u4e60'
>>> ustr.encode('utf-8')
'\xe5\xad\xa6\xe4\xb9\xa0'
>>> print ustr.encode('utf-8')
学习
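The same Unicode string can be encoded into different character sets, and the resulting bytes differ. The following is a minimal sketch, assuming the utf-8, gbk and ascii codecs that ship with Python; the helper name encode_or_none is hypothetical. It runs on Python 2.7 and Python 3.

```python
# The two Chinese characters from the session above, as a Unicode string.
text = u'\u5b66\u4e60'


def encode_or_none(value, encoding):
    """Encode value with the given codec, returning None if it cannot be represented."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None


utf8_bytes = encode_or_none(text, 'utf-8')    # three bytes per character here
gbk_bytes = encode_or_none(text, 'gbk')       # two bytes per character here
ascii_bytes = encode_or_none(text, 'ascii')   # not representable in ASCII

print(repr(utf8_bytes))
print(len(gbk_bytes))
print(ascii_bytes)
```

Decoding each result with the codec that produced it gives back the original Unicode string, which is why the encoding has to be known on both sides.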