Python 2.7:

1. Unicode unifies the character sets of all languages in a single standard; most common characters are represented with two bytes.
Example 1: when Notepad opens a file, the computer first converts the UTF-8 bytes read from disk to Unicode in memory; when the file is saved, the Unicode text is converted back to UTF-8 and written to the file.
Example 2: when you browse a web page, the server converts the Unicode text to UTF-8 before sending it to the browser.

2. When Unicode is encoded as UTF-8, one Chinese character becomes three bytes; encoded as GBK, one Chinese character becomes two bytes.

>>> u'ABC'.encode('utf-8')
'ABC'
>>> u'中文'.encode('utf-8')
'\xe4\xb8\xad\xe6\x96\x87'
>>> len(u'ABC')
3
>>> len('ABC')
3
>>> len(u'中文')
2
>>> len('\xe4\xb8\xad\xe6\x96\x87')
6
>>> u'中文'.encode('gb2312')
'\xd6\xd0\xce\xc4'

In the other direction, a UTF-8 encoded string 'xxx' is converted to a Unicode string u'xxx' with the decode('utf-8') method:

>>> 'abc'.decode('utf-8')
u'abc'
>>> '\xe4\xb8\xad\xe6\x96\x87'.decode('utf-8')
u'\u4e2d\u6587'
>>> print '\xe4\xb8\xad\xe6\x96\x87'.decode('utf-8')
中文

3. In Python 3, strings are Unicode.
Python 2 example:

>>> ord('中')
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    ord('中')
TypeError: ord() expected a character, but string of length 2 found
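The error above can be approximated in Python 3 by working with explicit bytes. This is only a sketch: it assumes the Python 2 shell was using a GBK-family encoding (which would explain the reported length of 2), and the variable names are illustrative.

```python
# Python 3 sketch. Assumption: the Python 2 shell stored '中' as GB2312/GBK bytes.
raw = '中'.encode('gb2312')
print(len(raw))  # two bytes for one Chinese character

try:
    ord(raw)     # ord() accepts a single character or a single byte only
    failed = False
except TypeError as exc:
    failed = True
    print(exc)

# Decoding the bytes first yields a one-character string that ord() accepts.
code_point = ord(raw.decode('gb2312'))
print(code_point)
```

In other words, the Python 2 literal '中' was a byte string, so ord() saw two bytes rather than one character.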
Python 3 example:

>>> ord('中')
20013

In Python 3, str is Unicode. A str can be encoded to bytes in a specified encoding with the encode() method, for example:

>>> 'ABC'.encode('ascii')
b'ABC'
>>> '中文'.encode('utf-8')
b'\xe4\xb8\xad\xe6\x96\x87'

There is no need to put u in front of a Chinese string literal.
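The points above can be pulled together in one short Python 3 sketch: character count versus byte count, and the decode-on-read, encode-on-save flow from the Notepad example. The file name note.txt and the variable names are assumptions made for illustration.

```python
import os
import tempfile

text = '中文'                         # a Python 3 str is Unicode
utf8_bytes = text.encode('utf-8')    # three bytes per Chinese character
gbk_bytes = text.encode('gb2312')    # two bytes per Chinese character

print(len(text), len(utf8_bytes), len(gbk_bytes))

# decode() reverses encode() when the same encoding is used.
round_trip = utf8_bytes.decode('utf-8')

# Saving encodes Unicode to UTF-8; reading decodes UTF-8 back to Unicode.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'note.txt')  # hypothetical file name
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)
    with open(path, 'rb') as f:
        on_disk = f.read()                # raw bytes as stored in the file
    with open(path, encoding='utf-8') as f:
        in_memory = f.read()              # decoded str in memory

print(on_disk == utf8_bytes, in_memory == text)
```

Passing encoding explicitly to open() avoids depending on the platform default, which may not be UTF-8.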