Excerpt: Python represents strings internally as Unicode, so converting between encodings normally uses Unicode as the intermediate form: first decode the source string into Unicode (decode), then encode the Unicode string into the target encoding (encode).

decode converts a string in some other encoding into Unicode. For example, str1.decode('gb2312') converts the gb2312-encoded string str1 into Unicode. encode converts a Unicode string into another encoding. For example, str2.encode('gb2312') converts the Unicode string str2 into gb2312. So, when transcoding, first establish which encoding the string str is in, then decode it into Unicode, and then encode it into the target encoding.

When a source file is created without an explicit encoding, it is typically saved in the system default encoding, and a plain string literal takes on the encoding of the file that contains it. For example, with s = '中文', the string is UTF-8 encoded in a UTF-8 file and gb2312 encoded in a gb2312 file. If the string is instead defined as s = u'中文', it is a Unicode string, that is, it uses Python's internal representation, independent of the encoding of the source file. In this case, transcoding only requires calling encode with the target encoding.

If a string is already Unicode, calling decode on it raises an error, so it is common to check first whether the string is Unicode:

    isinstance(s, unicode)  # True if s is already a Unicode string

Likewise, calling encode on a non-Unicode str raises an error. In Python 2, both calls trigger an implicit conversion using the default encoding (usually ascii), so they fail as soon as the string contains non-ASCII characters.

How to get the system's default encoding:

    #!/usr/bin/env python
    # coding=utf-8
    import sys
    print sys.getdefaultencoding()
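The decode-then-encode round trip described above can be sketched as follows. This is a minimal illustration written for Python 3, which is an assumption on my part: the excerpt targets Python 2, where the byte type is str and the text type is unicode, whereas in Python 3 they are bytes and str respectively. The helper name transcode is hypothetical and not part of any library.

```python
# Python 3 sketch: bytes plays the role of Python 2's str,
# and str plays the role of Python 2's unicode.

def transcode(data, source_encoding, target_encoding):
    """Re-encode data via Unicode (hypothetical helper for illustration)."""
    if isinstance(data, str):
        # Already Unicode text, so the decode step is skipped.
        text = data
    else:
        # Encoded bytes: decode into Unicode first.
        text = data.decode(source_encoding)
    # Encode the Unicode text into the target encoding.
    return text.encode(target_encoding)


text = "\u4e2d\u6587"                  # the two characters of the word "Chinese"
gb_bytes = text.encode("gb2312")       # Unicode -> gb2312
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")  # gb2312 -> Unicode -> UTF-8

print(len(gb_bytes), len(utf8_bytes))  # gb2312 uses 2 bytes per character here, UTF-8 uses 3
```

The isinstance check mirrors the advice above: test whether the value is already Unicode before deciding whether a decode step is needed, rather than calling decode unconditionally.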
The difference between decode and encode in Python