Python has two string types, str and unicode. Both can hold characters, but they are different types: a str holds the encoded bytes of the characters, while a unicode holds the characters themselves. This distinction matters, and it is why encode and decode exist.
The roles of encode and decode in Python can be pictured as:
              encode
unicode -------------------------> str
unicode <------------------------- str
              decode
Several common methods:
str_string.decode('codec') converts str_string to unicode_string; codec is the encoding of the source str_string.
unicode_string.encode('codec') converts unicode_string to str_string; codec is the encoding of the target str_string.
str_string.decode('from_codec').encode('to_codec') converts between str_strings in different encodings.
For example, in a session on a terminal that uses GB2312:
>>> t = '长城'
>>> t
'\xb3\xa4\xb3\xc7'
>>> t.decode('gb2312').encode('utf-8')
'\xe9\x95\xbf\xe5\x9f\x8e'
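The decode-then-encode step above can be wrapped in a small helper. This is only a sketch: the name transcode is hypothetical and not part of Python, and the u'' and b'' literals are used so that the snippet behaves the same on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-

def transcode(data, from_codec, to_codec):
    """Re-encode a byte string from one codec to another (hypothetical helper)."""
    text = data.decode(from_codec)   # bytes -> unicode text
    return text.encode(to_codec)     # unicode text -> bytes in the target codec

gb_bytes = b'\xb3\xa4\xb3\xc7'       # the GB2312 bytes from the session above
utf8_bytes = transcode(gb_bytes, 'gb2312', 'utf-8')
round_trip = transcode(utf8_bytes, 'utf-8', 'gb2312')
```

Going through unicode in the middle is what makes the conversion independent of any particular pair of codecs: each codec only needs to know how to map to and from unicode.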
str_string.encode('codec') first uses the system's default codec to decode str_string into a unicode_string, and then encodes that result into the requested codec. It is equivalent to str_string.decode('sys_codec').encode('codec').
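To see why that hidden decode step matters, the two steps can be spelled out explicitly. The sketch below assumes the default codec is 'ascii', which is what Python 2 uses unless it has been changed; explicit_encode is a hypothetical name used only for illustration.

```python
def explicit_encode(data, codec, sys_codec='ascii'):
    """Do by hand what str_string.encode(codec) does implicitly (hypothetical helper)."""
    # Step 1: decode with the default codec. Step 2: encode to the target codec.
    return data.decode(sys_codec).encode(codec)

# Pure ASCII bytes survive the implicit decode.
ascii_result = explicit_encode(b'abc', 'utf-8')

# GB2312 bytes do not: the 'ascii' decode in step 1 fails before any encoding happens.
try:
    explicit_encode(b'\xb3\xa4\xb3\xc7', 'utf-8')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

This is why calling encode on a non-ASCII str typically reports a decode error rather than an encode error.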
unicode_string.decode('codec') is essentially meaningless. Inside Python a unicode object uses a single Unicode encoding, UTF-16 or UTF-32 (fixed when Python is compiled), so there is nothing to transcode.
Note: The default codec is set in the sitecustomize.py file under site-packages, for example:
import sys
sys.setdefaultencoding('utf-8')
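To check which default codec is actually in effect, a minimal sketch using only sys.getdefaultencoding() and codecs.lookup() from the standard library could look like this. The variable names are illustrative, and the value printed depends on the Python version and on any sitecustomize.py in use.

```python
import codecs
import sys

# The codec Python falls back on for implicit str/unicode conversions.
default_codec = sys.getdefaultencoding()

# codecs.lookup() resolves aliases such as 'UTF8' to the codec's canonical name.
canonical_name = codecs.lookup(default_codec).name

print('default codec: %s (canonical name: %s)' % (default_codec, canonical_name))
```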
Python transcoding tips