Python2
Python2 has two string types: str and unicode. A str is not really a character string; it is a byte string, made up of the bytes produced by encoding (encode) Unicode text. A unicode object is a true string, consisting of characters.
Internally, Python2 stores each unicode code unit in two bytes on narrow builds (four bytes on wide builds). The advantage of using unicode objects instead of str is that unicode text behaves the same way across platforms.
There are two ways to define a unicode object:
u1 = u'Hello'
u2 = unicode('Hello', 'utf-8')
The str.decode and unicode.encode methods are the ones used most often.
Be clear about whether you are handling a str or a unicode object, and use the matching method of the pair: str.decode turns bytes into characters, and unicode.encode turns characters into bytes.
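The pairing can be illustrated with a short round trip. This is a minimal sketch, written so that it should run unchanged on Python 2.7 and on Python3; the variable names raw, text and round_trip are illustrative, and the sample bytes are the UTF-8 encoding of two Chinese characters.

```python
# UTF-8 bytes for the two characters U+4F60 U+597D.
# On Python2 this literal is a str; on Python3 it is a bytes object.
raw = b'\xe4\xbd\xa0\xe5\xa5\xbd'

# decode: bytes -> characters (unicode on Python2, str on Python3)
text = raw.decode('utf-8')

# encode: characters -> bytes, the inverse of decode
round_trip = text.encode('utf-8')

print(len(raw))           # number of bytes
print(len(text))          # number of characters
print(round_trip == raw)  # whether the round trip preserved the data
```

The byte count and the character count differ because each of these characters occupies three bytes in UTF-8.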
The default encoding for Python2 (the default character set) is ASCII.
Note
A str can also be encoded (encode) and a unicode object can also be decoded (decode), but decoding a unicode object is rarely meaningful.
Calling encode on a str that contains non-ASCII bytes usually raises an error such as:
Error message: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
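The reason is that, when s is a str, s.encode('utf-8') first performs an implicit decode using the default encoding. Unless defaultencoding has been changed, that encoding is ASCII, so the decode fails on any non-ASCII byte.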
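The implicit step can be made explicit. The sketch below spells out the ASCII decode that Python2 performs behind the scenes before encoding; it illustrates the mechanism rather than the interpreter's actual code path, and the names s, message and text are illustrative. Writing the decode out by hand reproduces the same error, and it does so on Python3 as well.

```python
s = b'\xe4\xbd\xa0'  # UTF-8 bytes of one non-ASCII character (a str on Python2)

try:
    # Roughly what Python2's s.encode('utf-8') does for a str:
    # decode with the default (ASCII) codec first, then encode.
    s.decode('ascii').encode('utf-8')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding with the codec the bytes were actually written in works.
text = s.decode('utf-8')
print(len(text))
```

Decoding explicitly with the correct codec avoids relying on the default encoding at all.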
The ASCII default encoding is the cause of many such errors, so you can set the default encoding (defaultencoding) in your code.
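A commonly seen way to do this in Python2 is the reload(sys) idiom, sketched below. The version guard is an addition for illustration so that the snippet is harmless on Python3, where sys.setdefaultencoding does not exist and the default is already UTF-8. Changing the process-wide default is usually treated as a workaround; explicit decode and encode calls at input and output boundaries tend to be more robust.

```python
import sys

if sys.version_info[0] == 2:
    # site.py removes sys.setdefaultencoding at startup;
    # reloading the sys module makes it available again.
    reload(sys)  # reload is a builtin on Python2 only
    sys.setdefaultencoding('utf-8')

print(sys.getdefaultencoding())
```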
Python3
The default encoding for Python3 (the default character set) is UTF-8.
Python3 uses the str type for text (Unicode) strings and the bytes type for binary data streams. Use encode to convert a str to bytes, and decode to convert bytes to a str.
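A short Python3 sketch of these points follows. The variable names are illustrative; the example checks the default encoding, converts between str and bytes, and shows that Python3 refuses to mix the two types implicitly.

```python
import sys

default_encoding = sys.getdefaultencoding()
print(default_encoding)

text = 'h\u00e9llo'              # str: 5 characters, one of them non-ASCII
data = text.encode('utf-8')      # str -> bytes
print(type(data).__name__, len(data))

restored = data.decode('utf-8')  # bytes -> str
print(restored == text)

# Unlike Python2, Python3 performs no implicit conversion between the types.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)
```

Because mixing str and bytes fails immediately with a TypeError, the implicit ASCII decode described for Python2 cannot occur here.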
Comparison of strings in Python2 and Python3
| | Bytes | String | Default encoding |
| --- | --- | --- | --- |
| Python2 | str | unicode | ASCII |
| Python3 | bytes | str | UTF-8 |