1, The default encoding of Python 2 is ASCII.
2, Python 2 has two data types for string data: str and unicode.
3, Converting Unicode to another encoding is encoding (encode), and converting another encoding to Unicode is decoding (decode). Unicode is therefore the hub: for example, if you have a GBK string and want UTF-8, you must first decode it to Unicode and then encode it as UTF-8.
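As a rough illustration of the decode-then-encode route described in point 3, here is a minimal sketch. It is written so that it should run on both Python 2 and Python 3 (hence the u'' literal), and the variable names are made up for the example.

```python
# -*- coding: utf-8 -*-
# Sketch: convert GBK-encoded bytes to UTF-8 by going through Unicode.

text = u'中国'                     # Unicode text (type unicode in Python 2)
gbk_bytes = text.encode('gbk')    # Unicode -> GBK bytes (type str in Python 2)

# GBK -> UTF-8 takes two steps: decode to Unicode, then encode as UTF-8.
as_unicode = gbk_bytes.decode('gbk')
utf8_bytes = as_unicode.encode('utf-8')

# The same two characters end up as different byte sequences.
print(repr(gbk_bytes))
print(repr(utf8_bytes))
```

There is no direct GBK-to-UTF-8 call on a byte string; Unicode is always the intermediate step.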
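4, The encoding declared at the top of the file relates to how a str literal is stored. A str is a sequence of bytes, and those bytes may be UTF-8, GBK, GB2312, ASCII, and so on (assuming the file is actually saved in the declared encoding).
For example: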
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    s = '中国'
    print(type(s))

Result: <type 'str'>
As you can see, s is a str, but its bytes are actually UTF-8 encoded, because the encoding declared at the top of the file is utf-8.
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    s = '中国'
    print(type(s))
    data = s.decode('utf-8')   # str (UTF-8 bytes) -> unicode
    print(data)
    print(type(data))

Result:
    <type 'str'>
    中国
    <type 'unicode'>
As you can see, s.decode('utf-8') decodes s to Unicode; at that point data can be encoded into other formats.
For example:
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    s = '中国'
    print(type(s))
    s_unicode = s.decode('utf-8')      # UTF-8 bytes -> unicode
    s_gbk = s_unicode.encode('gbk')    # unicode -> GBK bytes
    print(s_gbk)
The code above outputs a GBK-encoded string, but it may be displayed as garbled characters, depending on your terminal. In a Windows CMD window, where the default encoding is GBK, it displays correctly; in a Linux terminal or when run from PyCharm, it will be garbled.
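The garbling comes from the terminal decoding the bytes with a different codec than the one used to encode them. The sketch below only simulates that mismatch in code; it is an illustration with made-up variable names and does not query or change any real terminal setting beyond reading sys.stdout.encoding.

```python
# -*- coding: utf-8 -*-
import sys

text = u'中国'
gbk_bytes = text.encode('gbk')

# The encoding Python believes stdout uses (can be None, e.g. when piped).
print(getattr(sys.stdout, 'encoding', None))

# Simulate a UTF-8 terminal receiving GBK bytes: invalid sequences are
# replaced, so the original text is lost.
seen_by_utf8_terminal = gbk_bytes.decode('utf-8', 'replace')

# Simulate a GBK terminal receiving the same bytes: the text survives.
seen_by_gbk_terminal = gbk_bytes.decode('gbk')

print(seen_by_utf8_terminal == text)   # False
print(seen_by_gbk_terminal == text)    # True
```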
5, As mentioned above, Python 2 uses ASCII as its default encoding, and this causes a problem: calling s.encode('utf-8') directly on a str that contains Chinese raises a decoding error.
This is puzzling: I clearly asked to encode, so why does the error talk about decoding, and with the ASCII codec at that? It comes down to Python 2's default encoding.
When encode() is called on a str, Python 2 first decodes the str to Unicode using the default encoding, ASCII. So s.encode('utf-8') is really s.decode('ascii').encode('utf-8'). But s cannot be decoded as ASCII, because its bytes are actually UTF-8, so the decode step fails with an error.
This is the awkward part of the default encoding.
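To make the hidden step visible, the sketch below spells out the decode('ascii') call explicitly instead of relying on Python 2 to do it implicitly. Writing it this way is an assumption made so that the example behaves the same on Python 2 and Python 3; the variable names are illustrative.

```python
# -*- coding: utf-8 -*-

# The same bytes a str literal '中国' holds in a UTF-8 source file.
utf8_bytes = u'中国'.encode('utf-8')

# Explicit form of what Python 2 does for s.encode('utf-8') on a str:
# decode with the default codec (ASCII) first, then encode.
try:
    utf8_bytes.decode('ascii').encode('utf-8')
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print(type(exc).__name__)

# Decoding with the encoding the bytes really use works fine.
round_trip = utf8_bytes.decode('utf-8').encode('utf-8')
print(round_trip == utf8_bytes)
```

The fix is therefore to decode explicitly with the real encoding before encoding to anything else.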
6, File operations
File operations in Python 2 often raise errors, usually because we don't really understand what is going on. So here are my own rough thoughts.
For file operations I recommend the codecs module; it is very convenient. codecs provides an open() function that lets you specify the encoding.
When a file is opened this way, read() returns Unicode. When writing, if the argument to write() is Unicode, it is written using the encoding the file was opened with; if it is a str, it is first decoded to Unicode using the default encoding and then written in the file's encoding.
Note that if the str contains Chinese and the default encoding, sys.getdefaultencoding(), is ASCII, a decoding error is raised.
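A minimal sketch of the codecs.open() usage described above. The temporary file path is a hypothetical scratch location chosen for the example, and the code is written to run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

text = u'中国'
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')  # hypothetical scratch file

# Write Unicode; codecs encodes it with the encoding given to open().
with codecs.open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Read it back; codecs decodes the bytes and returns Unicode.
with codecs.open(path, 'r', encoding='utf-8') as f:
    content = f.read()

# Look at the raw bytes on disk to confirm they are UTF-8.
with open(path, 'rb') as f:
    raw = f.read()

print(type(content))
print(repr(raw))
```

Passing Unicode to write() avoids the implicit default-encoding decode that happens when a str is passed.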
By contrast, when a file is opened with the plain open() and Unicode is written to it, Python 2 encodes the data automatically. Since no encoding was specified, it uses the default encoding, so the step is effectively s.encode('ascii'), and that is bound to fail for Chinese text.
So when you write, specify the encoding yourself, that is, encode the Unicode data explicitly before passing it to write().
This prevents the error.
Reading is the reverse: what you end up with is Unicode, which is equivalent to calling .decode('utf-8') on the raw file contents.
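The explicit encode-on-write and decode-on-read pattern can be sketched with the plain open() as follows. Opening in binary mode is a choice made here so the example behaves the same on Python 2 and Python 3; the file path is a hypothetical scratch location.

```python
# -*- coding: utf-8 -*-
import os
import tempfile

text = u'中国'
path = os.path.join(tempfile.mkdtemp(), 'plain.txt')  # hypothetical scratch file

# Writing: encode explicitly instead of letting the default codec do it.
with open(path, 'wb') as f:
    f.write(text.encode('utf-8'))

# Reading: the file yields bytes; decode them to get Unicode back.
with open(path, 'rb') as f:
    data = f.read().decode('utf-8')

print(data == text)   # True
```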
The above are my own notes; there may be mistakes.
Python 2 Encoding Problems