When you write a Python program, you often put a line like this at the top of the file:
# coding:utf-8
Or like this:
# -*- coding:utf-8 -*-
and so on.
This line declares that the source file is encoded in UTF-8.
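To see how such a header is read, here is a minimal sketch using the standard library's tokenize.detect_encoding. It uses Python 3, while the snippets in this post are Python 2; the helper name declared_encoding and the two sample headers are made up for illustration.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding that Python detects for this source text."""
    # detect_encoding expects a readline callable that yields bytes.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Hypothetical file contents, used only for this demonstration.
utf8_header = b"# coding:utf-8\nprint('hi')\n"
gbk_header = b"# -*- coding: gbk -*-\nprint('hi')\n"

print(declared_encoding(utf8_header))
print(declared_encoding(gbk_header))
```

Note that the declaration is matched against the lowercase word "coding", so the header has to be written that way.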
Text stored on a computer is encoded in various ways, such as Unicode, UTF-8, GBK, etc.
On Windows with a Chinese locale, text files are typically saved in GBK by default.
A file saved in one encoding must be parsed with the same encoding; otherwise the text may appear garbled.
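The garbling effect is easy to reproduce. The sketch below uses Python 3 (where bytes holds encoded data and str holds Unicode text) and assumes the text is the Chinese greeting "你好"; the variable names are illustrative only.

```python
text = "你好"

# Bytes as they would be written to disk by a UTF-8 editor.
saved = text.encode("utf-8")

# Reading the same bytes back with the wrong codec gives garbled text.
# errors="replace" keeps the demo from raising on undecodable bytes.
garbled = saved.decode("gbk", errors="replace")

# Reading them back with the codec used for saving restores the text.
restored = saved.decode("utf-8")

print(repr(garbled))
print(restored)
```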
Today I want to record when to use decode and when to use encode when handling strings in a Python program. The examples are Python 2 code.
Typically, decode converts from a non-Unicode encoding to Unicode, while encode goes the other way, converting from Unicode to a non-Unicode encoding.
# coding:utf-8
L = ['你好']
print L
Output:
['\xe4\xbd\xa0\xe5\xa5\xbd']
The string "你好" ("Hello") is now UTF-8 encoded, and each Chinese character takes 3 bytes.
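The 3-bytes-per-character claim can be checked directly. This is a Python 3 sketch, assuming the string in question is the Chinese greeting "你好"; in Python 3 the encoded form is a bytes object rather than a Python 2 str.

```python
text = "你好"

# Encode the whole string and each character separately as UTF-8.
utf8_bytes = text.encode("utf-8")
bytes_per_char = [len(ch.encode("utf-8")) for ch in text]

print(utf8_bytes)
print(len(utf8_bytes), bytes_per_char)
```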
Use decode to convert the UTF-8 encoded "你好" ("Hello") to Unicode:
# coding:utf-8
L = ['你好']
print [L[0].decode('utf-8')]
Output:
[u'\u4f60\u597d']
You can see that the string was successfully converted to Unicode, and each Chinese character is now a single code point written as a \uXXXX escape (2 bytes).
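The same decoding step in Python 3 looks like this. It is only a sketch: the byte literal is the UTF-8 form of "你好" (an assumed sample string), and decode turns bytes into a str, which plays the role of the Python 2 unicode type.

```python
# UTF-8 bytes for the two characters of the greeting.
raw = b"\xe4\xbd\xa0\xe5\xa5\xbd"

# decode: encoded bytes -> Unicode text.
text = raw.decode("utf-8")

# Each character is one Unicode code point.
code_points = [hex(ord(ch)) for ch in text]

print(text, len(text))
print(code_points)
```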
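So how do I convert the UTF-8 encoded "你好" ("Hello") to GBK?
Try this:
# coding:utf-8
L = ['你好']
print [L[0].encode('gbk')]  # error example
An error occurs: Python 2 first tries to decode the byte string implicitly with the ASCII codec before encoding it, and that raises a UnicodeDecodeError.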
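In Python 2, calling encode on a byte string triggers an implicit ASCII decode first, and that hidden step is what fails. The Python 3 sketch below makes the step explicit; it is an emulation of the cause, not the original Python 2 code, and the variable names are illustrative. In Python 3 itself the mistake cannot be written this way, because bytes objects have no encode method.

```python
raw = "你好".encode("utf-8")

# Emulate the implicit step Python 2 performs before str.encode().
try:
    raw.decode("ascii")
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__
    print(exc)

# Python 3 bytes objects do not offer encode() at all.
has_encode = hasattr(raw, "encode")
print(error_name, has_encode)
```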
The correct way is to first convert UTF-8 to Unicode with decode, then convert Unicode to the desired encoding, GBK, with encode:
# coding:utf-8
L = ['你好']
print [L[0].decode('utf-8').encode('gbk')]  # UTF-8 -> Unicode -> GBK
Output:
['\xc4\xe3\xba\xc3']
Successfully converted to GBK, where each Chinese character takes 2 bytes.
Summary:
(non-Unicode string).decode('its encoding') converts to Unicode
(Unicode string).encode('target encoding') converts to non-Unicode
To convert from one non-Unicode encoding to another, use Unicode as a springboard: decode first, then encode
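The summary can be wrapped up in a small helper. This is a minimal Python 3 sketch; the function name transcode and its parameters are hypothetical, and the sample string "你好" is assumed for the demonstration.

```python
def transcode(data, source_encoding, target_encoding):
    """Convert bytes from one encoding to another via Unicode."""
    text = data.decode(source_encoding)   # non-Unicode -> Unicode
    return text.encode(target_encoding)   # Unicode -> non-Unicode


utf8_bytes = "你好".encode("utf-8")
gbk_bytes = transcode(utf8_bytes, "utf-8", "gbk")

print(gbk_bytes, len(gbk_bytes))
print(transcode(gbk_bytes, "gbk", "utf-8") == utf8_bytes)
```

Keeping the decode and encode steps separate also makes it clear which side fails if the source bytes are not actually in the encoding you assumed.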
The end of this section ...
Explore the use of encode and decode (Python)