When you open a file containing Chinese text, you may not know which encoding it actually uses. The file header may declare utf8 while the bytes on disk are in a different encoding. In that situation you can use vim together with Python to check the encoding, as described below.
Suppose you open a Chinese file and are unsure of its encoding. A Python source file may declare utf8 in its header while the file is actually saved as gbk, and that inconsistency can cause an error when the program runs. One way to investigate is to look at the raw bytes, but that raises the question of which encoding a given byte sequence of Chinese characters corresponds to.
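The header-versus-bytes mismatch described above can be checked from Python itself. The following is only a minimal sketch: the helper name `declared_encoding_matches` and the sample header and body are made up for illustration. It relies on the standard library's `tokenize.detect_encoding` to read the declared encoding, then tries to decode the whole file with it.

```python
import io
import tokenize

def declared_encoding_matches(source_bytes):
    """Return (declared, ok): the encoding named in the file header and
    whether the full byte content actually decodes with that encoding."""
    declared, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    try:
        source_bytes.decode(declared)
    except UnicodeDecodeError:
        return declared, False
    return declared, True

# Hypothetical sample source: the header claims utf-8 in both cases.
header = b"# -*- coding: utf-8 -*-\n"
body = "print('你好')\n"

gbk_result = declared_encoding_matches(header + body.encode("gbk"))
utf8_result = declared_encoding_matches(header + body.encode("utf-8"))
print(gbk_result)   # header says utf-8, but the body is gbk
print(utf8_result)  # header and body agree
```

A successful decode does not prove the declaration is right, since some byte sequences are valid in more than one encoding, but a failed decode does prove it is wrong.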
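Add the following two lines to vim's vimrc:
- set fenc=utf-8
- set fileencodings=ucs-bom,utf-8,cp936,big5,euc-jp,euc-kr,latin1
With these settings, new files are saved as UTF-8 by default, and vim tries the encodings listed in fileencodings in order when it opens a file. Keep ucs-bom first and latin1 last: latin1 accepts any byte sequence, so nothing listed after it is ever tried.
- set enc=cp936
This sets the encoding gvim uses internally and for display. cp936 is typical on Windows and utf8 on Linux. It is recommended that you leave this option unset.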
If you are not sure whether a newly opened file is utf8 or gbk, open it in vim, locate some Chinese characters, and then run
- :%!xxd
This replaces the buffer with a hex dump. If the text contains "你好", its hexadecimal representation appears in the hex columns at the corresponding position, to the left of the text column. Then start Python 3.0 and, at the interactive prompt, encode the same characters "你好" to see their bytes in each candidate encoding:
- >>> a = '你好'
- >>> b = a.encode('utf8')
- >>> b
- b'\xe4\xbd\xa0\xe5\xa5\xbd'
- >>> c = a.encode('gbk')
- >>> c
- b'\xc4\xe3\xba\xc3'
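The interactive session above can be generalized into a small script that encodes a known piece of text (here the greeting 你好) under several encodings and then checks which of those byte sequences occur in a file's raw bytes. This is a sketch under stated assumptions: the candidate list and the helper names `encode_table` and `matching_encodings` are illustrative, and the `raw` bytes stand in for the contents of a real file.

```python
# Encodings to compare; this particular list is an assumption for illustration.
CANDIDATES = ("utf-8", "gbk", "gb2312", "cp936", "gb18030")

def encode_table(text, encodings=CANDIDATES):
    """Map each encoding name to the bytes of `text` in that encoding.
    Encodings that cannot represent the text are skipped."""
    table = {}
    for name in encodings:
        try:
            table[name] = text.encode(name)
        except UnicodeEncodeError:
            continue
    return table

def matching_encodings(raw, sample, encodings=CANDIDATES):
    """Return the encodings whose bytes for `sample` occur somewhere in `raw`."""
    table = encode_table(sample, encodings)
    return [name for name, encoded in table.items() if encoded in raw]

table = encode_table("你好")
for name, encoded in table.items():
    print(f"{name:8} {encoded.hex()}")

# Stand-in for bytes read from a file with open(path, "rb").
raw = "# 你好\n".encode("gbk")
matches = matching_encodings(raw, "你好")
print(matches)
```

Several GB-family encodings match at once because they share the same byte values for these common characters, so the result narrows the choice to a family rather than a single name.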
As you can see, the UTF-8 bytes for the Chinese "你好" are
- 0xe4bda0 0xe5a5bd
For gbk, gb2312, cp936, and gb18030, the bytes are 0xc4e3 0xbac3. Compare these values with the bytes shown in vim's hex dump to determine the file's encoding. Once you know the encoding, run
- :%!xxd -r
to convert the hex dump back into plain text, and then save the file. For existing text files, you can use iconv on Linux to transcode them. This covers how to check a file's encoding with vim and Python 3.0.
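Where iconv is not available, the same transcoding step can be approximated in Python. The sketch below is not a replacement for iconv. The function name `transcode` and the temporary file names are hypothetical, and it assumes the source encoding has already been identified (gbk in this example).

```python
import os
import tempfile

def transcode(src_path, dst_path, from_enc, to_enc="utf-8"):
    """Read src_path as from_enc and write the same text to dst_path in to_enc."""
    # newline="" leaves the original line endings untouched.
    with open(src_path, "r", encoding=from_enc, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=to_enc, newline="") as dst:
        dst.write(text)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample_gbk.txt")   # hypothetical file names
    dst = os.path.join(tmp, "sample_utf8.txt")
    with open(src, "wb") as f:
        f.write("你好\n".encode("gbk"))
    transcode(src, dst, "gbk")
    with open(dst, "rb") as f:
        converted = f.read()

print(converted)
```

If `from_enc` is wrong, the read raises `UnicodeDecodeError` instead of silently writing corrupted text, which is a useful safety check before overwriting any file.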