1. Set the default encoding
If Chinese characters appear anywhere in a Python source file, Python 2.7 reports an error when it compiles the file. Adding an encoding declaration as the first line of the file, explicitly specifying UTF-8, resolves the most common cases of this error. Other encoding problems encountered while programming still need to be analyzed case by case.
# encoding: utf-8
# or, equivalently: # -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding('utf8')  # set the default encoding to 'utf-8'
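The difference between text and its encoded bytes, which the default-encoding setting above papers over, can be sketched by naming the encoding explicitly at each conversion. This is only an illustrative sketch, assuming the bytes are UTF-8; the variable names are hypothetical, and the `u"..."` literals are used so the snippet behaves the same on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
text = u"语文"                    # a text (unicode) string of two characters
raw = text.encode("utf-8")        # text -> bytes, with the encoding named explicitly
restored = raw.decode("utf-8")    # bytes -> text, with the encoding named explicitly

print(len(text))                  # number of characters
print(len(raw))                   # number of UTF-8 bytes
print(restored == text)           # the round trip preserves the text
```

Each of these Chinese characters takes three bytes in UTF-8, which is why the character count and the byte count differ.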
2. File reading and writing
Chinese text encountered while reading and writing files usually does not raise an error, but the final output is displayed as garbled characters, which makes subsequent processing inconvenient.
2.1 Read File
When reading a file whose path or file name contains Chinese, use the unicode function to decode the path from 'utf-8' into a unicode string, and then read the file as usual. Taking the read_csv function of pandas, which I use regularly, as an example, the following code successfully reads the CSV file named "POI总表" ("POI summary table") and stores it in poi_list, a DataFrame.
import pandas as pd
inpath = 'C:\\POI总表.csv'
path = unicode(inpath, 'utf-8')  # the key line: decode the path before using it
poi_list = pd.read_csv(path)
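The same path handling can be sketched without pandas. This is a minimal sketch, assuming the file is UTF-8 encoded; it uses `io.open`, which accepts an `encoding` argument on both Python 2.7 and Python 3. The helper `read_text`, the temporary directory, and the sample rows are hypothetical and exist only to make the example self-contained.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file, decoding its bytes with the given encoding."""
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


# Hypothetical demo file with a Chinese name, created in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, u"POI总表.csv")  # the path is a unicode string

with io.open(demo_path, "w", encoding="utf-8") as f:
    f.write(u"name,category\n公园,景点\n")

content = read_text(demo_path)
rows = [line.split(u",") for line in content.splitlines()]
print(len(rows))  # number of lines read back, header included
```

Passing the path as a unicode string plays the same role as the `unicode(inpath, 'utf-8')` call above.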
2.2 Writing Files
Chinese file name is garbled
When you save the results of a program run to a text file whose name contains Chinese, the file name comes out garbled if it is not processed first. Decoding the name with the unicode function solves this: unicode('中文.csv', 'utf-8')
Chinese file content is garbled when opened in Excel
If you export results containing Chinese to a CSV file, the contents appear garbled when the file is opened with Excel by default, while a text editor displays them correctly. This is because Excel's default encoding is 'GBK', whereas the text editor defaults to 'utf-8'. To fix this, use the codecs package and add the statement f.write(codecs.BOM_UTF8) immediately after the file is created.
name = '语文'
f = open(name + '.csv', 'w')
f.write('123,语文')
f.close()
# with the encoding fix
import codecs
f = open(unicode(name + '.csv', 'utf-8'), 'w')  # file name is no longer garbled
f.write(codecs.BOM_UTF8)  # the key statement: content is no longer garbled in Excel
f.write('123,语文')
f.close()
Output Result:
# File name: 璇枃.csv
# Opened in Excel: 123 璇枃
# Opened in a text editor: 123,语文
# After the encoding fix
# File name: 语文.csv
# Opened in Excel: 123 语文
# Opened in a text editor: 123,语文
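The BOM step can also be sketched with the standard 'utf-8-sig' codec, which writes the same bytes as `codecs.BOM_UTF8` at the start of the file and then encodes the rest as UTF-8. This is a swapped-in variant of the approach above, not the original code: `io.open` and 'utf-8-sig' are used here so the snippet runs on both Python 2.7 and Python 3, and the temporary output location is hypothetical.

```python
# -*- coding: utf-8 -*-
import codecs
import io
import os
import tempfile

# Hypothetical output location: a Chinese file name inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), u"语文.csv")

# 'utf-8-sig' emits the UTF-8 BOM once, then encodes the text as UTF-8.
with io.open(out_path, "w", encoding="utf-8-sig") as f:
    f.write(u"123,语文")

# Read the raw bytes back to check what was actually written.
with io.open(out_path, "rb") as f:
    raw = f.read()

starts_with_bom = raw.startswith(codecs.BOM_UTF8)
text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
print(starts_with_bom)
```

Letting the codec write the BOM avoids mixing a manual byte write with later text writes on the same file object.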
Dealing with garbled Chinese characters when reading and writing files in Python 2.7