Dealing with Chinese garbled characters in Python2.7 read-write files

Source: Internet
Author: User

1. Set the default encoding

In the Python code anywhere in the Chinese language, compile will be error, when the code can be added to the first line of the corresponding instructions, clear UTF-8 encoding format, you can solve the general situation of the Chinese error. Of course, the specific problems encountered in programming need to be analyzed.

#encoding:utf-8或者# -*- coding: utf-8 -*- import sys  reload(sys)  sys.setdefaultencoding(’utf8’)  # 设置默认编码格式为‘utf-8‘
2. file reading and writing

File reading and writing encountered in Chinese, usually do not error, but the final running results show garbled, to the subsequent processing inconvenience.

2.1 Read File

When reading a file, if the file path, the filename is in Chinese, you need to use the Unicode function to encode it as ' utf-8 ' format, and then do the normal file read. Taking the read_csv function of my usual pandas as an example, the following code can be used to successfully read the CSV file named "Poi Total Table", which is saved in poi_list of the Dataframe data type.

import pandas as pdinpath = ‘C:\\POI总表.csv‘**path = unicode(inpath, ‘utf-8‘)**poi_list = pd.read_csv(path)
2.2 Writing Files
    1. File name is Chinese, file name is garbled
      When you want to save the results of a program run to a text file, the name of the text file if there is Chinese, do not do processing file names will appear garbled. Encoding is solvable using Unicode functions. Unicode (' Chinese. csv ', ' utf-8 ')

    2. File content has Chinese, Excel open content garbled
      If you export the results that contain Chinese to a CSV file, the file contents will be garbled when you open the file by default using Excel, while opening with a text editor is not garbled. This is because Excel's default encoding is ' GBK ' and the text editor defaults to ' Utf-8 '. Use the codecs package to add a statement f.write (codecs) after the file is created. BOM_UTF8) can be solved

name=‘语文‘f = open(name+‘.csv‘,‘w‘)f.write(‘123,语文‘)f.close()#修改编码import codecsf = open(**unicode(name+‘.csv‘,‘utf-8‘)**,‘w‘) # 文件名不乱码**f.write(codecs.BOM_UTF8) # excel打开内容不乱码的核心语句**f.write(‘123,语文‘)f.close()

Output Result:

#文件名:璇枃.csv#Excel打开   123  璇枃#文本编辑器打开 123,语文#改编码后#文件名:语文.csv#Excel打开 123 语文#文本编辑器打开 123,语文

Dealing with Chinese garbled characters in Python2.7 read-write files

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.