Garbled Chinese Text in Python: Problem Analysis and Solution

Source: Internet
Author: User

This article analyzes why Chinese text can come out garbled when Python reads a file, and describes a fix. It works through a small test file saved in several encodings, with the code used to read it in each case.

Create a file named test.txt in ANSI format (on Simplified Chinese Windows, ANSI means GBK). The file content is abc中文. Read it in Python:

 
 
    # coding=gbk
    print open("Test.txt").read()

Result: abc中文

Now change the file's encoding to UTF-8. Printed as raw bytes, the content would show up garbled on a GBK console, so it has to be decoded first:

 
 
    # coding=gbk
    import codecs
    print open("Test.txt").read().decode("utf-8")

Result: abc中文

The test.txt above was edited with EditPlus. When the file is instead saved as UTF-8 with Windows Notepad, an error is reported:

 
 
    Traceback (most recent call last):
      File "ChineseTest.py", line 3, in <module>
        print open("Test.txt").read().decode("utf-8")
    UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
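
Note that the traceback reports an encode error, not a decode error. The sketch below reproduces this without a file, so each step can be inspected separately. It is written for Python 3 (an assumption; the listings in this article are Python 2), and the sample bytes and variable names are illustrative only. Decoding succeeds and yields U+FEFF as the first character. The failure comes from the later conversion to GBK, which `print` performs implicitly on a GBK console in the Python 2 script.

```python
import codecs

# Bytes as they might be stored for "abc中文" in UTF-8 with a leading
# byte order mark (illustrative data, not read from a real file).
raw = codecs.BOM_UTF8 + "abc中文".encode("utf-8")

# Decoding as UTF-8 works; the three BOM bytes become the character U+FEFF.
text = raw.decode("utf-8")
first_char = text[0]

# Encoding to GBK is the step that fails: GBK has no mapping for U+FEFF.
try:
    text.encode("gbk")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
```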

It turns out that some software, such as Notepad, inserts three invisible bytes (0xEF 0xBB 0xBF, the UTF-8 byte order mark, or BOM) at the beginning of a file when saving it as UTF-8. These bytes therefore need to be removed when reading the file. The codecs module in Python defines a constant for them, codecs.BOM_UTF8:

 
 
    # coding=gbk
    import codecs
    data = open("Test.txt").read()
    # Strip the UTF-8 BOM if the file starts with one
    if data[:3] == codecs.BOM_UTF8:
        data = data[3:]
    print data.decode("utf-8")

Result: abc中文
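
For comparison, here is a minimal sketch of the same BOM handling in Python 3, which is an assumption on our part since the code in this article targets Python 2. It relies on the standard "utf-8-sig" codec, which skips a leading BOM if one is present and otherwise behaves like plain UTF-8. The helper name read_utf8_text and the temporary file names are hypothetical and used only for the demonstration.

```python
import codecs
import os
import tempfile


def read_utf8_text(path):
    """Read a UTF-8 text file, dropping a leading BOM if there is one."""
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


# Demonstration with two temporary files holding the same text,
# one saved with a BOM (as Notepad does) and one without.
with tempfile.TemporaryDirectory() as tmp:
    with_bom = os.path.join(tmp, "with_bom.txt")
    without_bom = os.path.join(tmp, "without_bom.txt")
    payload = "abc中文".encode("utf-8")

    with open(with_bom, "wb") as f:
        f.write(codecs.BOM_UTF8 + payload)
    with open(without_bom, "wb") as f:
        f.write(payload)

    text_with_bom = read_utf8_text(with_bom)
    text_without_bom = read_utf8_text(without_bom)
```

Both reads should return the same string, so the caller does not need to know which editor produced the file.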
