Fixing encoding errors when writing mixed Chinese and English text in Python

Source: Internet
Author: User

Two days ago I wrote a small script in Python. One of the requirements was to read data from a text file A, process it, and write the result to a new text file B. File A, however, contains both English and Chinese characters.
Writing code for this requirement is not complicated. You might easily write something like this:

def write_a_line(line, fp):
    fp.write(line)

However, once this program encounters a string containing Chinese characters, it may fail with the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

This kind of error can come as a surprise to programmers who usually work in C# or Java. According to the error message, when writing to the file Python tries to encode the line variable, and it uses the ASCII codec to do so.
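
The failure can be reproduced without any file involved. The following is a minimal sketch, not the original script: the helper name try_encode and the sample string are made up for illustration, and the code is written so that it should run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; try_encode and SAMPLE are hypothetical names.

SAMPLE = u"\u4e2d\u6587"  # two Chinese characters


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # With the ASCII codec, which only covers code points 0-127,
        # any Chinese character ends up here.
        return None


ascii_result = try_encode(SAMPLE, "ascii")   # the ASCII codec cannot encode it
gb_result = try_encode(SAMPLE, "gb2312")     # 2 bytes per character here
utf8_result = try_encode(SAMPLE, "utf-8")    # 3 bytes per character here
```

The same text encodes without trouble under GB2312 or UTF-8, which suggests the problem lies in the codec being used rather than in the data.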

We all know that Python supports Unicode; in Python 2 the Unicode string type is named "unicode". To make a string literal a unicode string, add a lowercase "u" before its opening quotation mark. According to the Python documentation, the file's write method accepts a str object. Unlike C#, however, Python does not require a variable's type to be declared, so the first step is to find out what line actually is. To do that, I added one line to the code above:

def write_a_line(line, fp):
    print line.__class__.__name__
    fp.write(line)
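
As a side note, the same inspection can be done with isinstance instead of printing the class name. This is only a sketch under assumptions: TEXT_TYPE, BYTES_TYPE and describe are hypothetical names, and the two type lookups are a trick to keep the code runnable on both Python 2 and Python 3.

```python
# Illustrative sketch; all names here are hypothetical.
TEXT_TYPE = type(u"")   # unicode on Python 2, str on Python 3
BYTES_TYPE = type(b"")  # str on Python 2, bytes on Python 3


def describe(value):
    """Classify a value as decoded text or as raw bytes."""
    if isinstance(value, TEXT_TYPE):
        return "text"
    if isinstance(value, BYTES_TYPE):
        return "bytes"
    return type(value).__name__


kinds = [describe(u"\u4e2d\u6587"), describe(b"abc")]
```

Unlike a comparison against a class name, isinstance also recognizes subclasses of the text type.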

After running the code, I found that when the line contains Chinese characters, the type of the line variable is unicode. I searched online for posts about unicode and file writing and found a hint: once the unicode object has been encoded by calling its encode method, the file can be written normally. The code was changed as follows:

def write_a_line(line, fp):
    # Only unicode objects need encoding; str objects are written as-is.
    if isinstance(line, unicode):
        line = line.encode("GB2312")
    fp.write(line)

Note that we cannot treat every value the same way: line may be either a unicode object or a str object, and we only encode the unicode object. With that, the problem is finally solved.
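
An alternative that the original script does not use is to let the file object do the encoding. The sketch below relies on io.open, which accepts an encoding argument on both Python 2 and Python 3. The names write_lines and read_text, the temporary file, and the assumption that any byte strings are already in the target encoding are all illustrative choices, not part of the original solution.

```python
# Illustrative sketch; write_lines and read_text are hypothetical helpers.
import io
import os
import tempfile


def write_lines(path, lines, encoding="utf-8"):
    """Write text lines to path, encoding them as they are written."""
    with io.open(path, "w", encoding=encoding) as fp:
        for line in lines:
            if isinstance(line, bytes):
                # Assumption: byte strings are already in the target encoding.
                line = line.decode(encoding)
            fp.write(line)


def read_text(path, encoding="utf-8"):
    """Read the whole file back as decoded text."""
    with io.open(path, "r", encoding=encoding) as fp:
        return fp.read()


# Round trip with mixed Chinese and English content.
demo_path = os.path.join(tempfile.mkdtemp(), "b.txt")
write_lines(demo_path, [u"hello \u4e2d\u6587\n", b"plain ascii\n"], encoding="gb2312")
round_trip = read_text(demo_path, encoding="gb2312")
```

With this approach the caller works with text throughout and the encoding is named in one place, at the point where the file is opened.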
