Garbled text is a common problem when processing text in Python, and fixing it usually means adjusting the code that reads or writes the text. This article describes one such case, caused by the UTF-8 byte order mark, and its solution.
When processing UTF-8 text, a file that begins with a BOM (byte order mark) can cause an error when the program runs: "UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence". The error is raised when the decoded text, which still starts with the BOM character, is encoded to GBK for output.
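The failure can be reproduced without a file. The following is a minimal sketch, assuming Python 3; the sample content "hello" and the variable names are made up for illustration.

```python
import codecs

# Bytes of a UTF-8 file saved with a BOM: EF BB BF followed by the content.
raw = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Plain "utf-8" decoding keeps the BOM as the character U+FEFF.
text = raw.decode("utf-8")
print(text[0] == "\ufeff")

# Encoding that character to GBK, as a GBK console would require, fails.
try:
    text.encode("gbk")
    error = None
except UnicodeEncodeError as exc:
    error = exc
print(error)
```

The exception is raised at position 0 because the BOM is the very first character of the decoded text.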
The cause is that some software, such as Notepad, inserts three invisible bytes, 0xEF 0xBB 0xBF, at the beginning of a file when saving it as UTF-8. We therefore need to remove these bytes when reading the file. The codecs module in Python defines this byte sequence as the constant codecs.BOM_UTF8:
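    import codecs

    # Read raw bytes so the start of the file can be compared with the BOM.
    with open("Test.txt", "rb") as f:
        data = f.read()
    # Drop the three BOM bytes (0xEF 0xBB 0xBF) if they are present.
    if data[:3] == codecs.BOM_UTF8:
        data = data[3:]
    print(data.decode("utf-8"))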
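As an alternative to stripping the bytes by hand, Python's "utf-8-sig" codec skips a leading BOM while decoding and reads files without one normally. The following is a minimal sketch, assuming Python 3; the helper name read_text_without_bom and the temporary sample file are hypothetical and exist only for illustration.

```python
import codecs
import os
import tempfile


def read_text_without_bom(path):
    """Read a UTF-8 file, dropping a leading BOM if one is present."""
    # "utf-8-sig" decodes UTF-8 and skips a BOM at the start of the data.
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


# Usage: write a sample file that starts with a BOM, then read it back.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + "hello".encode("utf-8"))

text = read_text_without_bom(sample_path)
os.remove(sample_path)
print(repr(text))  # 'hello'
```

This keeps the BOM handling inside the decoding step, so the rest of the program never sees the U+FEFF character.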
The above describes how to modify the code when garbled text caused by a UTF-8 BOM appears in Python.