Python3 Chinese file read/write method, python3 read/write

Source: Internet
Author: User


In Python, strings are represented as Unicode. When converting between encodings, Unicode therefore serves as the intermediate form: first decode the bytes in the source encoding into a Unicode string, then encode that string into the target encoding.

Python 3 removes the separate unicode type. The string type (str) now holds Unicode text and is the basic string type, while encoded data is represented by the bytes type. The two methods are used as before: decode turns bytes into str, and encode turns str into bytes:

            decode                  encode
    bytes ----------> str (Unicode) ----------> bytes

    u = '中国'                 # a str object (the Chinese word for "China")
    b = u.encode('gb2312')     # encode u as gb2312 to obtain a bytes object b
    u1 = b.decode('gb2312')    # decode the gb2312 bytes to obtain a str object u1 equal to u
    u2 = b.decode('utf-8')     # decoding with utf-8 raises UnicodeDecodeError; the original content cannot be restored
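The round trip described above can be sketched as follows. This is only an illustration: the sample text and the variable names are assumptions, not part of the original article. It shows that decoding with the matching codec restores the text, while decoding with the wrong codec fails.

```python
text = '中文'                    # str: Unicode text
raw = text.encode('gb2312')      # bytes: the same text encoded as gb2312

# Decoding with the codec that produced the bytes restores the text.
restored = raw.decode('gb2312')

# Decoding with a different codec fails for these bytes.
try:
    raw.decode('utf-8')
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

print(type(raw).__name__, restored == text, wrong_codec_failed)
```

Catching UnicodeDecodeError like this is one way to detect that the assumed encoding does not match the data.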

File Reading Problems

Create a file in a text editor. Most editors let you choose the encoding when saving. For example, if the file was saved as gb2312, read its content in Python as follows:

    f = open('test.txt', 'rb')     # binary mode, so read() returns bytes
    s = f.read()                   # read the file content; in text mode ('r') Python 3 decodes with a
                                   # system-dependent default encoding, and the read can fail if the
                                   # file uses a different encoding
    f.close()

    # Assume the file was saved in gb2312 encoding
    u = s.decode('gb2312')         # decode the content using the encoding the file was stored in

    # With the Unicode string in hand, we can convert the content into various encodings
    str1 = u.encode('utf-8')       # convert to UTF-8 encoded bytes
    str2 = u.encode('gbk')         # convert to GBK encoded bytes
    str3 = u.encode('utf-16')      # convert to UTF-16 encoded bytes
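In Python 3 the built-in open() also accepts an encoding argument, so the decode step can be left to the file object. The sketch below is a minimal illustration under stated assumptions: the temporary file, its name and the sample text are made up so that the example is self-contained.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the sketch can run on its own.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, 'w', encoding='gb2312') as f:
    f.write('中文内容')

# Text mode with an explicit encoding: read() returns an already decoded str.
with open(path, 'r', encoding='gb2312') as f:
    u = f.read()

# Binary mode: read() returns the bytes exactly as stored on disk.
with open(path, 'rb') as f:
    raw = f.read()

# The decoded text can then be re-encoded into another encoding.
utf8_bytes = u.encode('utf-8')
```

Passing the encoding explicitly avoids depending on the platform default, which is the failure mode described above.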

Reading files with codecs

Python provides the codecs module for reading and writing encoded files. The open() function in this module lets you specify the encoding:

    import codecs

    # The file encoding must be known in advance; here the file is UTF-8 encoded.
    f = codecs.open('text.text', 'r+', encoding='utf-8')
    content = f.read()    # raises an error if the encoding passed to open() does not match the file's actual encoding
    f.write('information you want to write')
    f.close()
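The mismatch case mentioned in the comment can be sketched as follows. The file name, the sample text and the choice of 'ascii' as the wrong encoding are assumptions made for illustration. The with statement is used here so the file is closed even if reading fails.

```python
import codecs
import os
import tempfile

# Hypothetical sample file written as UTF-8 so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with codecs.open(path, 'w', encoding='utf-8') as f:
    f.write('你好')

# Matching encoding: read() returns the original text as str.
with codecs.open(path, 'r', encoding='utf-8') as f:
    content = f.read()

# Mismatched encoding: decoding fails while reading.
try:
    with codecs.open(path, 'r', encoding='ascii') as f:
        f.read()
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True
```

In Python 3 the built-in open() takes the same encoding argument and can be used in the same way.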

That is all for this article on reading and writing Chinese files in Python 3. I hope it serves as a useful reference.
