Reading and Writing Chinese Files in Python 3
In Python, strings are Unicode. Encoding conversions therefore usually use Unicode as the intermediate form: first decode the bytes in the source encoding into a Unicode string, then encode that string into the target encoding.
Python 3 removes the separate unicode type. The string type (str) now holds Unicode characters and is the basic text type, while encoded data is represented by the bytes type. The two methods are used as before: decode turns bytes into str, and encode turns str into bytes:
bytes ---decode---> str (unicode) ---encode---> bytes
u = '中国'  # a str object ("China")
data = u.encode('gb2312')  # encode u as gb2312 to obtain a bytes object
u1 = data.decode('gb2312')  # decode the gb2312 bytes to obtain the str object u1
u2 = data.decode('utf-8')  # wrong codec: the original text cannot be restored; for these bytes it raises UnicodeDecodeError
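The round trip described above can be checked with a short script. This is a minimal sketch: the sample string and the variable names are chosen for illustration, and the `errors='replace'` variant is shown only to make the point that a wrong codec loses the text even when no exception is raised.

```python
text = '中国'  # "China"; sample string chosen for this sketch

gb_bytes = text.encode('gb2312')    # str -> bytes (gb2312)
utf8_bytes = text.encode('utf-8')   # same text, different byte sequence

# Decoding with the codec that produced the bytes restores the text.
restored = gb_bytes.decode('gb2312')

# Decoding with the wrong codec raises UnicodeDecodeError here, because
# these gb2312 bytes are not a valid UTF-8 sequence.
try:
    gb_bytes.decode('utf-8')
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

# errors='replace' suppresses the exception but substitutes U+FFFD for
# undecodable bytes, so the original text is still lost.
lossy = gb_bytes.decode('utf-8', errors='replace')
```

The practical consequence is that bytes should always travel together with knowledge of the encoding that produced them.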
File Reading Problems
Edit a file's content in an editor. When saving, note that the editor lets you choose the encoding; suppose you choose gb2312. The file content can then be read in Python as follows:
f = open('test.txt', 'rb')  # open in binary mode so that read() returns bytes
s = f.read()  # read the raw file content; in text mode ('r') without an explicit encoding, Python decodes with a system-dependent default, and the read can fail if the file uses a different encoding
f.close()
# Assume that the file is saved in gb2312 encoding
u = s.decode('gb2312')  # decode the content using the encoding the file was stored in
# After obtaining the str, we can convert the content into various encodings
str1 = u.encode('utf-8')  # convert to UTF-8 encoded bytes
str1 = u.encode('gbk')  # convert to gbk encoded bytes
str1 = u.encode('utf-16')  # convert to UTF-16 encoded bytes
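In Python 3 the built-in open() also accepts an encoding argument, so the decode step can be left to the file object. The sketch below compares that with the manual bytes-then-decode route. The temporary file and its contents are created here only so the example is self-contained; the path and variable names are illustrative.

```python
import os
import tempfile

# Hypothetical sample file, written in gb2312 so the sketch can run anywhere.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, 'w', encoding='gb2312') as f:
    f.write('中国')

# Text mode with an explicit encoding: read() returns str directly.
with open(path, 'r', encoding='gb2312') as f:
    u = f.read()

# Binary mode: read() returns bytes, which are then decoded by hand.
with open(path, 'rb') as f:
    raw = f.read()
u_from_bytes = raw.decode('gb2312')

# Save a UTF-8 copy: the str is re-encoded on write.
utf8_path = path + '.utf8'
with open(utf8_path, 'w', encoding='utf-8') as f:
    f.write(u)
```

Passing the encoding explicitly avoids relying on the platform default, which is what makes the plain `open(name, 'r')` call behave differently from one system to another.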
Reading Files with codecs
Python provides the codecs module for reading files. The open() function in this module accepts an encoding argument:
import codecs
f = codecs.open('text.text', 'r+', encoding='utf-8')  # the file's encoding must be known in advance; here the file is UTF-8
content = f.read()  # if the encoding passed to open() does not match the file's actual encoding, an error is raised here
f.write('information you want to write')
f.close()
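The behaviour noted in the comments above, where a mismatched encoding fails at read() and not at open(), can be reproduced with a small sketch. The temporary file, its contents, and the use of 'ascii' as the deliberately wrong encoding are assumptions made for illustration.

```python
import codecs
import os
import tempfile

# Hypothetical sample file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')

# Write UTF-8 text; the with-block closes the file even if an error occurs.
with codecs.open(path, 'w', encoding='utf-8') as f:
    f.write('中文内容')

# Reading with the matching encoding returns the original str.
with codecs.open(path, 'r', encoding='utf-8') as f:
    content = f.read()

# Reading with a mismatched encoding: open() succeeds, read() fails.
try:
    with codecs.open(path, 'r', encoding='ascii') as f:
        f.read()
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True
```

Wrapping the file in a with-block replaces the explicit f.close() call and keeps the file from being left open when decoding fails.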