Reading and Writing txt Files with Different Encodings in Python

Source: Internet
Author: User

This article shows how to read and write txt files in different encodings (ANSI/GBK, UTF-8, and UTF-8 with a BOM) in Python. The examples use Python 2 syntax, such as the print statement and the unicode type.

The code is as follows:

import os
import codecs

filenames = os.listdir(os.getcwd())
out = open("name.txt", "w")
for filename in filenames:
    # The names are GB2312 byte strings here; decode them, then re-encode as UTF-8.
    out.write(filename.decode("gb2312").encode("utf-8"))
out.close()

This writes the names of the files in the current directory to name.txt, encoded as UTF-8.

To save the file in ANSI encoding instead, write the file name bytes unchanged:

The code is as follows:

out.write(filename)
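For readers on Python 3, the directory listing above could be sketched as follows. This is only an illustration under stated assumptions: the article's own code targets Python 2, and the helper name write_listing, the temporary directories, and the one-name-per-line layout are mine, not the original author's.

```python
import os
import tempfile


def write_listing(directory, out_path):
    """Write every entry name in `directory` to `out_path` as UTF-8, one per line."""
    names = sorted(os.listdir(directory))
    # In Python 3 the names are already text; open() does the encoding for us.
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        for name in names:
            out.write(name + "\n")
    return names


# Demo in temporary directories so the real working directory is not touched.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("abc.txt", "中文.txt"):
        open(os.path.join(src, name), "w").close()
    out_path = os.path.join(dst, "name.txt")
    written = write_listing(src, out_path)
    with open(out_path, "rb") as f:
        raw = f.read()

print(written)
print(raw.decode("utf-8"))
```

Passing encoding="gbk" instead of "utf-8" would correspond to the ANSI variant on a Chinese Windows system.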

Open the file and write

This uses the codecs module. I am not yet familiar with this module, so I am recording the method here in order to study its functions and usage later.

The code is as follows:

import codecs

# codecs.open returns a file object that encodes unicode strings on write.
f = codecs.open("lol.txt", "w", "utf-8")
f.write(u"I")
f.close()
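In Python 3 the built-in open accepts an encoding argument, which plays the role that codecs.open plays above. The sketch below is an assumption-laden illustration rather than the author's method: the sample text "中文" and the temporary directory are mine.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lol.txt")

    # Text goes in; UTF-8 bytes end up on disk.
    with open(path, "w", encoding="utf-8") as f:
        f.write("中文")

    # Read the file back in binary mode to inspect the stored bytes.
    with open(path, "rb") as f:
        raw_bytes = f.read()

print(raw_bytes)
```

Each of the two Chinese characters is stored as three bytes in UTF-8, so the file is six bytes long.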

Read ANSI-encoded text files and UTF-8-encoded files

Read ANSI encoded files

Create a file named test.txt in ANSI format with the following content:

The Code is as follows:

Abc Chinese

Read data using python

The Code is as follows:

# Coding = gbk

Print open ("Test.txt"). read ()

Result: abc (Chinese)

Read UTF-8 encoded files (without BOM)

The file format into UTF-8:

The Code is as follows:

Result: abc Juan

Obviously, decoding is required here:

The Code is as follows:

#-*-Coding: UTF-8 -*-

Import codecs

Print open ("Test.txt"). read (). decode ("UTF-8 ")

Result: abc (Chinese)

Read UTF-8 encoded files (with BOM)

Some software inserts three invisible bytes (0xEF 0xBB 0xBF, the BOM) at the beginning of the file by default when saving it as UTF-8. Other software lets you choose whether to insert a BOM. To strip these bytes when reading, you can use the constant BOM_UTF8 defined in Python's codecs module:

The Code is as follows:

#-*-Coding: UTF-8 -*-

Import codecs

Data = open ("Test.txt"). read ()

If data [: 3] = codecs. BOM_UTF8:

Data = data [3:]

Print data. decode ("UTF-8 ")

Result: abc (Chinese)

Let's look at the example below:

The code is as follows:

# -*- coding: utf-8 -*-
data = open("name_utf8.txt").read()
u = data.decode("utf-8")
print u[1:]

This opens a UTF-8 file, reads its contents as a UTF-8 byte string, and decodes it into a unicode object. The three BOM bytes at the start decode to a single unicode character (U+FEFF), which cannot be printed. For normal display, use u[1:] to skip that first character.
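The claim that three bytes become one character is easy to check. The snippet below is a small Python 3 sketch with made-up sample data ("abc" after the BOM), not part of the original article.

```python
import codecs

raw = codecs.BOM_UTF8 + "abc".encode("utf-8")  # 3 BOM bytes + 3 ASCII bytes
decoded = raw.decode("utf-8")                   # plain utf-8 keeps the BOM as a character

bom_length_in_bytes = len(codecs.BOM_UTF8)
first_char = decoded[0]
rest = decoded[1:]

print(bom_length_in_bytes, repr(first_char), rest)
```

Slicing with [1:] is only safe when the BOM is known to be present; otherwise it would drop a real character.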

Note: before outputting a unicode string that contains Chinese characters, call its encode method to convert it to the encoding you want to output.

Setting Python's default encoding

The code is as follows:

import sys
reload(sys)
sys.setdefaultencoding("utf-8")
print sys.getdefaultencoding()

Today I encountered a python encoding problem. The error message is as follows:

The error is as follows:

Traceback (most recent call last):
  File "ntpath.pyc", line 108, in join
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa1 in position 36: ordinal not in range(128)
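This kind of error is easy to reproduce in isolation. The sketch below uses Python 3 and a single made-up byte, so the position in the message differs from the traceback above; it only illustrates why the ascii codec rejects 0xa1.

```python
message = None
bad_byte = None

try:
    # 0xa1 is 161 in decimal; ASCII only covers the values 0 to 127.
    b"\xa1".decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
    bad_byte = exc.object[exc.start]  # the offending byte, as an integer

print(message)
print(bad_byte)
```

The exception carries the input (exc.object) and the failing offset (exc.start), which helps locate the bad byte in longer data.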

Obviously, the current encoding is ascii, and the byte 0xa1 (decimal 161) cannot be decoded because it lies outside the ASCII range of 0 to 127. Checking in the Python console with sys.getdefaultencoding() confirms that the default encoding is ascii.

The sys.setdefaultencoding() function changes the default encoding. However, Python runs site.py at startup, and after that file sets the default encoding it deletes the setdefaultencoding method from sys, so the method cannot be called directly. Even when sys has already been imported, you can reload the sys module to restore the method and then call sys.setdefaultencoding('utf8'):

The code is as follows:

import sys
reload(sys)
sys.setdefaultencoding("utf-8")
print sys.getdefaultencoding()

This works. According to limodou, site.py is a script that is loaded by default when the Python interpreter starts. If Python is started with python -S, site.py is not loaded automatically.

The above is pretty cool.

========================================

How can I set the default encoding to UTF-8 permanently? There are two methods:

========================================

Method 1 (not recommended): edit site.py and modify the setencoding() function so that it sets the default encoding to UTF-8.

Method 2 (recommended): add a sitecustomize.py file. The recommended location is the site-packages directory.

sitecustomize.py is imported and executed by site.py. Because sys.setdefaultencoding() is deleted only at the end of site.py, it is still available while sitecustomize.py runs, so you can call it there.

The code is as follows:

import sys
sys.setdefaultencoding('utf-8')

Since sitecustomize.py is loaded automatically, you can also use it to set other things besides the encoding.
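Everything above applies to Python 2. As far as I know, in Python 3 the default string encoding is already UTF-8 and sys.setdefaultencoding no longer exists, so this workaround should not be needed there. The check below is a small Python 3 sketch to confirm that on your own interpreter; it is my addition, not part of the original article.

```python
import sys

# What the interpreter uses for implicit str/bytes conversions.
default_encoding = sys.getdefaultencoding()

# Whether the Python 2 era setter is available at all.
has_setter = hasattr(sys, "setdefaultencoding")

print(default_encoding, has_setter)
```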

String Encoding

The code is as follows:

s1 = '中文'

A string literal written this way is a byte string in the encoding of the source file. To obtain a unicode string instead, there are three methods:

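The code is as follows:

u1 = u'中文'                  # 1. unicode literal
u2 = unicode('中文', 'gbk')   # 2. built-in unicode(); assumes the source file is saved as GBK
u3 = s1.decode('gbk')         # 3. decode() on the byte string s1 defined above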

unicode is a built-in function. Its second parameter gives the encoding of the source string.

decode is a method of any byte string that converts it to a unicode string. Its parameter gives the encoding of the source string.

encode is also a string method. It converts a unicode string to the encoding given by its parameter.
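The relationship between encode and decode can be sketched as a round trip. The snippet uses Python 3, where the roles are split between str (text) and bytes, so it is an analogue of the Python 2 methods described above rather than the same API; the variable names are mine.

```python
text = "中文"

gbk_bytes = text.encode("gbk")       # text -> GBK bytes
utf8_bytes = text.encode("utf-8")    # text -> UTF-8 bytes

from_gbk = gbk_bytes.decode("gbk")   # GBK bytes -> text

# Converting between encodings always goes through text:
# decode with the source encoding, then encode with the target one.
transcoded = gbk_bytes.decode("gbk").encode("utf-8")

print(gbk_bytes, utf8_bytes, from_gbk, transcoded)
```

The same text takes four bytes in GBK and six bytes in UTF-8, which is why a file must be decoded with the encoding it was written in.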
