Reading and writing TXT files with different encodings in Python

Source: Internet
Author: User
Tags: python

This article shows how to read and write TXT files with different encodings in Python and gives sample code for each encoding. The examples use Python 2 syntax (print statements, reload(), str.decode()). Readers who need this can use it as a reference.

The code is as follows:

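import os
import codecs

# Python 2: os.listdir() returns byte strings in the system encoding (GB2312 here)
filenames = os.listdir(os.getcwd())
out = open("Name.txt", "w")
for filename in filenames:
    # Decode from GB2312 to Unicode, then encode as UTF-8 before writing
    out.write(filename.decode("gb2312").encode("utf-8"))
out.close()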

This writes the names of the files and directories in the current working directory to Name.txt, saved as UTF-8.
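For comparison, here is a minimal sketch of the same task in Python 3, which is an assumption on my part since the article targets Python 2. In Python 3, os.listdir() already returns text, so the encoding is chosen when the file is opened. The helper name write_names is hypothetical.

```python
import os
import tempfile


def write_names(directory, out_path):
    """Write the entry names in `directory` to `out_path` as UTF-8, one per line."""
    names = sorted(os.listdir(directory))
    # Python 3 assumption: names are already str, so no manual decode step
    # is needed; the encoding argument controls the bytes written to disk.
    with open(out_path, "w", encoding="utf-8") as out:
        for name in names:
            out.write(name + "\n")
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        listing = os.path.join(tmp, "Name.txt")
        print(write_names(os.getcwd(), listing))
```

Unlike the Python 2 snippet, this sketch puts each name on its own line so the listing is easier to read back.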

If you want to save the file in ANSI encoding (the system code page) instead, write the name directly without converting it:

The code is as follows:

out.write(filename)

Opening a file and writing to it

This example uses the codecs module, which I am not yet familiar with. I am recording the method here and will study the module's functions and usage when I have time.

The code is as follows:

import codecs

# codecs.open() returns a file object that encodes Unicode to UTF-8 on write
file = codecs.open("Lol.txt", "w", "utf-8")
file.write(u"i")
file.close()
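To see what codecs.open() actually puts on disk, the following sketch writes a short string and reads the raw bytes back. It assumes Python 3 and uses a temporary directory; the helper names write_utf8 and read_bytes are hypothetical.

```python
import codecs
import os
import tempfile


def write_utf8(path, text):
    """Write `text` to `path` through a codecs stream that encodes to UTF-8."""
    with codecs.open(path, "w", "utf-8") as stream:
        stream.write(text)


def read_bytes(path):
    """Return the raw bytes stored on disk, without any decoding."""
    with open(path, "rb") as stream:
        return stream.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "Lol.txt")
        write_utf8(target, "中文")
        # Each of the two characters is stored as three UTF-8 bytes.
        print(read_bytes(target))
```

In Python 3 the built-in open() with an encoding argument is the more common choice; codecs.open() is used here only to stay close to the article's example.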

Read ANSI-encoded text files and Utf-8 encoded files

Reading ANSI encoded files

Create a file test.txt, file format with ANSI, content:

The code is as follows:

ABC Chinese

Using Python to read

The code is as follows:

# CODING=GBK

Print open ("Test.txt"). Read ()

Result: ABC Chinese

Read UTF-8 encoded files (no BOM)

Change the file format to UTF-8:

The code is as follows:

Result: ABC Juan PO

Obviously, you need to decode this:

The code is as follows:

#-*-Coding:utf-8-*-

Import Codecs

Print open ("Test.txt"). Read (). Decode ("Utf-8")

Result: ABC Chinese

Reading UTF-8 encoded files (with BOM)

Some software inserts three invisible bytes (0xEF 0xBB 0xBF, the BOM) at the beginning of a file when saving it as UTF-8. Some software lets you choose whether to insert a BOM. If the file has a BOM, you need to strip these bytes when reading. The codecs module in Python defines a constant for them:

The code is as follows:

#-*-Coding:utf-8-*-

Import Codecs

data = open ("Test.txt"). Read ()

If data[:3] = = codecs. Bom_utf8:

data = Data[3:]

Print Data.decode ("Utf-8")

Result: ABC Chinese

Look at the following example:

The code is as follows:

# -*- coding: utf-8 -*-
data = open("Name_utf8.txt").read()
u = data.decode("utf-8")
# Skip the first character, which is the decoded BOM
print u[1:]

When you open a UTF-8 file with a BOM and decode the byte string you read from it, the result is a Unicode object. The three BOM bytes are decoded into a single Unicode character (U+FEFF), which cannot be printed. So for normal display, use u[1:] to skip the first character.
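A short sketch of why slicing off one character works: the three BOM bytes decode to the single character U+FEFF. The sketch assumes Python 3 and also shows the standard "utf-8-sig" codec, which is not used in the article but removes the BOM during decoding.

```python
import codecs

with_bom = codecs.BOM_UTF8 + b"abc"

# Plain UTF-8 decoding keeps the BOM as one leading character, U+FEFF.
text = with_bom.decode("utf-8")
first_char = text[0]
rest = text[1:]

# The "utf-8-sig" codec drops a leading BOM while decoding.
stripped = with_bom.decode("utf-8-sig")

print(len(with_bom), len(text))  # 6 bytes become 4 characters
print(rest == stripped)
```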

Note: To output a Unicode string that contains Chinese, first call its encode() method to convert it to another encoding.
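As an illustration of what encode() produces, the sketch below converts the same text to two encodings and compares the bytes. It assumes Python 3, where print() normally encodes for you, so the explicit call is shown only to make the bytes visible. The function name encoded_forms is hypothetical.

```python
def encoded_forms(text):
    """Return the UTF-8 and GBK byte representations of `text`."""
    return {
        "utf-8": text.encode("utf-8"),
        "gbk": text.encode("gbk"),
    }


if __name__ == "__main__":
    forms = encoded_forms("中文")
    for name, data in forms.items():
        # UTF-8 uses three bytes per character here, GBK uses two.
        print(name, len(data), data)
```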

Setting the Python default encoding

The code is as follows:

import sys

# Reloading sys restores setdefaultencoding(), which site.py removes at startup
reload(sys)
sys.setdefaultencoding("utf-8")
print sys.getdefaultencoding()

Today I ran into a Python encoding problem. The error message was as follows:

The code is as follows:

Traceback (most recent call last):
  File "ntpath.pyc", line 108, in join
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa1 in position 36: ordinal not in range(128)

Obviously the current encoding is ASCII, which cannot decode 0xa1 (decimal 161, outside the ASCII range of 0 to 127). Checking in the Python console confirms that the default encoding is indeed ASCII: sys.getdefaultencoding() returns 'ascii'.
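The same error can be reproduced in isolation by decoding a byte above 127 as ASCII. This sketch assumes Python 3, where the decode must be requested explicitly rather than happening implicitly as in Python 2; the function name try_ascii_decode is hypothetical.

```python
def try_ascii_decode(data):
    """Decode bytes as ASCII; return the exception instead of raising it."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc


if __name__ == "__main__":
    result = try_ascii_decode(b"abc\xa1")
    # The exception records the codec and the position of the offending byte.
    print(type(result).__name__, result.encoding, result.start)
```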

In Python 2.6, sys.setdefaultencoding() cannot be called directly to change the default encoding. Python runs site.py at startup, and after site.py has set the default encoding it deletes the setdefaultencoding() function from sys, so it can no longer be called. Once sys has been imported, you can reload the sys module and then call sys.setdefaultencoding('utf8'):

The code is as follows:

import sys

reload(sys)
sys.setdefaultencoding("utf-8")
print sys.getdefaultencoding()

This really works. According to Limodou, site.py is a script that is loaded by default when the Python interpreter starts. If you start the interpreter with python -S, site.py is not loaded automatically.
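Before relying on the reload trick, it can help to check what the running interpreter reports. The sketch below is an assumption-laden illustration for Python 3, where the default encoding is UTF-8 and sys has no setdefaultencoding() at all. The function name describe_default_encoding is hypothetical.

```python
import sys


def describe_default_encoding():
    """Report the default encoding and whether it can still be changed."""
    return {
        "default": sys.getdefaultencoding(),
        # True only on interpreters that still expose the function.
        "can_set": hasattr(sys, "setdefaultencoding"),
    }


if __name__ == "__main__":
    print(describe_default_encoding())
```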

It's kind of long-winded.

==================================

How do you permanently set the default encoding to UTF-8? There are two ways:

==================================

Method 1 (not recommended): edit site.py and modify the setencoding() function to force the encoding to UTF-8.

Method 2 (recommended): add a file named sitecustomize.py, preferably in the site-packages directory.

sitecustomize.py is executed from within site.py, and sys.setdefaultencoding() is only deleted at the end of site.py, so sitecustomize.py can still call sys.setdefaultencoding().

The code is as follows:

import sys

sys.setdefaultencoding('utf-8')

Since sitecustomize.py is loaded automatically, you can also use it to set up things other than the encoding.

String encoding

The code is as follows:

s1 = '中文'

A string literal entered directly like the one above is a byte string in the encoding of the source file. To get a Unicode string instead, there are three ways:

The code is as follows:

# Assumes the source file is saved as GBK; s1 is the byte string defined above
u1 = u'中文'
u2 = unicode('中文', 'gbk')
u3 = s1.decode('gbk')

unicode() is a built-in function. Its second argument specifies the encoding of the source string.

decode() is a string method that converts a byte string to Unicode. Its argument specifies the encoding of the source string.

encode() is also a string method. It converts a Unicode string to the encoding specified by its argument.
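Putting decode() and encode() together gives a simple way to convert bytes from one encoding to another, which is what the first example in this article does with GB2312 and UTF-8. A minimal sketch assuming Python 3; the function name transcode is hypothetical.

```python
def transcode(data, source, target):
    """Convert bytes from the `source` encoding to the `target` encoding."""
    # decode() turns bytes into text; encode() turns text back into bytes.
    return data.decode(source).encode(target)


if __name__ == "__main__":
    gbk_bytes = "中文".encode("gbk")
    utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")
    print(gbk_bytes, utf8_bytes)
```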
