Reading files and simple regular expressions in Python (I)

Source: Internet
Author: User


Today I wanted to write a program to merge files. I have always found Python's encoding and decoding annoying, so I used to write things like file merging in C#. Recently, however, I have been using Linux, where there is no Visual Studio, so I had no choice but to write it in Python. After looking into it this morning, it turned out to be less complicated than I had thought; I can only say I had been making too much of it before. Enough chatter: the following is a brief introduction to reading files.

First of all, I use Python 2.7. Python has several commonly used methods for reading the contents of a file. A small test of each one shows clearly what each method does.

The contents of the file are as follows

One: fopen.read(size)

The parameter size is the amount of data to read (in bytes, in Python 2.7). If it is omitted, the entire file is read.

    # coding=utf-8
    fopen = open('train2.txt', 'r')
    text = fopen.read()
    print(text)  # reads everything by default

This prints the entire contents of the file.
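As a small sketch of how the size parameter behaves, the following reads a file in two steps. The file name sample.txt and its contents are made up for illustration (the original test file is not reproduced here), and io.open is used only to create that file in a way that works on both Python 2.7 and Python 3.

```python
import io
import os
import tempfile

# Hypothetical sample file, created only so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'first line\nsecond line\n')

with open(path, 'r') as fopen:
    head = fopen.read(5)   # at most 5 bytes (characters in Python 3)
    rest = fopen.read()    # everything after the first 5
    tail = fopen.read()    # at end of file, read() returns an empty string

print(repr(head))
print(repr(rest))
print(repr(tail))
```

Each call to read() continues from where the previous one stopped, so a file can be consumed in fixed-size chunks by calling read(size) in a loop until it returns an empty string.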

Two: fopen.readline() reads one line of the file at a time.
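A minimal sketch of reading with readline() in a loop might look like this. The sample file and its contents are assumptions made for illustration; the loop relies on readline() returning an empty string once the end of the file is reached.

```python
import io
import os
import tempfile

# Hypothetical sample file, created only so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'alpha\nbeta\ngamma')

collected = []
fopen = open(path, 'r')
while True:
    line = fopen.readline()
    if not line:            # an empty string means end of file
        break
    collected.append(line.rstrip('\n'))  # drop the trailing newline, if any
fopen.close()

print(collected)
```

A blank line in the file is returned as '\n', not as an empty string, so the end-of-file check does not stop early on blank lines.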


Three: fopen.readlines() reads all the lines into a list, [line1, line2, ..., lineN]. This is a common approach, but note that it loads the whole file into memory at once. To avoid that for large files, iterate over the file object line by line instead.

    fopen = open('train2.txt', 'r')
    lines = []
    lines = fopen.readlines()
    for line in lines:
        print(line)
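Because readlines() builds the complete list in memory, a large file is often better handled by iterating over the file object itself, which yields one line at a time. The following is a sketch of that alternative, not the code from the original test; the helper name iter_lines and the sample file are hypothetical.

```python
import io
import os
import tempfile

def iter_lines(path):
    """Yield the lines of a text file one at a time, without the trailing newline."""
    with open(path, 'r') as fopen:
        for line in fopen:          # the file object is its own line iterator
            yield line.rstrip('\n')

# Hypothetical sample file, created only so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'one\ntwo\nthree\n')

# Find the longest line while holding only one line in memory at a time.
longest = ''
for line in iter_lines(path):
    if len(line) > len(longest):
        longest = line
print(longest)
```

The with statement also closes the file automatically when the loop finishes or an error occurs.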

That is about all there is to reading files. Next we write files, which involves encoding problems. To write to a file in Python, use fout.write(str), where str is a string.

Writing Chinese characters is error-prone, mostly because of encoding problems.

Python source files default to ASCII encoding, so when the source contains Chinese you get an error such as "SyntaxError: Non-ASCII character". To fix this, first add the following at the top of the code:

    # coding=utf-8

unicode is a built-in function; its second parameter gives the encoding of the source string.

decode is a method that every string has. It converts the string to unicode, and its parameter gives the encoding of the source string.

encode is also a method that every string has. It converts the string to the encoding specified by the parameter.

A literal written as u'汉字' is of type unicode; without the u prefix it is of type str.

To convert from unicode to str, use the encode method.

To convert from str to unicode, use decode.
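The two directions can be checked with a short round trip. This is only a sketch: the sample text is arbitrary, and the code is written to run on both Python 2.7 and Python 3 (in Python 3 the two types are called bytes and str instead of str and unicode).

```python
# -*- coding: utf-8 -*-
text = u'汉字'                       # unicode text

gbk_bytes = text.encode('gbk')       # unicode -> bytes in GBK (2 bytes per character here)
utf8_bytes = text.encode('utf-8')    # unicode -> bytes in UTF-8 (3 bytes per character here)

# bytes -> unicode: decode with the encoding the bytes were written in
restored = gbk_bytes.decode('gbk')

# GBK bytes -> UTF-8 bytes always goes through unicode in the middle
converted = gbk_bytes.decode('gbk', 'ignore').encode('utf-8')

print('%d %d %s' % (len(gbk_bytes), len(utf8_bytes), converted == utf8_bytes))
```

Decoding with the wrong encoding either raises UnicodeDecodeError or silently produces the wrong characters, which is why the source encoding has to be known in advance.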

So when writing Chinese, the code is as follows:

    # coding=utf-8
    fopen = open('train2.txt', 'r')
    fout = open('2.txt', 'w')
    lines = []
    lines = fopen.readlines()
    for line in lines:
        # Decode as GBK (the input must of course be GBK-encoded text),
        # ignore invalid bytes, and convert to UTF-8 for output.
        line = line.decode('gbk', 'ignore').encode('utf-8')
        fout.write(line)
        fout.write('\n')  # this is a line break
    fopen.close()
    fout.close()
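An alternative that avoids calling decode and encode by hand is io.open, which exists in both Python 2.7 and Python 3 and performs the conversion while reading and writing. The sketch below swaps in that technique; it is not the method used above, and the function name convert_gbk_to_utf8 and the temporary file names are hypothetical.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

def convert_gbk_to_utf8(src_path, dst_path):
    """Copy a GBK-encoded text file to dst_path as UTF-8 and return the line count."""
    count = 0
    # errors='ignore' skips bytes that are not valid GBK, like decode('gbk', 'ignore')
    with io.open(src_path, 'r', encoding='gbk', errors='ignore') as fin:
        with io.open(dst_path, 'w', encoding='utf-8') as fout:
            for line in fin:        # each line is already unicode text
                fout.write(line)    # encoded to UTF-8 on the way out
                count += 1
    return count

# Hypothetical input file, created only so the sketch runs on its own.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'gbk.txt')
dst = os.path.join(tmp, 'utf8.txt')
with io.open(src, 'wb') as f:
    f.write(u'第一行\n第二行\n'.encode('gbk'))

n = convert_gbk_to_utf8(src, dst)
with io.open(dst, 'rb') as f:
    result = f.read().decode('utf-8')
print(n)
```

Each line keeps its own trailing newline, so no extra line break needs to be written after it.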

Well, that is about enough for reading and writing files. Next comes simple regular expressions, which are covered in the next part.
