Merge text files using Python

Source: Internet
Author: User

Sample Python code for merging text files.

Merging two text files with Python

The employee file records each employee's ID and name.

cat employee.txt:

100 Jason Smith
200 John Doe
300 Sanjay Gupta
400 Ashok Sharma

The bonus file records each employee's ID and salary.

cat bonus.txt:

100 $5,000
200 $500
300 $3,000
400 $1,250

Merge the two files and output the following results:

400 ashok sharma $1,250
100 jason smith $5,000
200 john doe $500
300 sanjay gupta $3,000

This was meant to be written as a shell script, but I know little about shell, so I implemented it in Python instead.
Note: According to the problem statement, the output must also be sorted by the first letter of the name.

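#!/usr/bin/env python
# coding=utf-8

# Read every bonus record into a list.
fp01 = open("bonus.txt", "r")
a = []
for line01 in fp01:
    a.append(line01)

# Sort the employee records by the second field (the first name).
fp02 = open("employee.txt", "r")
fc02 = sorted(fp02, key=lambda x: x.split()[1])

for line02 in fc02:
    # Linear search for the bonus record with the same employee ID.
    i = 0
    while line02.split()[0] != a[i].split()[0]:
        i += 1
    print("%s %s %s %s" % (line02.split()[0], line02.split()[1],
                           line02.split()[2], a[i].split()[1]))

fp01.close()
fp02.close()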
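The same join can also be sketched with a dictionary lookup in place of the linear search for each employee. The following is a minimal Python 3 sketch, not the original author's code. The function name merge_records is hypothetical, and it assumes every employee line holds exactly an ID, a first name and a last name, as in the sample files. Names are lowercased to match the expected output listed earlier.

```python
def merge_records(employee_lines, bonus_lines):
    """Join employee and bonus records on the ID in the first field."""
    # Map employee ID -> amount so each lookup is a single dict access.
    bonus_by_id = {}
    for line in bonus_lines:
        fields = line.split()
        if len(fields) >= 2:
            bonus_by_id[fields[0]] = fields[1]

    # Split the non-empty employee lines into [id, first, last].
    rows = [line.split() for line in employee_lines if line.strip()]

    merged = []
    # Sort by the first name, ignoring case.
    for emp_id, first, last in sorted(rows, key=lambda r: r[1].lower()):
        if emp_id in bonus_by_id:  # employees without a bonus record are skipped
            merged.append("%s %s %s %s" % (emp_id, first.lower(),
                                           last.lower(), bonus_by_id[emp_id]))
    return merged


if __name__ == "__main__":
    employees = ["100 Jason Smith", "200 John Doe",
                 "300 Sanjay Gupta", "400 Ashok Sharma"]
    bonuses = ["100 $5,000", "200 $500", "300 $3,000", "400 $1,250"]
    for row in merge_records(employees, bonuses):
        print(row)
```

Because the function takes any iterable of lines, open file objects for employee.txt and bonus.txt could be passed in directly.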

Next, let's look at a script with a similar purpose: it joins every .txt file in the script's directory into a single file.

# coding: gbk
#
# author: GreatGhoul
# email : greatghoul@gmail.com
# blog  : http://greatghoul.javaeye.com
#
# Note: Python 2 syntax (print statement, raw_input); msvcrt is Windows-only.

import sys, os, msvcrt

def join(in_filenames, out_filename):
    out_file = open(out_filename, 'w+')

    # Append each input file to the output; remember the ones that fail.
    err_files = []
    for file in in_filenames:
        try:
            in_file = open(file, 'r')
            out_file.write(in_file.read())
            out_file.write('\n\n')
            in_file.close()
        except IOError:
            print 'error joining', file
            err_files.append(file)
    out_file.close()
    print 'joining completed. %d file(s) missed.' % len(err_files)
    print 'output file:', out_filename
    if len(err_files) > 0:
        print 'missed files:'
        print '--------------------------------'
        for file in err_files:
            print file
        print '--------------------------------'

if __name__ == '__main__':
    print 'scanning...'
    in_filenames = []
    file_count = 0
    for file in os.listdir(sys.path[0]):
        # Delete the output of a previous run; collect every other .txt file.
        if file.lower().endswith('[all].txt'):
            os.remove(file)
        elif file.lower().endswith('.txt'):
            in_filenames.append(file)
            file_count = file_count + 1
    if len(in_filenames) > 0:
        print '--------------------------------'
        print '\n'.join(in_filenames)
        print '--------------------------------'
        print '%d part(s) in total.' % file_count
        book_name = raw_input('enter the book name: ')
        print 'joining...'
        join(in_filenames, book_name + '[ALL].TXT')
    else:
        print 'nothing found.'
    # Wait for a key press before the console window closes.
    msvcrt.getch()
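For readers on Python 3, the core of the join step might look like the sketch below. It is a simplified illustration rather than a port of the script above: the function name join_files and its separator parameter are hypothetical, it assumes the input files are UTF-8 text, and it returns the list of unreadable files instead of printing a report.

```python
import os
import tempfile


def join_files(in_paths, out_path, separator="\n\n"):
    """Concatenate in_paths into out_path; return the paths that could not be read."""
    missed = []
    with open(out_path, "w", encoding="utf-8") as out_file:
        for path in in_paths:
            try:
                with open(path, "r", encoding="utf-8") as in_file:
                    out_file.write(in_file.read())
                # Separate consecutive parts, as the original script does.
                out_file.write(separator)
            except OSError:
                # Missing or unreadable file: record it and carry on.
                missed.append(path)
    return missed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        part = os.path.join(tmp, "part1.txt")
        with open(part, "w", encoding="utf-8") as f:
            f.write("first part")
        absent = os.path.join(tmp, "part2.txt")  # deliberately not created
        missed = join_files([part, absent], os.path.join(tmp, "book.txt"))
        print("%d file(s) missed." % len(missed))
```

The with statements close each file even if a read fails, which the explicit close() calls in the Python 2 version do not guarantee.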

Finally, here is the editor's own case:

While working today, I came across the novel Crazy Programmer on a blog, so I searched for it online to put it on my phone. After downloading it, I found that the text was split by chapter into 87 txt files. Reading it that way is not very convenient, so I looked for a ready-made tool to merge the txt files.

After trying a few tools, I found the merged results unsatisfactory, so I decided to do it myself. In fact, the cmd command "type *.txt > crazy-programmer.txt" works quite well, but the merged txt file is very large, so I wrote a script to do the merge.

Note: The 87 txt files I downloaded did not share a single character encoding, so I used the chardet module to detect each file's encoding and then the codecs module's codecs.open function to handle it. If you open a txt file directly with the built-in open, then for a UCS-2 Little Endian file, file.read() stops at a Chinese colon ("：") and cannot read the content after it, so you need to use codecs.open(path, 'r', encoding) instead.
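One plausible explanation for the truncation, which the original does not state: in UTF-16LE the full-width colon U+FF1A is stored as the bytes 0x1A 0xFF, and 0x1A is the Ctrl-Z character that text-mode reads on Windows have traditionally treated as end-of-file. The Python 3 sketch below shows that byte sequence and reads the file in binary mode before decoding, which sidesteps the problem. The helper name read_utf16le is hypothetical.

```python
import os
import tempfile

FULLWIDTH_COLON = "\uff1a"  # the Chinese (full-width) colon


def read_utf16le(path):
    """Read raw bytes, then decode them as UTF-16LE, dropping a leading BOM."""
    with open(path, "rb") as f:  # binary mode: no byte is treated specially
        raw = f.read()
    text = raw.decode("utf-16-le")
    if text.startswith("\ufeff"):
        text = text[1:]
    return text


if __name__ == "__main__":
    # The low byte of U+FF1A is 0x1A (Ctrl-Z).
    print(FULLWIDTH_COLON.encode("utf-16-le"))  # b'\x1a\xff'

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "wb") as f:
            f.write(b"\xff\xfe" + "title\uff1abody".encode("utf-16-le"))
        # Both sides of the colon survive the round trip.
        print(read_utf16le(path) == "title\uff1abody")
```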

If you have any questions, please leave a message. The code is as follows:

# coding: cp936
# Python 2 script; requires the third-party chardet module.
import codecs, chardet

def fileopen(filename):
    f = open(filename, 'r')
    s = f.read()
    if(chardet.detect(s)['encoding'] == 'UTF-16LE'):
        # Reopen with the right codec and convert the text to GB2312.
        f.close()
        f = codecs.open(filename, 'r', 'utf-16-le')
        data = f.read().encode('gb2312', 'ignore')
        f.close()
    elif(chardet.detect(s)['encoding'] == 'GB2312'):
        data = s
        f.close()
    # Note: any other detected encoding leaves data unassigned.
    return data

# The chapters are named 01.txt, 02.txt, ..., 87.txt.
i = 1
while i <= 87:
    if(i < 10):
        filename = '0'+str(i)+'.txt'
    else:
        filename = str(i)+'.txt'
    text = fileopen(filename)
    file('crazy-p.txt', 'a+').write(text)
    i = i+1

The chardet module has to be downloaded and installed separately. The script could also be improved to handle more cases, but I was lazy and left it as it is.
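As one way the script might be adapted, here is a minimal Python 3 sketch that avoids the chardet dependency by checking for a UTF-16LE byte order mark and then trying a short list of candidate encodings. This is a different and cruder technique than chardet's detection, and the candidate list (UTF-8, then GB18030) and the names decode_bytes and merge_chapters are assumptions made for illustration. Unlike the script above, the output is written as UTF-8 rather than GB2312.

```python
import codecs
import os
import tempfile


def decode_bytes(raw, candidates=("utf-8-sig", "gb18030")):
    """Decode raw bytes using a BOM check, then a list of candidate encodings."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going and mark the bytes that cannot be decoded.
    return raw.decode(candidates[-1], errors="replace")


def merge_chapters(directory, count, out_path):
    """Merge 01.txt, 02.txt, ... (count files) from directory into one UTF-8 file."""
    # newline="" keeps the line endings of the source files unchanged.
    with open(out_path, "w", encoding="utf-8", newline="") as out_file:
        for i in range(1, count + 1):
            path = os.path.join(directory, "%02d.txt" % i)  # zero-padded name
            with open(path, "rb") as f:
                out_file.write(decode_bytes(f.read()))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "01.txt"), "wb") as f:
            f.write(codecs.BOM_UTF16_LE + "chapter one\n".encode("utf-16-le"))
        with open(os.path.join(tmp, "02.txt"), "wb") as f:
            f.write("chapter two\n".encode("gb2312"))
        out = os.path.join(tmp, "merged.txt")
        merge_chapters(tmp, 2, out)
        with open(out, encoding="utf-8") as f:
            print(f.read())
```

For the 87 chapter files described above, the call would be something like merge_chapters('.', 87, 'crazy-p.txt'). Opening the output once, instead of reopening it in append mode for every chapter, also means a rerun starts from an empty file.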
