Merging text files with Python
Sample code for merging text files in Python.
Merging two text files in Python
The employee file records each employee's ID and name.
cat employee.txt:
100 Jason Smith
200 John Doe
300 Sanjay Gupta
400 Ashok Sharma
The bonus file records each employee's ID and salary.
cat bonus.txt:
100 $5,000
200 $500
300 $3,000
400 $1,250
Merge the two files so that the output looks like this:
400 Ashok Sharma $1,250
100 Jason Smith $5,000
200 John Doe $500
300 Sanjay Gupta $3,000
This exercise was meant to be solved with a shell script, but I know very little shell, so I implemented it in Python instead.
Note: According to the question, the output must also be sorted by the first letter of the name.
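#!/usr/bin/env python
# coding=utf-8
# Python 2 script: join employee.txt and bonus.txt on the employee ID.

# Load every bonus record into a list.
fp01 = open("bonus.txt", "r")
a = []
for line01 in fp01:
    a.append(line01)

# Sort the employee records by first name (the second field).
fp02 = open("employee.txt", "r")
fc02 = sorted(fp02, key=lambda x: x.split()[1])

for line02 in fc02:
    # Scan the bonus list for the record with the same ID.
    i = 0
    while line02.split()[0] != a[i].split()[0]:
        i += 1
    print "%s %s %s %s" % (line02.split()[0], line02.split()[1], line02.split()[2], a[i].split()[1])

fp01.close()
fp02.close()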
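The script above rescans the bonus list for every employee and assumes every ID has a match. As a minimal sketch of an alternative (not the original author's method), the same join can be done with a dictionary lookup in Python 3. The function name merge_records is hypothetical, and skipping employees that have no bonus entry is an assumption of this sketch.

```python
def merge_records(employee_lines, bonus_lines):
    """Join employee and bonus records on the ID, sorted by first name."""
    # Map each employee ID to its amount so the lookup needs no scanning.
    bonus_by_id = {}
    for line in bonus_lines:
        fields = line.split()
        if len(fields) >= 2:  # ignore blank or malformed lines
            bonus_by_id[fields[0]] = fields[1]

    employees = [line.split() for line in employee_lines if line.strip()]
    # Sort on the name fields: first name, then whatever follows it.
    employees.sort(key=lambda fields: fields[1:])

    merged = []
    for fields in employees:
        emp_id, name = fields[0], " ".join(fields[1:])
        # Assumption: an employee without a bonus record is left out.
        if emp_id in bonus_by_id:
            merged.append("%s %s %s" % (emp_id, name, bonus_by_id[emp_id]))
    return merged


# Small demo using the sample records from the article.
sample_employees = ["100 Jason Smith", "200 John Doe",
                    "300 Sanjay Gupta", "400 Ashok Sharma"]
sample_bonuses = ["100 $5,000", "200 $500", "300 $3,000", "400 $1,250"]
for row in merge_records(sample_employees, sample_bonuses):
    print(row)
```

Because the function accepts any iterable of lines, two open file objects for employee.txt and bonus.txt can be passed in directly instead of the sample lists.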
Next, let's look at another script with a similar purpose. It joins all the .txt files in the script's directory into a single file.
# coding: gbk
#
# author: GreatGhoul
# email : greatghoul@gmail.com
# blog  : http://greatghoul.javaeye.com
import sys, os, msvcrt

def join(in_filenames, out_filename):
    out_file = open(out_filename, 'w+')
    err_files = []
    for file in in_filenames:
        try:
            in_file = open(file, 'r')
            out_file.write(in_file.read())
            out_file.write('\n\n')
            in_file.close()
        except IOError:
            print 'error joining', file
            err_files.append(file)
    out_file.close()
    print 'joining completed. %d file(s) missed.' % len(err_files)
    print 'output file:', out_filename
    if len(err_files) > 0:
        print 'missed files:'
        print '--------------------------------'
        for file in err_files:
            print file
        print '--------------------------------'

if __name__ == '__main__':
    print 'scanning...'
    in_filenames = []
    file_count = 0
    for file in os.listdir(sys.path[0]):
        # Delete the output of a previous run, collect every other .txt file.
        if file.lower().endswith('[all].txt'):
            os.remove(file)
        elif file.lower().endswith('.txt'):
            in_filenames.append(file)
            file_count = file_count + 1
    if len(in_filenames) > 0:
        print '--------------------------------'
        print '\n'.join(in_filenames)
        print '--------------------------------'
        print '%d part(s) in total.' % file_count
        book_name = raw_input('enter the book name: ')
        print 'joining...'
        join(in_filenames, book_name + '[ALL].TXT')
    else:
        print 'nothing found.'
    # Wait for a key press before the console window closes (Windows only).
    msvcrt.getch()
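The script above is Python 2 and Windows-only (print statements, raw_input, msvcrt). As a rough sketch of the same idea in Python 3, the join step could look like the following. The function name join_text_files is hypothetical, and reading and writing everything as UTF-8 is an assumption of this sketch, not something the original script does.

```python
from pathlib import Path


def join_text_files(directory, out_name):
    """Concatenate every .txt file in `directory` into `out_name`.

    Returns the names of the files that could not be read.
    """
    directory = Path(directory)
    out_path = directory / out_name
    # Sort by name so the parts are joined in a predictable order,
    # and never feed the output file back into itself.
    parts = sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".txt" and p != out_path
    )
    missed = []
    with open(out_path, "w", encoding="utf-8") as out_file:
        for part in parts:
            try:
                # Assumption: the parts are UTF-8 encoded.
                text = part.read_text(encoding="utf-8")
            except (OSError, UnicodeDecodeError):
                missed.append(part.name)
                continue
            out_file.write(text)
            out_file.write("\n\n")
    return missed
```

Unlike the original, this version excludes the output file from the input list rather than deleting it first, and it returns the missed files to the caller rather than printing them.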
Finally, here is a case from the editor's own experience:
While editing today, I came across the novel "Crazy Programmer" on a blog. I searched for it online and downloaded it to put on my phone, only to find that the text was split by chapter into 87 txt files. Since that is not very convenient for reading, I looked for a ready-made tool to merge the files.
After trying a few tools, I was not satisfied with the results, so I decided to do it myself. In fact, the cmd command "type *.txt > crazy-programmer.txt" works quite well, but the merged txt file is very large, so I wrote a script to do the merge.
Note: The 87 txt files I downloaded do not all use the same character encoding, so I used the chardet module to detect each file's encoding and then the codecs module's codecs.open function to handle it. If you open a txt file directly with the built-in open and the file is encoded as UCS-2 Little Endian, file.read() stops at the first Chinese colon ("：") and cannot read anything after it, so you need to use codecs.open(path, 'r', encoding) instead.
If you have any questions, please leave a comment. The code is as follows:
#!coding: cp936
import codecs, chardet

def fileopen(filename):
    f = open(filename, 'r')
    s = f.read()
    if(chardet.detect(s)['encoding'] == 'UTF-16LE'):
        f.close()
        f = codecs.open(filename, 'r', 'utf-16-le')
        data = f.read().encode('gb2312', 'ignore')
        f.close()
    elif(chardet.detect(s)['encoding'] == 'GB2312'):
        data = s
        f.close()
    return data

i = 1
while i <=87:
    if(i < 10):
        filename = '0'+str(i)+'.txt'
    else:
        filename = str(i)+'.txt'
    text = fileopen(filename)
    file('crazy-p.txt', 'a+').write(text)
    i = i+1
Note that the chardet module has to be downloaded and installed separately. The script could be improved to handle more situations, but I was lazy and left it as it is.
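One possible improvement, sketched here in Python 3 and using only the standard library, is to stop depending on exactly two detected encodings. This is not the author's chardet approach. It swaps in a simpler technique: check for a UTF-16 byte order mark first, then try a short list of fallback encodings. The names read_text and merge_chapters are hypothetical, and the fallback list (UTF-8, then GB18030, a superset of GB2312) is an assumption suited to files like the ones described.

```python
import codecs


def read_text(path, fallbacks=("utf-8-sig", "gb18030")):
    """Read a text file whose encoding is not known in advance."""
    with open(path, "rb") as f:
        raw = f.read()
    # A UTF-16 byte order mark identifies the encoding directly; the
    # "utf-16" codec consumes the mark and picks the matching byte order.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep what decodes and mark the rest as replacements.
    return raw.decode(fallbacks[-1], errors="replace")


def merge_chapters(paths, out_path):
    """Write the decoded chapters to one UTF-8 file, in the given order."""
    # newline="" writes line endings through exactly as they were read.
    with open(out_path, "w", encoding="utf-8", newline="") as out_file:
        for path in paths:
            out_file.write(read_text(path))
            out_file.write("\n")
```

For the 87 chapters, the path list could be built with ["%02d.txt" % i for i in range(1, 88)]. A UTF-16 file saved without a byte order mark would not be recognized by this sketch, which is the case where a detector such as chardet still helps.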