Python merges all Excel files in a directory into a tutorial

Source: Internet
Author: User
Tags glob

I. Demand

The company asked to merge all the handling events from January 1, 16 to the present, simply by merging all the dailies into a single table. Just 16 all the dailies are kept in a folder, where all the logs are aggregated into an Excel file through Python + xlrd (read) + xlsxwriter (write) implementations. There is no XLWT module for write operations, because the current mainstream use of office is basically 2007 after the, so xlsxwriter support is relatively better.

The format of the original daily is as follows:

You need to summarize the following format

Second, Python implementation

First code:

[Root@361way ~]# Cat python/excel/date_report.py
#!/usr/bin/env python
#-*-Coding:utf-8-*-
# site:www.361way.com
# mail:itybku@139.com
Import Sys
Reload (SYS)
Sys.setdefaultencoding (' Utf-8 ')
Import Glob
Import xlrd
Import Xlsxwriter
Import re
Workbook = Xlsxwriter. Workbook (' date_report.xlsx ')
Worksheet = Workbook.add_worksheet (' Sheet1 ')
Bold = Workbook.add_format ({' Bold ': 1})
headings = [u ' time ', u ' problem description ', U ' responsible person ']
Worksheet.write_row (' A1 ', headings, bold)
def get_xlsx_data (file):
#data = Xlrd.open_workbook (u "d:\\2016\\ Hangzhou software Services Management Service work daily (2016-07-13). xlsx")
data = Xlrd.open_workbook (file)
Table = data.sheets () [0]
nrows = Table.nrows
Ncols = Table.ncols
Timedata = table.row_values (0) [0]
Timedata = Re.split (' (', Timedata.encode ("utf-8", ' ignore ')
Timedata = Timedata[1].replace (') ', ')
#print Timedata
For I in Range (7,nrows):
#print table.row_values (i) [0]
If Table.row_values (i) [0] = = 1.0:
MAXV = i
#print MAXV
If MAXV:
For I in range (6, (Maxv-2)):
If Table.row_values (i) [2]:
#print timedata + ' | ' + table.row_values (i) [2].encode ("utf-8", ' ignore ')
Data_arry.append ([Timedata,table.row_values (i) [2].encode ("utf-8", ' ignore '), ' 361way '])
Else
For I in Range (6,nrows):
If Table.row_values (i) [2]:
#print timedata + ' | ' + table.row_values (i) [2].encode ("utf-8", ' ignore ')
Data_arry.append ([Timedata,table.row_values (i) [2].encode ("utf-8", ' ignore '), ' 361way '])
Data_arry = []
For file in Glob.glob ("D:\\2016\\*.xlsx"):
Get_xlsx_data (file)
row = 1
Col = 0
For Linev in Data_arry:
#print Linev
Worksheet.write_row (Row,col,linev)
row = 1
Workbook.close ()

The code is simpler, but there are a few places to mention:

1, traversing a directory of all the files, here with the Glob module, its internal is also through the code such as Os.path,isdir to achieve, practical is quite convenient;

2, write Excel here is the way to write in line, for a small amount of data, through row, column write does not matter, for example, column data has more than hundreds of thousands of, it is not recommended to write columns through the list of the way--memory consumption, it is recommended that the results first written to the file, and then read by line, Write-by-article. Examples of official writes by column are as follows:

data = [
[2, 3, 4, 5, 6, 7],
[10, 40, 50, 20, 10, 50],
[30, 60, 70, 50, 40, 30],
]
Worksheet.write_row (' A1 ', headings, bold)
Worksheet.write_column (' A2 ', data[0])
Worksheet.write_column (' B2 ', data[1])
Worksheet.write_column (' C2 ', data[2])

Here is written by means of A1, B2, if you write in the same way, line by row, column 0 means not less, you can see my code in the Worksheet.write_row (Row,col,linev).

3, the title on the date of this, the use of the RE module for the cutting, can also be matched through the re to obtain, of course, can also be divided into ordinary segmentation string method, but the code is more, similar to the following:

Timedata = table.row_values (0) [0]
Timedata = Timedata.splitlines () [1]
Timedata = Timedata.replace (' (', ')
Timedata = Timedata.replace (') ', ')

The middle MAXV section is mainly to remove the plans for tomorrow and avoid duplication with the daily newspaper next day. Write the feeling a bit repetitive, also can function call, too lazy to change, the code is too simple.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.