General processing of tabular data for dry--excel and common python modules.

Source: Internet
Author: User

Write in front:

The main purpose of this article is to:

    • This paper introduces the common Excel processing modules used in Python: xlwt,xlrd,xllutils,openpyxl,pywin32 usage and application scenarios.
    • This article is only for the processing of table data commonly used in Excel tables, other complex operations such as formulas, histograms and other data are not involved.
    • Big guy's shoulder: http://www.gocalf.com/blog/python-read-write-excel.html#excel
Read Module 1:XLRD official Quick Start (haste is sometimes useful)
ImportXlrdbook= Xlrd.open_workbook ("Myfile.xls")#Open FilePrint("The number of worksheets is {0}". Format (book.nsheets))#Nsheets is the number of sheet in ExcelPrint("Worksheet Name (s): {0}". Format (Book.sheet_names ()))#Sheets's name.SH = book.sheet_by_index (0)#get the first sheet * * Note xlrd start from 0Print("{0} {1} {2}". Format (Sh.Name, Sh.nrows, Sh.ncols))#name of the sheet, number of rows, number of columnsPrint("Cell D30 is {0}". Format (Sh.cell_value (rowx=29, colx=3)))#D30 The value in cell forRxinchRange (sh.nrows):Print(Sh.row (RX))#traverse each row to print each value. Each row is a list

Detailed parameters and features

WB = Xlrd.open_workbook ("Myfile.xls")

There is also an important parameter: Formatting_info, which defaults to false, and preserves the format and style information of the XLS file when it is opened. Embodied in the row_values. More in the xlutils.

File_contents: The second way of opening. As the name implies is the binary stream of the file, the most useful is that the server receives the file without having to dump the direct read operation. Example:

WB = Xlrd.open_workbook (File_contents=f.read (), formatting_info=true) # f is an open file handle
WB = Xlrd.open_workbook (file_contents=request. File.get (' file ', '), formatting_info=true) # simulation View function

SH = book.sheet_by_index (0)

Other ways to open
WS == wb.sheet_by_name ('Sheet1'== wb._sheet_list[0]               # when the file_content way to open the top of the way will be an error, because all the sheets load has come in.  ws.row_values (row_num)         # Another common way to get cell data for a row. 
Ws.merged_cells

Merged_cells is a list that shows all the merged cells. For example: [(0, 1, 0, 3), (1, 2, 0, 3)]

What's weird here is that the numbers in the 0,0 are the values in the coordinates of the upper-left corner of the file. Use need attention.

Finally: XLRD is older, it is only supported in the XLS format of Excel files, can now read the xlsx file data, but can not use the Formatting_info function.

Write module: XLWT Quick Start
Import XLWTWB=XLWT. Workbook () WS= Wb.add_sheet ('Sheet1')#Sheet1 is the name of the first sheetWs.write (0, 0,'Test', Style1)#The style1 written in the No. 0 column of line No. 0 is the style described belowWs.write_merge (1, 2, 0, A, u'Merge Cells', Style1)#merges 25 columns of cells in the first row to writeWb.save ('Test2.xls')#wb.save (' test.xlsx ') # can generate xlsx file
Style design

Writing an Excel file requires not many of the parameters above, but often in style design requirements are high.

Content Center:
Alignment = XLWT. Alignment ()                                      #  Initializes a centered object Alignment.horz = XLWT. Alignment.horz_center    #  Horizontal and left center alignment.vert = XLWT. Alignment.vert_center
Border:
Borders == 1                                                         #  Border Width borders.right = 1= 1= 1= 0x3A                                 #  specific color design can be Baidu 255 in the color of the hexadecimal representation
Cell background:
Pattern == Pattern.solid_pattern            #  set its mode to real Pattern.pattern_fore_colour =                               #  Here's 22 bits grey
Font
FN1 == False                                                        #  bold fn1.name = u' Arial '                                                    #  Font fn1.height = 280                                                        #  size  font size affects row height

Cell size Settings
hang1 = Ws.col (0)                                                      #  get a column and set the width hang1.width =xlwt.easyxf ('font: Height; '  = ws.row (0)                                              #  get a line set height hang_sale.set_style (tall_style)
Synthetic style

Encapsulate the above styles into a style.

Style1 = XLWT. Xfstyle ()

Style1.font = fn1 # Similarly other styles are set in this way

Modify module: xlutils

The direct modification of Excel, which is not natively supported in the Excel table module, is implemented in the form of read-after-write. Therefore, in the process of modification there will be many bugs, such as the loss of style.

 from Import  = Open_workbook ('test.xls)WB = Copy (RB)

After the operation to facilitate XLWT no difference. However, in the modification of the general do not want to actively modify the style of Excel. Here, the Formatting_info parameter in the XLRD is critical. Even so, in my practice there are still some cell border style auto-loss situation, to be solved!

Summary: The above three brothers are more commonly used Excel processing module, if not too high demand is generally sufficient. But the three brothers are now more specific to the XLS file, in fact, in the format of the conversion of xlsx file is powerless. In the case of requirements such as modifications to the xlsx file, Xlutils will output the xlsx source file as an xlsx file without any format. So the following introduction to the OPENPYXL module is used only for xlsx. (XLS cannot read)

XLSX Module OPENPYXL
Book = Openpyxl.load_workbook ('2.xlsx'== ws.rowsPrint Next (AA) [0].value                             #  can also be set with equal value book.save ('test.xlsx') )

The usage of OPENPYXL is roughly the same as that of the XL three brothers, but the function is more, so the efficiency is worse. The READ_ONLY parameter can be added to the read operation only.

PYWIN32 Module

As the first link gives the comparison, the most functional nature is the Windows API. After all, this thing is out of the family. But the disadvantage is also the most obvious:

1. Cannot cross platform, only for Windows system, also have Excel program.

2. The process of Occupy Excel program, do not pay attention to cause and WPS and other programs conflict.

3. Very slow

 from Import  = Dispatch ('excel.application')                    = XlApp.Workbooks.Open (R')  C:\Users\PycharmProjects\windOFapi\1242.xlsx',)       = Xlbook.worksheets ( ' Sheet1 '  = sht. Cells (4, 3). ValuePrint req

The following two OPENPYXL and Pywin32 modules with less, master not much, do not dare to write more.

General processing of tabular data for dry--excel and common python modules.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.