Detailed instructions on using xlrd and xlwt to Operate excel tables in python, xlrdxlwt
Recently, I encountered a situation where I generated and sent server usage reports on a regular basis. According to statistics from different dimensions, I used python to perform excel operations. I collected some online reports, most of which were similar, however, after some research on the source code (I felt quite fulfilled with this word), it basically solved the daily needs. It mainly records the common problems and solutions during use.
Python excel operations mainly use xlrd and xlwt databases, that is, xlrd reads excel and xlwt writes excel. Download https://pypi.python.org/pypifrom here. The following is a record of reading and writing python excel.
Read excel in python -- xlrd
There are several troublesome issues in this process, such as reading the date and reading the merged cell content. Next let's take a look at the basic operations:
First, read an excel file with two sheets. The second sheet is used for testing. The content of sheet2 is as follows:
Python performs the following operations on excel:
#-*-Coding: UTF-8-*-import xlrdimport xlwtfrom datetime import date, datetimedef read_excel (): # open the file workbook = xlrd. open_workbook (r 'f: \ demo.xlsx') # obtain all sheet print workbooks. sheet_names () # [u 'sheet1', u 'sheet2'] sheet2_name = workbook. sheet_names () [1] # obtain the sheet content sheet2 = workbook Based on the sheet index or name. sheet_by_index (1) # sheet Index from 0 sheet2 = workbook. sheet_by_name ('sheet2') # sheet Name, number of rows, number of columns print sheet2.name, sheet2.nrows, sheet2.ncols # Get the value of the whole row and the whole column (array) rows = sheet2.row _ values) # obtain the fourth row of content cols = sheet2.col _ values (2) # obtain the third column of content print rows print cols # obtain the cell content print sheet2.cell ). value. encode ('utf-8') print sheet2.cell _ value (1, 0 ). encode ('utf-8') print sheet2.row (1) [0]. value. encode ('utf-8') # obtain the data type of the cell content, print sheet2.cell ). ctypeif _ name _ = '_ main _': read_excel ()
The running result is as follows:
The problem arises. In the preceding running result, the field in the red box is clearly the date of birth, and the real floating point number is displayed. Okay, to solve the first problem:
1. python reads the cell content in excel as a date
Python reads the content of cells in excel and returns 5 types, that is, the ctype in the above example:
ctype : 0 empty,1 string, 2 number, 3 date, 4 boolean, 5 error
That is, the ctype of date is 3. In this case, you must use xldate_as_tuple of xlrd to process the data in the date format. First, you must judge that the ctype of the table is 3 before the xldate operation starts. Now let's look at the command line:
>>> Sheet2.cell (2, 2 ). ctype #1990/2/22 >>> sheet2.cell (2, 1 ). ctype #24 >>> sheet2.cell (2, 0 ). ctype # chubby >>> sheet2.cell (2, 4 ). ctype # null value (here is the reason for merging cells) >>> sheet2.cell ). value #1990/2/2233656.0 >>> xlrd. xldate_as_tuple (sheet2.cell _ value (2, 2), workbook. datemode) (1992, 2, 22, 0, 0, 0) >>> date_value = xlrd. xldate_as_tuple (sheet2.cell _ value (2, 2), workbook. datemode) >>> date_value (1992, 2, 22, 0, 0) >>> date (* date_value [: 3]) datetime. date (1992, 2, 22) >>> date (* date_value [: 3]). strftime ('% Y/% m/% D') '2017/22'
You can perform a simple process to determine whether the ctype is equal to 3. If the ctype is equal to 3, the time format is used for processing:
if (sheet.cell(row,col).ctype == 3): date_value = xlrd.xldate_as_tuple(sheet.cell_value(rows,3),book.datemode) date_tmp = date(*date_value[:3]).strftime('%Y/%m/%d')
The problem arises again, as shown in sheet2.cell (2, 4) above ). the value returned by ctype is 0, indicating that the value of this cell is null, which is clearly the content of the merged cell "good friend". I think this package is not fully functional, if it is a merged cell, the content of the merged cell is the same, but it is only the value of the first cell to be merged, and others are empty.
>>> Sheet2.col _ values (4) [U' \ u5173 \ u7cfb ', U' \ u597d \ u670b \ u53cb', '', U' \ u540c \ u5b66 ', '','', U' \ u4e00 \ u4e2a \ u4eba ', ''] >>> for I in range (sheet2.nrows): print sheet2.col _ values (4) [I] relationship with friends and classmates> sheet2.row _ values (7) [U' \ u65e0 \ u540d ', 20.0, U' \ u6682 \ u65e0 ','', ''] >>> for I in range (sheet2.ncols): print sheet2.row _ values (7) [I] unknown 20.0 not found >>>
2. Read the merged cell content
This is really no trick. Only the row and column indexes of the first cell in the merged cell can be obtained to read the value. If the read error occurs, the value is null.
That is, merge row cells to read the first index of the row, and merge column cells to read the first index of the column, as shown in the preceding figure, read row merge cell "good friend" and read Column merge cell "no" can only be as follows:
>>> Print sheet2.col _ values (4) [1] Friends >>> print sheet2.row _ values (7) [2]> >>> sheet2.merged _ cells # why is there a merged cell? []
Again, the merge cells may have null values, but normal cells in the table may also have null values. How can we get the so-called "index of the first row or column" of the cell?
This requires you to first know which cells are merged!
3. Obtain the merged cells.
When reading files, you must set the formatting_info parameter to True. The default value is False. Therefore, the array of cells obtained above is empty,
>>> workbook = xlrd.open_workbook(r'F:\demo.xlsx',formatting_info=True)>>> sheet2 = workbook.sheet_by_name('sheet2')>>> sheet2.merged_cells[(7, 8, 2, 5), (1, 3, 4, 5), (3, 6, 4, 5)]
The meanings of the four parameters returned by merged_cells are: (row, row_range, col, col_range), where [row, row_range) includes row, excluding row_range, and col, that is, (1, the meaning of 3, 4, 5) is: 1st to 2 rows (excluding 3) Merge, (7, 8, 2, 5) meaning: 2nd to 4 columns merge.
With this, you can obtain the content of the merged three cells:
>>> Print sheet2.cell _ value () # (1, 3, 4, 5) good friend >>> print sheet2.cell _ value () # (3, 6, 4, 5) Students >>> print sheet2.cell _ value () # (7, 8, 2, 5) None
Have you found the rule? Yes. You can obtain the row and col low-level indexes returned by merge_cells! So we can do this once and for all:
>>> Merge = [] >>> for (rlow, rhweigh, clow, chweigh) in sheet2.merged _ cells: merge. append ([rlow, clow]) >>> merge [[7, 2], [1, 4], [3, 4] >>> for index in merge: print sheet2.cell _ value (index [0], index [1]) has no friends >>>
Write excel in python -- xlwt
The difficulty of writing an excel file may not be the construction of a workbook, but the data to be filled, but this is not within the scope. There are also some difficult problems in writing excel files. For example, it is troublesome to write merged cells. In addition, there are different writing styles. The source code can be thoroughly studied.
I "conceived" The following sheet1, which is implemented using xlwt:
Basically, it seems complicated, and it looks "very formal". It is entirely an individual.
The Code is as follows:
'''Set cell styles ''' def set_style (name, height, bold = False): style = xlwt. XFStyle () # initialize the style font = xlwt. font () # create a font for the style. name = name # 'times New Roman 'font. bold = bold font. color_index = 4 font. height = height # borders = xlwt. borders () # borders. left = 6 # borders. right = 6 # borders. top = 6 # borders. bottom = 6 style. font = font # style. borders = borders return style # Write exceldef write_excel (): f = xlwt. workbook () # create a Workbook ''' create the first sheet: sheet1 ''' sheet1 = f. add_sheet (u 'sheet1', cell_overwrite_ OK = True) # create sheet row0 = [u 'business', u 'status', u 'beijing', u 'shanghai ', u 'guangzhou ', u 'shenzhen', u 'State subtotal ', u 'Total'] column0 = [u 'air way', u 'waybill', u 'train ticket ', u 'bus pass', u 'others'] status = [u 'prepa', u 'output', u 'unput ', u'business subtotal '] # generate the first line for I in range (0, len (row0): sheet1.write (0, I, row0 [I], set_style ('times New Roman ', 220, True) # generate the first and last columns (merge 4 rows) I, j = 1, 0 while I <4 * len (column0) and j <len (column0): sheet1.write _ merge (I, I + 3, 0, 0, column0 [j], set_style ('arial', 220, True) # first column sheet1.write _ merge (I, I +, 7) # The last column "Total" I + = 4 j + = 1 sheet1.write _ merge (220, U' ', set_style ('times New Roman, true) # generate the second column I = 0 while I <4 * len (column0): for j in range (0, len (status )): sheet1.write (j + I +, status [j]) I + = 4 f.save('demo1.xlsx') # Save the file if _ name _ = '_ main __': # generate_workbook () # read_excel () write_excel ()
The write_merge method must be explained in a bit:
Write_merge (x, x + m, y, w + n, string, sytle)
X indicates the row, y indicates the column, m indicates the number of cross-row, n indicates the number of cross-column, string indicates the content of the cell to be written, and style indicates the cell style. Where, x, y, w, and h are all calculated starting with 0.
This is not the same as the read and merge cells in xlrd.
As shown above: sheet1.write _ merge (220, u'total', set_style ('times New Roman ', True ))
That is, column 1 and column 2 are merged in Row 22. The merged cell content is "Total" and the style is set.
If you want to create multiple sheets, you only need to create f. add_sheet.
For example, create another sheet2 before f.save('demo1.xlsx 'in write_excel. The effect is as follows:
The code is really easy:
''' Create the second sheet: sheet2 ''' sheet2 = f. add_sheet (u 'sheet2', cell_overwrite_ OK = True) # create sheet2 row0 = [u 'name', u 'age', u 'birthdate ', u 'hobbies ', u 'relational '] column0 = [u 'xiaojie', u 'chubby', u 'xiaoming ', u 'great God', u 'Big fairy', u 'Min ', u'nameless '] # generate the first line for I in range (0, len (row0): sheet2.write (0, I, row0 [I], set_style ('times New Roman ', 220, True) # generate the first column for I in range (0, len (column0): sheet2.write (I +, column0 [I], set_style ('times New Roman ', 220) sheet2.write (1991/11, '192/11') sheet2.write _ merge (, 2, 4, u' none ') # merge Column cell sheet2.write _ merge (, u'friends ') # merge row cell f.save('demo1.xlsx') # Save the file