Thursday, May 23, 2013

Reading and Writing Excel Workbooks with Python

Lots of data is stored in Excel workbooks, so it's convenient to be able to read that. Workbooks can also be a great way of delivering results, since they make it easy for your clients to perform ad hoc checks. The PyWin32 (formerly win32all) module provides the ability to create and access many different types of Windows files on a Windows machine. But if you want to work with Excel files without Windows, you can use the Python Excel modules xlrd, xlwt, and xlutil for reading, writing, and filtering Excel files. Below we'll show simple examples of reading and writing Excel workbooks with xlrt and xlwt.

The packages have an excellent tutorial pdf, and the examples below are just a subset of what's shown there. Maybe the most interesting part of the tutorial is the last page, which shows the power of the packages by suggesting exercises an instructor might use for a workshop on the packages, including:
  • inserting a row into a worksheet
  • splitting a workbook into one file per worksheet
  • scanning a directory and reporting the location of error cells
Let's start by reading a workbook using the open_workbook from xlrd. For this workbook, we know there are 3 worksheets.

>>> # get to directory with workbook
>>> import os
>>> os.chdir("~/blog") 
>>> os.chddir("xlutils")
>>> # open workbook
>>> from xlrd import open_workbook 
>>> book = open_workbook("workbook.xls")
>>> print book.nsheets # check number of worksheets
3
>>> print book.sheet_names() # check sheet names
[u'letters', u'numbers', u'Sheet3']

And we can obtain the first worksheet using the sheet_by_index method on the workbook.

>>> sheet = book.sheet_by_index(0) # get first worksheet
>>> sheet.name # check name
u'letters'
>>> sheet.nrows # get number of rows
3
>>> sheet.ncols # and cols
1
Finally the values in the worksheet can be obtained with the workbook's cell method. Using that method in a list comprehension with the numbers of rows and columns already obtained we load the worksheet into a list.

>>> val = sheet.cell(0, 0)
>>> val.value
u'a'
>>> valL = [sheet.cell(i, 0).value for i in range(sheet.nrows)]
>>> valL
[u'a', u'b', u'c']
Writing a workbook is even simpler. First we open a workbook with the xlwt Workbook, and obtain a worksheet with the workbook's add_sheet method. On the sheet, we can write values with the write method. The arguments for the write method are (row, column, value). The row and column use 0 offset counting, so row 0 and column 0 are A1. Finally to save the workbook, we use the workbook's save method. 

>>> wb = Workbook() # obtain a new workbook
>>> ws = wb.add_sheet('s0') # obtain a new worksheet named s0
>>> ws.write(0, 0, '1') # write a '1' in the top left cell
>>> ws.write(0, 1, 1) # write a 1 in B1
>>> wb.save('workbook2.xls') # save workbook