Read Excel XLS spreadsheets with Python
Someone mailed out information to a club I'm in as an .XLS file. Another Excel spreadsheet. Sigh.
I do know one way to read them. Fire up OpenOffice, listen to my CPU fan spin as I wait forever for the app to start up, open the xls file, then click in one cell after another as I deal with the fact that spreadsheet programs only show you a tiny part of the text in each cell. I'm not against spreadsheets per se -- they're great for calculating tables of interconnected numbers -- but they're a terrible way to read tabular data.
Over the years, lots of open-source programs like word2x and catdoc have sprung up to read the text in MS Word .doc files. Surely by now there must be something like that for XLS files?
Well, I didn't find any ready-made programs, but I found something better: Python's xlrd module, as well as a nice clear example at ScienceOSS, "Read Excel files from Python".
Following that example, in six lines I had a simple program to print the spreadsheet's contents:
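    import sys
    import xlrd

    # Print every row of every sheet in each file named on the command line
    for filename in sys.argv[1:]:
        wb = xlrd.open_workbook(filename)
        for sheetname in wb.sheet_names():
            sh = wb.sheet_by_name(sheetname)
            for rownum in range(sh.nrows):
                # row_values() returns the row's cells as a Python list
                print(sh.row_values(rownum))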
Of course, having gotten that far, I wanted better formatting so I could compare the values in the spreadsheet. Didn't take long to write, and the whole thing still came out under 40 lines: xlsrd. And I was able to read that XLS file that was mailed to the club, easily and without hassle.
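For anyone curious what "better formatting" might look like, here's a rough sketch of one way to line the columns up. To be clear, this isn't the xlsrd script itself, just an illustration: the names format_rows and print_workbook are made up for this example, and the choice to show whole-number floats without the trailing ".0" (xlrd hands back numeric cells as floats) is my own assumption about what reads nicely.

```python
def format_rows(rows):
    """Return the rows as lines of text with left-aligned, padded columns.

    rows is a list of lists, such as the lists xlrd's row_values() returns.
    """
    def cell_text(value):
        # Numeric cells arrive as floats; show 25.0 as "25" for readability.
        if isinstance(value, float) and value.is_integer():
            return str(int(value))
        return str(value)

    text_rows = [[cell_text(v) for v in row] for row in rows]

    # Find the widest entry in each column.
    ncols = max((len(row) for row in text_rows), default=0)
    widths = [0] * ncols
    for row in text_rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))

    # Pad each cell to its column width and join with two spaces.
    lines = []
    for row in text_rows:
        padded = [cell.ljust(widths[i]) for i, cell in enumerate(row)]
        lines.append("  ".join(padded).rstrip())
    return lines


def print_workbook(filename):
    """Print every sheet of an XLS file as aligned columns (hypothetical helper)."""
    import xlrd  # imported here so format_rows() can be used without xlrd installed

    wb = xlrd.open_workbook(filename)
    for sheetname in wb.sheet_names():
        sh = wb.sheet_by_name(sheetname)
        print("==== " + sheetname)
        rows = [sh.row_values(rownum) for rownum in range(sh.nrows)]
        for line in format_rows(rows):
            print(line)
```

Calling print_workbook() on each filename in sys.argv[1:] gives the same loop as the short program above, only with the columns lined up so values are easier to compare by eye.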
I'm forever amazed at all the wonderful, easy-to-use modules there are for Python.
[ 10:58 Aug 31, 2011 ]