
DictReader For Excel-Files

I have a file that I currently save to csv but it's originally an Excel-file (Excel 2010). Its content is of this sort: Name;Category;Address McFood;Fast Food;Street 1 BurgerEmpero
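For reference, here is a minimal sketch of the CSV-based reading that the question wants to reproduce for Excel files. It uses only the header and the one complete row quoted above, and it assumes the rows were originally separated by line breaks. The helper name `read_semicolon_csv` is hypothetical.

```python
import csv
import io

# Sample data: only the header and the single complete row quoted in the question.
# Assumption: rows were separated by newlines in the original file.
SAMPLE = "Name;Category;Address\nMcFood;Fast Food;Street 1\n"


def read_semicolon_csv(text):
    """Return the rows of semicolon-delimited CSV text as a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return list(reader)


rows = read_semicolon_csv(SAMPLE)
print(rows[0]["Category"])  # Fast Food
```

Each row comes back as a mapping keyed by the header line, which is the behaviour an Excel equivalent would need to match.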

Solution 1:

I created a gist for an openpyxl implementation

Here it is, repeated for convenience:

from openpyxl import load_workbook

def xlsx_dictreader(
    filename,
    sheet_index=0,
    header_row_index=1,
    data_start_row_index=2,
    data_only=True,
    post_process_funcs=None,
    null_vals=(None, 'None'),  # tuple instead of a list: avoids a mutable default argument
):
    """Yield one dict per non-empty data row, keyed by the header row."""
    book = load_workbook(filename, data_only=data_only)
    sheet = book.worksheets[sheet_index]
    # Empty header cells are dropped, so this assumes the header has no gaps between columns.
    header = [c for c in (cell.value for cell in sheet[header_row_index]) if c not in null_vals]
    if not post_process_funcs:
        # Default: leave every cell value unchanged.
        post_process_funcs = [lambda x: x] * len(header)
    elif len(post_process_funcs) != len(header):
        raise ValueError('post-processing functions do not line up with headers')
    # openpyxl rows and columns are 1-based and max_row is inclusive, hence the + 1.
    for row_idx in range(data_start_row_index, sheet.max_row + 1):
        candidate = {
            header[col_idx - 1]: post_process_funcs[col_idx - 1](sheet.cell(row=row_idx, column=col_idx).value)
            for col_idx in range(1, len(header) + 1)
        }
        # Skip rows in which every cell is empty.
        if not all(value in null_vals for value in candidate.values()):
            yield candidate

There is more detail available in the gist comments.
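As an alternative, here is a shorter sketch of the same idea built on openpyxl's `iter_rows(values_only=True)`. It is not the gist's implementation: it assumes the header is the first row of the sheet and it does no post-processing. The names `iter_xlsx_dicts` and `demo` are hypothetical, and the sample workbook contains only the complete row quoted in the question.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def iter_xlsx_dicts(filename, sheet_index=0):
    """Yield each data row of a sheet as a dict keyed by the first row."""
    book = load_workbook(filename, data_only=True)
    sheet = book.worksheets[sheet_index]
    rows = sheet.iter_rows(values_only=True)  # tuples of plain cell values
    header = next(rows, None)
    if header is None:
        return  # empty sheet: nothing to yield
    for row in rows:
        # Skip rows in which every cell is empty.
        if any(value is not None for value in row):
            yield dict(zip(header, row))


def demo():
    """Write a small sample workbook to a temporary file and read it back."""
    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["Name", "Category", "Address"])
    sheet.append(["McFood", "Fast Food", "Street 1"])
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "sample.xlsx")
        workbook.save(path)
        # Consume the generator while the temporary file still exists.
        return list(iter_xlsx_dicts(path))


if __name__ == "__main__":
    for record in demo():
        print(record)
```

Because `iter_rows` hands back whole rows, there is no per-cell index arithmetic and so no off-by-one risk. The trade-off is that it lacks the configurable header row and per-column post-processing of the function above.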

