Pandas: How To Designate Starting Row To Extract Data

I am using the Pandas library with Python. I have an Excel file with some heading information at the top of the sheet that I do not need for data extraction. But, the heading

Solution 1:

You could locate the header line manually and then pass its position to read_csv via the skiprows keyword argument.

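with open('data.csv') as fp:
    # enumerate yields (line_number, line) pairs; keep the number of the
    # first line that starts with the header's first column name, 'ID'
    skip = next(filter(
        lambda x: x[1].startswith('ID'),
        enumerate(fp)
    ))[0]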

Then skip the rows:

import pandas
df = pandas.read_csv('data.csv', skiprows=skip)

This way you can handle pre-header sections of arbitrary length.
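The same search can be wrapped in a small helper that fails with a clear message when no header line is found. This is only a sketch: the function name find_header_row, its marker parameter, and the sample file contents are hypothetical, and it assumes the header line begins with 'ID' as in the snippet above.

```python
import os
import tempfile


def find_header_row(path, marker="ID"):
    """Return the 0-based index of the first line that starts with `marker`."""
    with open(path) as fp:
        for line_number, line in enumerate(fp):
            if line.startswith(marker):
                return line_number
    # Unlike a bare next(), this reports what was being searched for.
    raise ValueError(f"no line starting with {marker!r} in {path}")


# Demonstration on a throwaway file with three made-up preamble lines.
sample = "Quarterly report\nGenerated by export tool\nConfidential\nID,Name\n1,Ann\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write(sample)

skip = find_header_row(tmp.name)
os.remove(tmp.name)
print(skip)  # 3
```

The returned value can then be passed straight to read_csv as skiprows.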


For Python 2:

import itertools as it

with open('data.csv') as fp:
    skip = next(it.ifilter(
        lambda x: x[1].startswith('ID'),
        enumerate(fp)
    ))[0]

Solution 2:

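If the number of leading lines is fixed and known in advance, you can pass it to pd.read_csv directly, for example skiprows=4 when the header sits on the fifth line: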

import pandas as pd
df = pd.read_csv('test.csv', skiprows=4)
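To see the effect of a fixed skiprows value without needing a file on disk, here is a minimal sketch that reads from an in-memory string. The sample data is made up for illustration, and it assumes exactly four preamble lines before the header.

```python
import io

import pandas as pd

# Four lines of preamble (made-up sample data), then the real header row.
raw = (
    "Sales export\n"
    "Region: North\n"
    "Period: Q1\n"
    "Prepared by: finance\n"
    "ID,Name,Score\n"
    "1,Ann,90\n"
    "2,Bob,85\n"
)

# skiprows=4 drops the first four lines; the next line becomes the header.
df = pd.read_csv(io.StringIO(raw), skiprows=4)
print(df.columns.tolist())  # ['ID', 'Name', 'Score']
print(len(df))              # 2
```

Since the question concerns an Excel sheet, note that pd.read_excel accepts a skiprows argument as well.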
