Apply Expanding Function On Dataframe

August 15, 2022

I have a function that I wish to apply to subsets of a pandas DataFrame, so that the function is calculated on all rows (up to and including the current row) from the same group - i.e. using a gro

Solution 1:

One possible solution is to move the expanding step inside the function and use GroupBy.apply:

def foo1(_df):
    return _df['x1'].expanding().max() * _df['x2'].expanding().apply(lambda x: x[-1], raw=True)

df['foo_result'] = df.groupby('group').apply(foo1).reset_index(level=0, drop=True)
print(df)
  group  time   x1  x2  foo_result
0     A     1   10   1        10.0
3     B     1  100   2       200.0
1     A     2   40   2        80.0
4     B     2  200   0         0.0
2     A     3   30   1        40.0
5     B     3  300   3       900.0

This is not a direct solution to the problem of applying a dataframe function to an expanding dataframe, but it achieves the same functionality.
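For readers who want to reproduce the result, here is a minimal self-contained sketch. The DataFrame is reconstructed from the printed output above, so the input values and index order are an assumption about the original data. To avoid depending on how a particular pandas version indexes the result of GroupBy.apply, the sketch applies foo1 to each group in a plain loop and concatenates the pieces. The `shortcut` line is an extra assumption of mine, not part of the answer: for this particular function, the last x2 value of each expanding window is just the current row's x2, so a grouped cummax should give the same numbers.

```python
import pandas as pd

# Input reconstructed from the printed output above (an assumption).
df = pd.DataFrame(
    {
        'group': ['A', 'B', 'A', 'B', 'A', 'B'],
        'time': [1, 1, 2, 2, 3, 3],
        'x1': [10, 100, 40, 200, 30, 300],
        'x2': [1, 2, 2, 0, 1, 3],
    },
    index=[0, 3, 1, 4, 2, 5],
)

def foo1(_df):
    # Running max of x1 times the last x2 seen in each expanding window.
    running_max = _df['x1'].expanding().max()
    last_x2 = _df['x2'].expanding().apply(lambda x: x[-1], raw=True)
    return running_max * last_x2

# Apply per group, then stitch the pieces together.
# Column assignment aligns on the (unique) index, so row order is preserved.
pieces = [foo1(g) for _, g in df.groupby('group')]
df['foo_result'] = pd.concat(pieces)

# Assumed shortcut for this specific function only.
shortcut = df.groupby('group')['x1'].cummax() * df['x2']

print(df)
```

The loop-and-concat form is slower than a vectorized call on large data, but it makes the alignment explicit and is easy to check by hand.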

Solution 2:

Applying a dataframe function on an expanding window is apparently not possible (at least not in pandas version 0.23.0), as one can see by putting a print statement into the function.

Running df.groupby('group').expanding().apply(lambda x: bool(print(x)) , raw=False) on the given DataFrame (where the bool around the print is just to get a valid return value) prints:

0    1.0
dtype: float64
0    1.0
1    2.0
dtype: float64
0    1.0
1    2.0
2    3.0
dtype: float64
0    10.0
dtype: float64
0    10.0
1    40.0
dtype: float64
0    10.0
1    40.0
2    30.0
dtype: float64

(and so on - and also returns a dataframe with '0.0' in each cell, of course).

This shows that the expanding window works on a column-by-column basis (we see that first the expanding time series is printed, then x1, and so on), and does not really work on a dataframe - so a dataframe function can't be applied to it.
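The same observation can be checked without reading printed output. The sketch below is an illustration on a small made-up two-column frame (not the original data), assuming a reasonably recent pandas: it records the type and contents of every object that expanding().apply hands to the function. The names `record` and `seen` are hypothetical helpers introduced only for this example.

```python
import pandas as pd

# Small illustrative frame (made up for this example).
df = pd.DataFrame({'time': [1, 2, 3], 'x1': [10, 40, 30]})

seen = []  # (type name, window values) for every call

def record(window):
    # With raw=False each window arrives as a pandas Series.
    seen.append((type(window).__name__, window.tolist()))
    return 0.0  # expanding().apply needs a numeric return value

result = df.expanding().apply(record, raw=False)

for type_name, values in seen:
    print(type_name, values)
```

Every recorded object is a Series holding values from a single column, and no call ever receives a DataFrame with both columns.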

So, to get the desired functionality, one would have to put the expanding step inside the dataframe function, as in the accepted answer.
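If the function really needs several columns of each growing window at once and cannot be decomposed column by column, one fallback is to build the windows by hand with iloc. The sketch below assumes the same reconstructed data as the printed output in Solution 1. `expanding_frame_apply` and `foo` are hypothetical names, not pandas API. It calls the function once per row, so it is only practical for modest group sizes.

```python
import pandas as pd

def expanding_frame_apply(frame, func):
    # Call func on frame.iloc[:1], frame.iloc[:2], ... and keep one value per row.
    values = [func(frame.iloc[:i]) for i in range(1, len(frame) + 1)]
    return pd.Series(values, index=frame.index)

def foo(window):
    # Hypothetical DataFrame-level function: sees all columns of the window.
    return window['x1'].max() * window['x2'].iloc[-1]

# Input reconstructed from the printed output in Solution 1 (an assumption).
df = pd.DataFrame(
    {
        'group': ['A', 'B', 'A', 'B', 'A', 'B'],
        'time': [1, 1, 2, 2, 3, 3],
        'x1': [10, 100, 40, 200, 30, 300],
        'x2': [1, 2, 2, 0, 1, 3],
    },
    index=[0, 3, 1, 4, 2, 5],
)

# One expanding pass per group; assignment aligns on the index.
pieces = [expanding_frame_apply(g, foo) for _, g in df.groupby('group')]
df['foo_result'] = pd.concat(pieces)

print(df)
```

This trades speed for generality: the function receives a real DataFrame slice, at the cost of one Python-level call per row.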
