Apply Expanding Function On Dataframe
Solution 1:
A possible solution is to make the expanding call part of the function, and use GroupBy.apply:
def foo1(_df):
    return _df['x1'].expanding().max() * _df['x2'].expanding().apply(lambda x: x[-1], raw=True)

df['foo_result'] = df.groupby('group').apply(foo1).reset_index(level=0, drop=True)
print(df)
  group  time   x1  x2  foo_result
0     A     1   10   1        10.0
3     B     1  100   2       200.0
1     A     2   40   2        80.0
4     B     2  200   0         0.0
2     A     3   30   1        40.0
5     B     3  300   3       900.0
This is not a direct solution to the problem of applying a dataframe function to an expanding
dataframe, but it achieves the same functionality.
Solution 2:
Applying a dataframe function on an expanding window is apparently not possible (at least not in pandas version 0.23.0), as one can see by plugging a print statement into the function.
Running df.groupby('group').expanding().apply(lambda x: bool(print(x)), raw=False) on the given DataFrame (where the bool around the print is just to get a valid return value) prints:
0 1.0
dtype: float64
0 1.0
1 2.0
dtype: float64
0 1.0
1 2.0
2 3.0
dtype: float64
0 10.0
dtype: float64
0 10.0
1 40.0
dtype: float64
0 10.0
1 40.0
2 30.0
dtype: float64
(and so on; the call also returns a dataframe with '0.0' in each cell, of course, because bool(None) is False).
This shows that the expanding window works on a column-by-column basis (we see that first the expanding time series is printed, then x1, and so on), and does not really work on a dataframe, so a dataframe function can't be applied to it.
So, to get the desired functionality, one would have to put the expanding call inside the dataframe function, as in the accepted answer.
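If a function really does need to see whole dataframe windows, one possible workaround is to build the expanding windows by hand with positional slicing. The sketch below is only an illustration under assumptions: the sample data is reconstructed from the output in Solution 1, and foo and expanding_apply are hypothetical helper names, not pandas APIs. It is a plain Python loop, so it will likely be slower than the built-in expanding methods on large groups.

```python
import pandas as pd

# Sample data reconstructed from the output in Solution 1 (an assumption).
df = pd.DataFrame({
    'group': ['A', 'A', 'A', 'B', 'B', 'B'],
    'time':  [1, 2, 3, 1, 2, 3],
    'x1':    [10, 40, 30, 100, 200, 300],
    'x2':    [1, 2, 1, 2, 0, 3],
})

def foo(window):
    # Hypothetical dataframe function: it sees every column of the window.
    return window['x1'].max() * window['x2'].iloc[-1]

def expanding_apply(frame, func):
    # Call func on frame.iloc[:1], frame.iloc[:2], ... and keep the row labels.
    # Assumes the rows of frame are already in the desired (time) order.
    values = [func(frame.iloc[:i]) for i in range(1, len(frame) + 1)]
    return pd.Series(values, index=frame.index)

# Build the expanding results group by group, then align back by index label.
pieces = [expanding_apply(g, foo) for _, g in df.groupby('group')]
df['foo_result'] = pd.concat(pieces)

print(df)
```

Because each piece keeps its group's original index, the final assignment aligns rows by label rather than by position.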