Pandas - Conditional Drop Duplicates
I have a DataFrame (pandas 0.19.2, Python 3.6.x), shown below. I want to drop rows that share the same Id, using drop_duplicates() with conditional logic. import pandas as pd import numpy as np np.rand
Solution 1:
Use GroupBy.transform to get aggregated values with the same size as the original DataFrame, then sort_values and drop_duplicates to remove the duplicates:
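# Replace each Size with the per-Id sum; transform keeps the original row count.
df['Size'] = df.groupby('Id')['Size'].transform('sum')
# Keep the row with the highest Age per Id, then restore the original order.
df = df.sort_values('Age').drop_duplicates('Id', keep='last').sort_index()
print(df)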
   Id Name      Size  Age
1   2    B  0.812663   25
3   4    D  0.302333   31
4   3    E  0.146870   43
6   6    G  0.186260   44
7   7    H  0.345561   20
8   1    I  0.813789   51
9   8    K  0.538817   31
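The question's original data is cut off, so the approach is easier to check on a small self-contained example. The sketch below uses a made-up DataFrame (the values are assumptions for illustration, not the asker's data) and assumes the intended rule is "sum Size per Id, keep the row with the highest Age":

```python
import pandas as pd

# Hypothetical sample data; the DataFrame from the question is not shown in full.
df = pd.DataFrame({
    "Id": [1, 2, 1, 3, 2],
    "Name": ["A", "B", "C", "D", "E"],
    "Size": [0.5, 0.25, 0.25, 1.0, 0.5],
    "Age": [30, 25, 51, 40, 20],
})

# Replace each Size with the total for its Id; transform returns one value
# per original row, so the result can be assigned straight back.
df["Size"] = df.groupby("Id")["Size"].transform("sum")

# Sort by Age so the oldest row of each Id comes last, keep that row,
# then sort_index to restore the original row order.
result = df.sort_values("Age").drop_duplicates("Id", keep="last").sort_index()
print(result)
```

Here rows 1, 2 and 3 survive: Id 2 keeps "B" (Age 25 beats 20), Id 1 keeps "C" (Age 51 beats 30), and Id 3 has a single row. If two rows of the same Id share the same Age, which one is kept depends on the sort order, so a tie-breaking column may be needed in that case.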