How To Format Output Using Pandas And Data Frames
I have a huge file with 500k records (input) that follow a similar pattern.
input.txt:
p  m   a    b
1  0  10  100
1  1  11  111
1  2  12  122
2  0  20  200
2  1  21  211
2  2  22  222
Can you please let me know, how
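For reference, the sample above can be loaded into a DataFrame named df, which is the name the solutions below use. This is only a sketch: it assumes the file is whitespace-separated and that its first line holds the column names, and it reads from an in-memory string standing in for input.txt.

```python
import io

import pandas as pd

# Stand-in for input.txt; with the real file, pass its path to read_csv instead.
sample = """\
p m a b
1 0 10 100
1 1 11 111
1 2 12 122
2 0 20 200
2 1 21 211
2 2 22 222
"""

# sep=r"\s+" splits on any run of whitespace; the first line supplies the column names.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")
print(df)
```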
Solution 1:
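# Spread the values of m into columns, then move the a/b level back into the rows.
df2 = df.pivot(index=['p'], columns='m', values=['a', 'b']).stack(0).reset_index()
# The stacked level has no name, so reset_index() labels it 'level_1'; rename it to 'r'.
df2 = df2.rename(columns={'level_1': 'r'})
# Order the rows by r, then p, and renumber the index from 0.
df2 = df2.sort_values(['r', 'p']).reset_index(drop=True)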
m  p  r    0    1    2
0  1  a   10   11   12
1  2  a   20   21   22
2  1  b  100  111  122
3  2  b  200  211  222
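To try Solution 1 end to end, here is a minimal sketch that runs the same three steps on the six sample rows. Building the frame in memory is an assumption made for illustration; the real data would come from input.txt.

```python
import pandas as pd

# The six sample rows from the question, built in memory for illustration.
df = pd.DataFrame({
    "p": [1, 1, 1, 2, 2, 2],
    "m": [0, 1, 2, 0, 1, 2],
    "a": [10, 11, 12, 20, 21, 22],
    "b": [100, 111, 122, 200, 211, 222],
})

# Pivot m into columns, then stack the a/b column level back into the rows.
df2 = df.pivot(index=["p"], columns="m", values=["a", "b"]).stack(0).reset_index()
# The unnamed stacked level comes out of reset_index() as 'level_1'.
df2 = df2.rename(columns={"level_1": "r"})
# Sort by r, then p, and renumber the rows.
df2 = df2.sort_values(["r", "p"]).reset_index(drop=True)

print(df2)
```

Each (p, r) pair ends up as one row, with one column per value of m.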
Solution 2:
You can use .stack() and .unstack(), as follows:
(df.set_index(['p', 'm'])
   .stack()
   .unstack(level=1)
   .reset_index(level=0)
   .rename_axis(index='r', columns=None)
   .reset_index()
   .sort_values('r', ignore_index=True)
)
Or use .pivot(), as follows:
(df.pivot(index='p', columns='m', values=['a', 'b'])
   .stack(level=0)
   .reset_index(level=0)
   .rename_axis(index='r', columns=None)
   .reset_index()
   .sort_values('r', ignore_index=True)
)
Result:
   r  p    0    1    2
0  a  1   10   11   12
1  a  2   20   21   22
2  b  1  100  111  122
3  b  2  200  211  222
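As a quick check that the two chains in Solution 2 agree, the sketch below runs both on the six sample rows (built in memory here, which is an assumption for illustration). It differs from the answer in one respect: it sorts on both 'r' and 'p'. The default sort algorithm is not guaranteed to be stable, so sorting on 'r' alone may leave rows with the same r in an unpredictable order on larger inputs.

```python
import pandas as pd

# The six sample rows from the question, built in memory for illustration.
df = pd.DataFrame({
    "p": [1, 1, 1, 2, 2, 2],
    "m": [0, 1, 2, 0, 1, 2],
    "a": [10, 11, 12, 20, 21, 22],
    "b": [100, 111, 122, 200, 211, 222],
})

# Variant 1: stack a/b into the rows, then unstack m into the columns.
via_stack = (
    df.set_index(["p", "m"])
      .stack()
      .unstack(level=1)
      .reset_index(level=0)
      .rename_axis(index="r", columns=None)
      .reset_index()
      .sort_values(["r", "p"], ignore_index=True)  # 'p' added as a tie-breaker
)

# Variant 2: pivot m into the columns, then stack the a/b level into the rows.
via_pivot = (
    df.pivot(index="p", columns="m", values=["a", "b"])
      .stack(level=0)
      .reset_index(level=0)
      .rename_axis(index="r", columns=None)
      .reset_index()
      .sort_values(["r", "p"], ignore_index=True)  # 'p' added as a tie-breaker
)

print(via_stack)
print(via_stack.equals(via_pivot))
```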