
How To Format Output Using Pandas And Data Frames

I have a huge file with 500k records (input) that follow a similar pattern.

input.txt:

p  m   a    b
1  0  10  100
1  1  11  111
1  2  12  122
2  0  20  200
2  1  21  211
2  2  22  222

Can you please let me know, how
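The solutions assume the data is already in a DataFrame named `df`. A minimal sketch of one way to load it, assuming the file is whitespace-delimited with a header row (the question does not state the delimiter); the sample rows are inlined through `io.StringIO` so the snippet runs without the real file:

```python
import io

import pandas as pd

# Sample rows copied from the question. For the real data, pass the
# path "input.txt" to read_csv instead of the StringIO object.
sample = """p m a b
1 0 10 100
1 1 11 111
1 2 12 122
2 0 20 200
2 1 21 211
2 2 22 222
"""

# Assumption: columns are separated by runs of whitespace.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")
print(df)
```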

Solution 1:

df2 = df.pivot(index=['p'], columns='m', values=['a', 'b']).stack(0).reset_index()
df2 = df2.rename(columns={'level_1': 'r'})
df2 = df2.sort_values(['r', 'p']).reset_index(drop=True)

m  p  r    0    1    2
0  1  a   10   11   12
1  2  a   20   21   22
2  1  b  100  111  122
3  2  b  200  211  222

Solution 2:

You can use .stack() and .unstack(), as follows:

(df.set_index(['p', 'm'])
   .stack()
   .unstack(level=1)
   .reset_index(level=0)
   .rename_axis(index='r', columns=None)
   .reset_index()
   .sort_values('r', ignore_index=True)
)

or use .pivot(), as follows:

(df.pivot(index='p', columns='m', values=['a', 'b'])
   .stack(level=0)
   .reset_index(level=0)
   .rename_axis(index='r', columns=None)
   .reset_index()
   .sort_values('r', ignore_index=True)
)

Result:

   r  p    0    1    2
0  a  1   10   11   12
1  a  2   20   21   22
2  b  1  100  111  122
3  b  2  200  211  222
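To try both variants end to end, here is a self-contained sketch on the six sample rows. The function names `reshape_stack` and `reshape_pivot` are made up for illustration. One deliberate difference from the snippets above: the sort uses both `r` and `p`, because sorting on `r` alone with the default algorithm is not guaranteed to keep ties in their original order.

```python
import pandas as pd

# Sample data from the question, built inline so the sketch is runnable.
df = pd.DataFrame({
    "p": [1, 1, 1, 2, 2, 2],
    "m": [0, 1, 2, 0, 1, 2],
    "a": [10, 11, 12, 20, 21, 22],
    "b": [100, 111, 122, 200, 211, 222],
})


def reshape_stack(frame):
    """Hypothetical helper: the .stack()/.unstack() variant."""
    return (frame.set_index(["p", "m"])
                 .stack()                  # a/b become an inner index level
                 .unstack(level=1)         # m values become the columns
                 .reset_index(level=0)     # move p back into a column
                 .rename_axis(index="r", columns=None)
                 .reset_index()
                 .sort_values(["r", "p"], ignore_index=True))


def reshape_pivot(frame):
    """Hypothetical helper: the .pivot() variant."""
    return (frame.pivot(index="p", columns="m", values=["a", "b"])
                 .stack(level=0)           # a/b move from columns to the index
                 .reset_index(level=0)
                 .rename_axis(index="r", columns=None)
                 .reset_index()
                 .sort_values(["r", "p"], ignore_index=True))


via_stack = reshape_stack(df)
via_pivot = reshape_pivot(df)
print(via_stack)
print(via_pivot)
```

Both calls should print the same table as the result above. For the full 500k-row file, the same functions can be applied to the DataFrame loaded from input.txt.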
