Fill An Empty Dataframe With Common Index Values From Another Dataframe
Solution 1:
I don't think you need a second dataframe. If you call resample without a fill_method, it will store NaNs for the missing periods:
df.resample("s").max()
Out[62]:
                      c1   c2
time
2013-01-01 00:00:01  5.0  3.0
2013-01-01 00:00:02  NaN  NaN
2013-01-01 00:00:03  7.0  2.0
2013-01-01 00:00:04  1.0  5.0
2013-01-01 00:00:05  4.0  3.0
2013-01-01 00:00:06  5.0  6.0
2013-01-01 00:00:07  NaN  NaN
2013-01-01 00:00:08  NaN  NaN
2013-01-01 00:00:09  4.0  2.0
2013-01-01 00:00:10  7.0  8.0
max() here is just an arbitrary aggregation, used so that resample returns a dataframe. You can replace it with mean, min, etc., assuming you have no duplicates. If you have duplicates, they will be aggregated by that function.
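The behaviour described above can be tried with a small self-contained sketch. The sample data is made up for illustration (it loosely mirrors the output shown, with one timestamp deliberately duplicated), and the names times, df, and filled are assumptions rather than anything from the original question.

```python
import pandas as pd

# Hypothetical sample data: seconds 2, 7 and 8 are missing,
# and second 4 appears twice to show how duplicates are aggregated.
times = pd.to_datetime([
    "2013-01-01 00:00:01",
    "2013-01-01 00:00:03",
    "2013-01-01 00:00:04",
    "2013-01-01 00:00:04",
    "2013-01-01 00:00:05",
    "2013-01-01 00:00:06",
    "2013-01-01 00:00:09",
    "2013-01-01 00:00:10",
])
df = pd.DataFrame(
    {"c1": [5, 7, 1, 0, 4, 5, 4, 7], "c2": [3, 2, 5, 9, 3, 6, 2, 8]},
    index=pd.DatetimeIndex(times, name="time"),
)

# One row per second; seconds with no observations become NaN,
# and the two rows at second 4 collapse into their column-wise max.
filled = df.resample("s").max()
print(filled)
```

Swapping max() for mean() or min() changes only how the duplicated second is combined; the empty seconds stay NaN either way.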
As Paul H suggested in the comments, you can use df.resample("s").asfreq() without any aggregation. It skips the unnecessary aggregation step, so it is probably more efficient. It will raise an error if you have duplicate values in the index.
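A short sketch of the asfreq() variant, including the duplicate-index failure mentioned above. The frames unique_df and dup_df are hypothetical, and catching ValueError reflects how current pandas versions report a reindex on a duplicated axis; treat the exact exception type as an assumption if you are on a very different version.

```python
import pandas as pd

idx = pd.to_datetime([
    "2013-01-01 00:00:01",
    "2013-01-01 00:00:03",
    "2013-01-01 00:00:04",
])
unique_df = pd.DataFrame({"c1": [5, 7, 1]}, index=idx)

# No aggregation: existing rows are kept, missing seconds become NaN.
upsampled = unique_df.resample("s").asfreq()
print(upsampled)

# With a repeated timestamp there is no single row to pick,
# so the call is expected to fail instead of guessing.
dup_df = pd.DataFrame({"c1": [5, 7, 1]}, index=idx[[0, 0, 2]])
try:
    dup_df.resample("s").asfreq()
    raised = False
except ValueError as exc:
    raised = True
    print("asfreq failed:", exc)
```

If duplicates are possible in your data, the aggregating form (max, mean, and so on) is the safer choice.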
Solution 2:
You need to reindex the dataframe.
import pandas
df = pandas.read_table(filename, **options)
N = 86400 * 31  # seconds in a 31-day month
dates = pandas.date_range(df.index[0], periods=N-1, freq='1s')
df = df.reindex(dates)
Here's a reproducible demonstration:
df = pandas.DataFrame(
    data={'A': range(0, 10), 'B': range(0, 20, 2)},
    index=pandas.date_range('2012-01-01', freq='2s', periods=10)
).reindex(pandas.date_range('2012-01-01', freq='1s', periods=25))
print(df)
                       A     B
2012-01-01 00:00:00  0.0   0.0
2012-01-01 00:00:01  NaN   NaN
2012-01-01 00:00:02  1.0   2.0
2012-01-01 00:00:03  NaN   NaN
2012-01-01 00:00:04  2.0   4.0
2012-01-01 00:00:05  NaN   NaN
2012-01-01 00:00:06  3.0   6.0
2012-01-01 00:00:07  NaN   NaN
2012-01-01 00:00:08  4.0   8.0
2012-01-01 00:00:09  NaN   NaN
2012-01-01 00:00:10  5.0  10.0
2012-01-01 00:00:11  NaN   NaN
2012-01-01 00:00:12  6.0  12.0
2012-01-01 00:00:13  NaN   NaN
2012-01-01 00:00:14  7.0  14.0
2012-01-01 00:00:15  NaN   NaN
2012-01-01 00:00:16  8.0  16.0
2012-01-01 00:00:17  NaN   NaN
2012-01-01 00:00:18  9.0  18.0
2012-01-01 00:00:19  NaN   NaN
2012-01-01 00:00:20  NaN   NaN
2012-01-01 00:00:21  NaN   NaN
2012-01-01 00:00:22  NaN   NaN
2012-01-01 00:00:23  NaN   NaN
2012-01-01 00:00:24  NaN   NaN
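If the target range should simply span the data rather than a fixed number of periods, one possible variant is to build the new index from the first and last timestamps. This is a sketch under that assumption; full_index and dense are illustrative names, and the input frame is the same made-up data as in the demonstration.

```python
import pandas as pd

# Hypothetical input: one reading every 2 seconds.
df = pd.DataFrame(
    {"A": range(0, 10), "B": range(0, 20, 2)},
    index=pd.date_range("2012-01-01", freq="2s", periods=10),
)

# Build the target index from the data itself: one entry per second
# from the first timestamp to the last, both inclusive.
full_index = pd.date_range(df.index.min(), df.index.max(), freq="1s")

# Rows present in df keep their values; the new in-between rows are NaN.
dense = df.reindex(full_index)
print(dense)
```

Unlike the fixed-length range, this version adds no trailing NaN rows past the last observation.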
Solution 3:
If you already set up the indexes in the "nan" data frame, I think you should be able to just use loc. Indexing is a really important thing to master when using Pandas. It will save you a whole lot of time, make your code a lot cleaner and can really improve your performance.
Careful though: for the trick below to work as is, every index label of the source frame has to exist in the "nan" frame, and the columns have to be the same.
>>> import pandas as pd
>>> import numpy as np
>>> df1 = pd.DataFrame(np.random.rand(10, 3), columns=['A', 'B', 'C'])
>>> df1
          A         B         C
0  0.171502  0.258416  0.118326
1  0.215456  0.462122  0.858173
2  0.373549  0.946400  0.579845
3  0.606289  0.289552  0.473658
4  0.885899  0.783747  0.089975
5  0.674208  0.639710  0.105642
6  0.404775  0.541389  0.268101
7  0.374609  0.693916  0.743575
8  0.074773  0.150072  0.135555
9  0.230431  0.202417  0.466538
>>> df2 = pd.DataFrame(np.nan, index=range(15), columns=['A', 'B', 'C'])
>>> df2
A B C
0 NaN NaN NaN
1 NaN NaN NaN
2 NaN NaN NaN
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN
6 NaN NaN NaN
7 NaN NaN NaN
8 NaN NaN NaN
9 NaN NaN NaN
10 NaN NaN NaN
11 NaN NaN NaN
12 NaN NaN NaN
13 NaN NaN NaN
14 NaN NaN NaN
>>> df2.loc[df1.index] = df1  # This is where the magic happens
>>> df2
           A         B         C
0   0.171502  0.258416  0.118326
1   0.215456  0.462122  0.858173
2   0.373549  0.946400  0.579845
3   0.606289  0.289552  0.473658
4   0.885899  0.783747  0.089975
5   0.674208  0.639710  0.105642
6   0.404775  0.541389  0.268101
7   0.374609  0.693916  0.743575
8   0.074773  0.150072  0.135555
9   0.230431  0.202417  0.466538
10       NaN       NaN       NaN
11       NaN       NaN       NaN
12       NaN       NaN       NaN
13       NaN       NaN       NaN
14       NaN       NaN       NaN
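The same steps as a plain script, for anyone who wants to run them outside the interpreter. The seeded generator rng and the comparison frame alt are additions for illustration only; alt shows that, in this simple case, reindexing df1 directly should give the same values without building the NaN frame first.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the run is repeatable
df1 = pd.DataFrame(rng.random((10, 3)), columns=["A", "B", "C"])
df2 = pd.DataFrame(np.nan, index=range(15), columns=["A", "B", "C"])

# Copy df1's rows into the matching labels of df2; rows 10..14 stay NaN.
df2.loc[df1.index] = df1

# Alternative without a pre-built NaN frame: reindex df1 directly.
alt = df1.reindex(range(15))

print(df2)
print(df2.equals(alt))
```

The loc assignment is the better fit when the NaN frame already exists and may hold other data; reindex is shorter when the only goal is to extend df1 to a larger index.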