Python: Iterate Over A Data Frame Column, Check For A Condition-value Stored In Array, And Get The Values To A List

After some help in the forum I managed to do what I was looking for and now I need to get to the next level. ( the long explanation is here: Python Data Frame: cumulative sum of co

Solution 1:

A quick way to do this is to leverage NumPy's broadcasting, extending this answer from the linked post, even though the question asked specifically about DF.where.

Broadcasting removes the need to loop over every element of the array, which also makes it highly efficient.
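
To make the broadcasting step concrete, here is a minimal sketch with made-up numbers (the `cumsum` and `conditions` arrays are illustrative assumptions, not the original data). Adding a trailing axis to the column lets NumPy compare every row against every condition in one step.

```python
import numpy as np

# Hypothetical cumulative sums (one per row) and condition values.
cumsum = np.array([10, 11, 15, 22, 23])
conditions = np.array([10, 15, 23])

# cumsum[:, None] has shape (5, 1); conditions has shape (3,).
# Broadcasting expands the comparison to shape (5, 3) without an explicit loop.
matches = cumsum[:, None] == conditions

print(matches.shape)  # (5, 3)
print(matches.astype(int))
```

Each column of `matches` corresponds to one condition, and each row to one row of the data.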

The only addition to that answer is np.argmax, which returns the index of the first True value along each column (scanning downward ↓).

import numpy as np

conditions = np.array([10, 15, 23])
tol = 0
num_albums = df.Num_Albums.values
num_albums_cumsum = df.Num_Albums.cumsum().values
# For each condition, index of the first row whose cumulative sum matches it
slices = np.argmax(np.isclose(num_albums_cumsum[:, None], conditions, atol=tol), axis=0)

Retrieved slices:

slices
Out[692]:
array([0, 2, 4], dtype=int64)

Corresponding array produced:

num_albums[slices]
Out[693]:
array([10,  4,  1], dtype=int64)
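
One caveat worth keeping in mind: np.argmax returns 0 for a column that contains no True at all, which looks the same as a match in row 0. The sketch below guards against that with `.any(axis=0)`. The DataFrame values are an assumption, chosen only to be consistent with the outputs shown above, and the extra condition 99 is added to show the no-match case. `rtol=0` is set so that only `atol` decides what counts as a match.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen so the cumulative sum hits 10, 15 and 23.
df = pd.DataFrame({"Num_Albums": [10, 1, 4, 7, 1]})
conditions = np.array([10, 15, 23, 99])  # 99 is never reached
tol = 0

num_albums = df["Num_Albums"].to_numpy()
cumsum = num_albums.cumsum()

# Boolean matrix of shape (rows, conditions).
hits = np.isclose(cumsum[:, None], conditions, rtol=0, atol=tol)

first_hit = hits.argmax(axis=0)  # row of the first True in each column
found = hits.any(axis=0)         # False where a column has no True at all

# Keep only the conditions that were actually met.
result = num_albums[first_hit[found]].tolist()
print(result)  # [10, 4, 1]
```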

If you still prefer DF.where, here is another solution using a list comprehension:

[df.where((df['cumsum'] >= cond - tol) & (df['cumsum'] <= cond + tol), -1)['Num_Albums']
   .max() for cond in conditions]
Out[695]:
[10, 4, 1]

Rows that do not satisfy the condition are replaced by -1. Filling with an integer instead of the default NaN keeps the column's integer dtype, so the result is not upcast to float.
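
The dtype point can be checked with a small sketch. The column values and the mask are assumptions for illustration, and it uses Series.where rather than the DataFrame call above to keep the example minimal.

```python
import pandas as pd

# Hypothetical integer column for illustration.
s = pd.Series([10, 1, 4, 7, 1], name="Num_Albums")
mask = s > 3

# The default fill value is NaN, which forces the result to float.
with_nan = s.where(mask)
# An integer fill value keeps the integer dtype.
with_fill = s.where(mask, -1)

print(with_nan.dtype)   # float64
print(with_fill.dtype)  # int64
```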

Solution 2:

The output won't always be a single number per condition, right? If each condition yields exactly one number, you can use this code:

tol = 0
# conditions
c = [5, 15, 25]
value = []

for i in c:
    # values of 'a' within tol of the current condition
    matches = df.where((df['a'] >= i - tol) & (df['a'] <= i + tol)).dropna()['a']
    if len(matches) > 0:
        value = value + [matches.values[0]]
    else:
        value = value + [[]]
print(value)

The output should look something like this:

[1,2,3]

If a condition can match several numbers and you want the output to look like this:

[[1.0, 5.0], [12.0, 15.0], [25.0]]

you can use this code:

tol = 5
c = [5,15,25]
value = []

for i in c:
    getdatas = df.where((df['a'] >= i - tol) & (df['a'] <= i + tol)).dropna()['a'].values
    value.append([x for x in getdatas])
print(value)
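
As an alternative sketch, plain boolean indexing with Series.between gives the same grouping without the where/dropna round trip. The column `a` below is an assumption, chosen to match the sample output above. Because no NaN is introduced, the values stay integers rather than becoming floats.

```python
import pandas as pd

# Hypothetical column, chosen to match the sample output above.
df = pd.DataFrame({"a": [1, 5, 12, 15, 25]})
tol = 5
conditions = [5, 15, 25]

# Series.between is inclusive on both ends by default.
value = [
    df.loc[df["a"].between(c - tol, c + tol), "a"].tolist()
    for c in conditions
]
print(value)  # [[1, 5], [12, 15], [25]]
```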
