
How To Avoid An Empty Result With `bag.take(n)` When Using Dask?

Context: The Dask documentation states that `Bag.take()` collects only from the first partition by default. However, after a filter the first partition can be empty, in which case `take(n)` returns an empty result.

Solution 1:

You could do something like the following:

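from toolz import concat, take
n = 5  # number of elements to take (example value)
per_partition = lambda seq: list(take(n, seq))  # first n items of one partition
combine = lambda parts: list(take(n, concat(parts)))  # flatten the partial lists, then keep n
b.reduction(per_partition, combine).compute()  # b is your (filtered) bag; reduction is lazy until computed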

This grabs the first n elements of each partition, collects them all together, and then takes the first n elements of the result.
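
To make the two stages concrete, here is a minimal plain-Python sketch of the same idea. It does not call Dask: ordinary lists stand in for partitions, and the function names and sample data are made up for illustration.

```python
from itertools import chain, islice


def take_per_partition(partition, n):
    """Keep at most the first n items of a single partition."""
    return list(islice(partition, n))


def take_across_partitions(partial_results, n):
    """Flatten the per-partition results in order and keep the first n items."""
    return list(islice(chain.from_iterable(partial_results), n))


def take_from_all(partitions, n):
    """Two-stage take: trim every partition, then trim the combined result."""
    partials = [take_per_partition(p, n) for p in partitions]
    return take_across_partitions(partials, n)


# Hypothetical data: the first partition is empty, as can happen after a filter.
partitions = [[], [10, 11, 12], [20, 21]]
first_four = take_from_all(partitions, 4)
print(first_four)  # [10, 11, 12, 20]
```

Trimming each partition to n items first keeps the amount of data that has to be gathered small, since no partition ever contributes more than n elements to the final step.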
