Efficient Insertion Of Not Aligned Elements In A NumPy Array
Solution 1:
How about a nasty one-liner?
First, the data; the arrays have the same shape as yours, but I've used integers to make the example easier to read.
In [81]: A
Out[81]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
In [82]: B
Out[82]:
array([[ 0, 100, 200, 300, 400],
[ 500, 600, 700, 800, 900],
[1000, 1100, 1200, 1300, 1400],
[1500, 1600, 1700, 1800, 1900],
[2000, 2100, 2200, 2300, 2400]])
In [83]: C
Out[83]: array([1, 3, 2, 4, 0])
And here's the nasty one-liner:
In [84]: np.insert(A.ravel(), np.ravel_multi_index((range(A.shape[0]), C), A.shape) + 1, B[range(B.shape[0]), C]).reshape(A.shape[0], A.shape[1]+1)
Out[84]:
array([[ 0, 1, 100, 2, 3, 4],
[ 5, 6, 7, 8, 800, 9],
[ 10, 11, 12, 1200, 13, 14],
[ 15, 16, 17, 18, 19, 1900],
[ 20, 2000, 21, 22, 23, 24]])
Here's the broken-down version:
A.ravel() flattens A into a 1-d array, which I'll call F:
In [87]: F = A.ravel()
In [88]: F
Out[88]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24])
(EDIT: It turns out this first step, flattening A, is not necessary. As @hpaulj points out in his answer, np.insert will flatten the array by default.)
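A quick way to check that default, as a minimal sketch: the names idx and vals are only illustrative, and the data matches the integer example above.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
idx = np.array([2, 9, 13, 20, 21])             # flat insertion positions
vals = np.array([100, 800, 1200, 1900, 2000])  # values to insert

# With axis=None (the default), np.insert flattens its input first,
# so passing A directly should match passing A.ravel().
flat_explicit = np.insert(A.ravel(), idx, vals)
flat_implicit = np.insert(A, idx, vals)

print(np.array_equal(flat_explicit, flat_implicit))
print(flat_implicit.shape)
```

Either way the result is 1-d, so the final reshape is still needed.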
np.ravel_multi_index converts the desired 2-d positions into indices into the flattened array. The + 1 at the end is necessary because np.insert places each value before the given index, and you want the elements inserted after the position given in C:
In [89]: insert_indices = np.ravel_multi_index((range(A.shape[0]), C), A.shape) + 1
In [90]: insert_indices
Out[90]: array([ 2, 9, 13, 20, 21])
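For a C-ordered (row-major) array, the flat index is just row * number_of_columns + column. The sketch below computes it both ways to show what np.ravel_multi_index is doing here; the names rows, flat and by_hand are illustrative only.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
C = np.array([1, 3, 2, 4, 0])
rows = np.arange(A.shape[0])

# Flat index of element (i, C[i]) for each row i.
flat = np.ravel_multi_index((rows, C), A.shape)

# The same arithmetic written out by hand (row-major layout assumed).
by_hand = rows * A.shape[1] + C

# Shift by one so that np.insert puts the value after that element.
insert_indices = flat + 1

print(flat)            # [ 1  8 12 19 20]
print(insert_indices)  # [ 2  9 13 20 21]
```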
B[range(B.shape[0]), C] pulls the desired values out of B:
In [91]: values = B[range(B.shape[0]), C]
In [92]: values
Out[92]: array([ 100, 800, 1200, 1900, 2000])
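If the paired-index form looks opaque, np.take_along_axis (available in NumPy 1.15 and later) expresses the same "one element per row" selection. This is an alternative sketch, not part of the original answer; values_alt is an illustrative name.

```python
import numpy as np

B = np.arange(25).reshape(5, 5) * 100
C = np.array([1, 3, 2, 4, 0])

# Integer-array indexing: pairs row i with column C[i].
values = B[np.arange(B.shape[0]), C]

# Same selection with take_along_axis; the index array must have
# the same number of dimensions as B, hence C[:, None].
values_alt = np.take_along_axis(B, C[:, None], axis=1).ravel()

print(values)  # [ 100  800 1200 1900 2000]
```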
np.insert does the actual insertion and creates a new array:
In [93]: np.insert(F, insert_indices, values)
Out[93]:
array([ 0, 1, 100, 2, 3, 4, 5, 6, 7, 8, 800,
9, 10, 11, 12, 1200, 13, 14, 15, 16, 17, 18,
19, 1900, 20, 2000, 21, 22, 23, 24])
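Note that when np.insert receives several indices, each one refers to a position in the original array, not in the partially built result, which is why the indices above need no running offset. A tiny illustration on a made-up array:

```python
import numpy as np

F_small = np.arange(6)  # [0 1 2 3 4 5]

# 100 goes before original index 1, 400 before original index 4.
out = np.insert(F_small, [1, 4], [100, 400])

print(out)  # [  0 100   1   2   3 400   4   5]
```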
Now just reshape that to get the final result:
In [94]: np.insert(F, insert_indices, values).reshape(A.shape[0], A.shape[1]+1)
Out[94]:
array([[ 0, 1, 100, 2, 3, 4],
[ 5, 6, 7, 8, 800, 9],
[ 10, 11, 12, 1200, 13, 14],
[ 15, 16, 17, 18, 19, 1900],
[ 20, 2000, 21, 22, 23, 24]])
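Putting the steps together, here is one way to wrap them in a reusable function. The name insert_after_columns is hypothetical, and the sketch assumes A and B are 2-d arrays of the same shape and C holds one valid column index per row.

```python
import numpy as np

def insert_after_columns(A, B, C):
    """Return a new array: row i of A with B[i, C[i]] inserted after column C[i]."""
    A = np.asarray(A)
    B = np.asarray(B)
    C = np.asarray(C)
    rows = np.arange(A.shape[0])
    # Flat positions just after each (i, C[i]) element.
    insert_indices = np.ravel_multi_index((rows, C), A.shape) + 1
    # One value per row, taken from B.
    values = B[rows, C]
    # np.insert flattens A by default; reshape to one extra column.
    return np.insert(A, insert_indices, values).reshape(A.shape[0], A.shape[1] + 1)

A = np.arange(25).reshape(5, 5)
B = A * 100
C = np.array([1, 3, 2, 4, 0])
result = insert_after_columns(A, B, C)
print(result)
```

The inserted values are cast to the dtype of A, so mixing an integer A with a float B would truncate the inserted values.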
Solution 2:
First, some slightly more legible arrays:
>>> A
array([[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.],
[ 1., 1., 1., 1., 1.]])
>>> B
array([[-1., -1., -1., -1., -1.],
[-1., -1., -1., -1., -1.],
[-1., -1., -1., -1., -1.],
[-1., -1., -1., -1., -1.],
[-1., -1., -1., -1., -1.]])
>>> C
array([1, 3, 2, 4, 0])
Next, some mask shenanigans:
>>> ge_mask = C.reshape(-1, 1) >= numpy.arange(5)
>>> eq_mask = C.reshape(-1, 1) == numpy.arange(5)
>>> lt_mask = C.reshape(-1, 1) < numpy.arange(5)
And the coup de grâce:
>>> result = numpy.zeros((A.shape[0], A.shape[1] + 1))
>>> result[:,0:5][ge_mask] = A[ge_mask]
>>> result[:,1:6][eq_mask] = B[eq_mask]
>>> result[:,1:6][lt_mask] = A[lt_mask]
>>> result
array([[ 1., 1., -1., 1., 1., 1.],
[ 1., 1., 1., 1., -1., 1.],
[ 1., 1., 1., -1., 1., 1.],
[ 1., 1., 1., 1., 1., -1.],
[ 1., -1., 1., 1., 1., 1.]])
Warren's just-posted answer seems like it might be better from a memory perspective. Not sure about speed. (I do think the above is somewhat more legible!)
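The session above hard-codes the width 5. A sketch of the same mask approach with the sizes taken from A.shape follows; it uses the integer data from Solution 1 so that each element can be traced, and n_rows, n_cols and cols are illustrative names.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
B = A * 100
C = np.array([1, 3, 2, 4, 0])

n_rows, n_cols = A.shape
cols = np.arange(n_cols)

# Broadcasting a column vector against a row vector gives (n_rows, n_cols) masks.
ge_mask = C[:, None] >= cols   # columns up to and including C[i]
eq_mask = C[:, None] == cols   # exactly column C[i]
lt_mask = C[:, None] < cols    # columns to the right of C[i]

result = np.zeros((n_rows, n_cols + 1), dtype=A.dtype)
result[:, :n_cols][ge_mask] = A[ge_mask]   # left part stays in place
result[:, 1:][eq_mask] = B[eq_mask]        # inserted value, shifted right by one
result[:, 1:][lt_mask] = A[lt_mask]        # right part, shifted right by one
print(result)
```

The assignments work because result[:, :n_cols] and result[:, 1:] are views, so writing through them modifies result itself.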
Solution 3:
I believe this is the corrected iteration:
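import numpy as np

A = np.arange(25).reshape(5, 5)
B = np.arange(25).reshape(5, 5) * -1
C = np.array([1, 3, 2, 4, 0])
A2 = np.zeros((5, 6), dtype=int)
# insert B[i, c] after column c of row i, one row at a time
for i, c in enumerate(C):
    A2[i, :] = np.insert(A[i], c + 1, B[i, c])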
producing:
array([[ 0, 1, -1, 2, 3, 4],
[ 5, 6, 7, 8, -8, 9],
[ 10, 11, 12, -12, 13, 14],
[ 15, 16, 17, 18, 19, -19],
[ 20, -20, 21, 22, 23, 24]])
This can be turned into a one liner as:
np.array([np.insert(a, c+1, b[c]) for a,b,c in zip(A,B,C)])
The equivalent terms in Warren's answer are:
c <=> c = np.ravel_multi_index((range(5), C), (5, 5))
b <=> B.ravel()[c]
np.insert(A, c+1, B.ravel()[c]).reshape(5, 6)
np.insert ravels A by default. For this small example, the ravel_multi_index version is 2x faster than the row iteration.
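To reproduce that comparison on your own machine, something along these lines should do. The function names by_rows and by_flat_index are made up for this sketch, and the timings will vary with array size and NumPy version, so no numbers are claimed here.

```python
import timeit
import numpy as np

A = np.arange(25).reshape(5, 5)
B = np.arange(25).reshape(5, 5) * -1
C = np.array([1, 3, 2, 4, 0])

def by_rows(A, B, C):
    # One np.insert call per row.
    return np.array([np.insert(a, c + 1, b[c]) for a, b, c in zip(A, B, C)])

def by_flat_index(A, B, C):
    # A single np.insert call on the flattened array.
    c = np.ravel_multi_index((np.arange(A.shape[0]), C), A.shape)
    return np.insert(A, c + 1, B.ravel()[c]).reshape(A.shape[0], A.shape[1] + 1)

# Both approaches should produce the same array.
same = np.array_equal(by_rows(A, B, C), by_flat_index(A, B, C))

t_rows = timeit.timeit(lambda: by_rows(A, B, C), number=1000)
t_flat = timeit.timeit(lambda: by_flat_index(A, B, C), number=1000)
print(f"same result: {same}")
print(f"row iteration: {t_rows:.4f} s, flat index: {t_flat:.4f} s")
```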