
Efficient Insertion Of Not Aligned Elements In A Numpy Array

I'm using numpy 1.9 to work on a set of arrays. Suppose I have two 2-d arrays A and B and a 1-d array C that look like this: >>> A array([[ 1.

Solution 1:

How about a nasty one-liner?

First, the data; the arrays have the same shape as yours, but I've used integers to make the example easier to read.

In [81]: A
Out[81]: 
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [82]: B
Out[82]: 
array([[   0,  100,  200,  300,  400],
       [ 500,  600,  700,  800,  900],
       [1000, 1100, 1200, 1300, 1400],
       [1500, 1600, 1700, 1800, 1900],
       [2000, 2100, 2200, 2300, 2400]])

In [83]: C
Out[83]: array([1, 3, 2, 4, 0])

And here's the nasty one-liner:

In [84]: np.insert(A.ravel(), np.ravel_multi_index((range(A.shape[0]), C), A.shape) + 1, B[range(B.shape[0]), C]).reshape(A.shape[0], A.shape[1]+1)
Out[84]: 
array([[   0,    1,  100,    2,    3,    4],
       [   5,    6,    7,    8,  800,    9],
       [  10,   11,   12, 1200,   13,   14],
       [  15,   16,   17,   18,   19, 1900],
       [  20, 2000,   21,   22,   23,   24]])

Here's the broken-down version:

A.ravel() flattens A into a 1-d array, which I'll call F:

In [87]: F = A.ravel()

In [88]: F
Out[88]: 
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

(EDIT: It turns out this first step--flattening A--is not necessary. As @hpaulj points out in his answer, np.insert will flatten the array by default.)
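As a small illustration of that default flattening, here is a sketch on a toy 2x3 array (the array and the inserted values are made up for this example, not taken from the question). With no `axis` argument, `np.insert` treats its input as if it had been raveled first:

```python
import numpy as np

# Toy data, chosen only for illustration.
A_small = np.arange(6).reshape(2, 3)

# Explicitly flatten first, then insert before flat positions 1 and 4.
flat_first = np.insert(A_small.ravel(), [1, 4], [-1, -2])

# Let np.insert flatten implicitly (axis defaults to None).
implicit = np.insert(A_small, [1, 4], [-1, -2])

print(flat_first)
print(implicit)
```

Both calls should give the same 1-d result, which is why the explicit `A.ravel()` can be dropped.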

np.ravel_multi_index converts the desired 2-d (row, column) positions into indices into the flattened array. The + 1 at the end is necessary because you want to insert each element after the column index given in C:

In [89]: insert_indices = np.ravel_multi_index((range(A.shape[0]), C), A.shape) + 1

In [90]: insert_indices
Out[90]: array([ 2,  9, 13, 20, 21])
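To make the index arithmetic concrete, the following sketch checks that, for a C-ordered 5x5 array, `np.ravel_multi_index` agrees with the row-major formula `row * n_cols + col`. The variable names `rows`, `flat` and `manual` are only for this illustration:

```python
import numpy as np

shape = (5, 5)
C = np.array([1, 3, 2, 4, 0])
rows = np.arange(shape[0])

# Flat (row-major) index of element (i, C[i]) in each row i.
flat = np.ravel_multi_index((rows, C), shape)

# Same thing by hand: row * number_of_columns + column.
manual = rows * shape[1] + C

# Shift by one so np.insert places the new value *after* column C[i].
insert_indices = flat + 1

print(flat)            # [ 1  8 12 19 20]
print(insert_indices)  # [ 2  9 13 20 21]
```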

B[range(B.shape[0]), C] pulls the desired values out of B:

In [91]: values = B[range(B.shape[0]), C]

In [92]: values
Out[92]: array([ 100,  800, 1200, 1900, 2000])

np.insert does the actual insertion and creates a new array:

In [93]: np.insert(F, insert_indices, values)
Out[93]: 
array([   0,    1,  100,    2,    3,    4,    5,    6,    7,    8,  800,
          9,   10,   11,   12, 1200,   13,   14,   15,   16,   17,   18,
         19, 1900,   20, 2000,   21,   22,   23,   24])

Now just reshape that to get the final result:

In [94]: np.insert(F, insert_indices, values).reshape(A.shape[0], A.shape[1]+1)
Out[94]: 
array([[   0,    1,  100,    2,    3,    4],
       [   5,    6,    7,    8,  800,    9],
       [  10,   11,   12, 1200,   13,   14],
       [  15,   16,   17,   18,   19, 1900],
       [  20, 2000,   21,   22,   23,   24]])
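For reuse, the steps above could be wrapped in a small helper. This is only a sketch: the function name `insert_after_columns` is hypothetical, and it assumes A and B are 2-d arrays of the same shape and C holds one valid column index per row.

```python
import numpy as np

def insert_after_columns(A, B, C):
    """Insert B[i, C[i]] after column C[i] in each row i of A (hypothetical helper)."""
    A = np.asarray(A)
    B = np.asarray(B)
    C = np.asarray(C)
    n_rows, n_cols = A.shape
    rows = np.arange(n_rows)

    # Flat positions just past element (i, C[i]) of each row.
    flat_positions = np.ravel_multi_index((rows, C), A.shape) + 1

    # One value per row, taken from B at the same (row, column) position.
    values = B[rows, C]

    # np.insert flattens A by default; each row gains exactly one element,
    # so the result can be reshaped to (n_rows, n_cols + 1).
    return np.insert(A, flat_positions, values).reshape(n_rows, n_cols + 1)

A = np.arange(25).reshape(5, 5)
B = A * 100
C = np.array([1, 3, 2, 4, 0])
result = insert_after_columns(A, B, C)
print(result)
```

Note that `np.insert` casts the inserted values to the dtype of A, so with an integer A and a float B the fractional parts would be lost.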

Solution 2:

First, some slightly more legible arrays:

>>> A
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])
>>> B
array([[-1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1.],
       [-1., -1., -1., -1., -1.]])
>>> C
array([1, 3, 2, 4, 0])

Next, some mask shenanigans:

>>> ge_mask = C.reshape(-1, 1) >= numpy.arange(5)
>>> eq_mask = C.reshape(-1, 1) == numpy.arange(5)
>>> lt_mask = C.reshape(-1, 1) < numpy.arange(5)

And the coup de grâce:

>>> result = numpy.zeros((A.shape[0], A.shape[1] + 1))
>>> result[:,0:5][ge_mask] = A[ge_mask]
>>> result[:,1:6][eq_mask] = B[eq_mask]
>>> result[:,1:6][lt_mask] = A[lt_mask]
>>> result
array([[ 1.,  1., -1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1., -1.,  1.],
       [ 1.,  1.,  1., -1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1., -1.],
       [ 1., -1.,  1.,  1.,  1.,  1.]])

Warren's just-posted answer seems like it might be better from a memory perspective. Not sure about speed. (I do think the above is somewhat more legible!)
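The mask approach hard-codes the sizes 5 and 6. A shape-agnostic variant might look like the sketch below; the function name `insert_with_masks` is hypothetical, and using `np.result_type` to pick the output dtype is an assumption on my part rather than part of the original answer.

```python
import numpy as np

def insert_with_masks(A, B, C):
    """Mask-based insertion for any (n_rows, n_cols) shape (hypothetical helper)."""
    A = np.asarray(A)
    B = np.asarray(B)
    n_rows, n_cols = A.shape
    cols = np.arange(n_cols)
    c = np.asarray(C).reshape(-1, 1)   # column vector, broadcasts against cols

    ge_mask = c >= cols   # columns up to and including C[i]: keep their position
    eq_mask = c == cols   # column C[i] itself: source position of the B value
    lt_mask = c < cols    # columns to the right of C[i]: shift right by one

    result = np.zeros((n_rows, n_cols + 1), dtype=np.result_type(A, B))
    # The slices are views, so masked assignment writes into `result`.
    result[:, :-1][ge_mask] = A[ge_mask]
    result[:, 1:][eq_mask] = B[eq_mask]
    result[:, 1:][lt_mask] = A[lt_mask]
    return result

A = np.arange(25).reshape(5, 5)
B = A * 100
C = np.array([1, 3, 2, 4, 0])
masked_result = insert_with_masks(A, B, C)
print(masked_result)
```

Using integer test data here (as in Solution 1) makes it easier to see which element ended up where than the all-ones arrays do.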

Solution 3:

I believe this is the corrected iteration:

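import numpy as np

A = np.arange(25).reshape(5, 5)
B = np.arange(25).reshape(5, 5) * -1
C = np.array([1, 3, 2, 4, 0])

# The output has one extra column to hold the value inserted into each row.
A2 = np.zeros((5, 6), dtype=int)
for i, c in enumerate(C):
    # Insert B[i, c] right after column c of row i.
    A2[i, :] = np.insert(A[i], c + 1, B[i, c])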

producing:

array([[  0,   1,  -1,   2,   3,   4],
       [  5,   6,   7,   8,  -8,   9],
       [ 10,  11,  12, -12,  13,  14],
       [ 15,  16,  17,  18,  19, -19],
       [ 20, -20,  21,  22,  23,  24]])

This can be turned into a one liner as:

np.array([np.insert(a, c+1, b[c]) for a, b, c in zip(A, B, C)])

The equivalent terms in Warren's answer are:

c <=> c = np.ravel_multi_index((range(5), C), (5,5))
b <=> B.ravel()[c]
np.insert(A, c+1, B.ravel()[c]).reshape(5,6)

np.insert ravels A by default (that is, when no axis is given). For this small example, the ravel_multi_index version is about 2x faster than the row iteration.
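To check the equivalence and get a rough feel for the speed difference on your own machine, something like the following sketch could be used. The function names `by_rows` and `by_flat_index` are hypothetical, and the timings will vary with hardware, array size and NumPy version, so no particular ratio is assumed here.

```python
import timeit
import numpy as np

A = np.arange(25).reshape(5, 5)
B = -np.arange(25).reshape(5, 5)
C = np.array([1, 3, 2, 4, 0])

def by_rows(A, B, C):
    # One np.insert call per row, as in the list-comprehension one-liner.
    return np.array([np.insert(a, c + 1, b[c]) for a, b, c in zip(A, B, C)])

def by_flat_index(A, B, C):
    # A single np.insert call on the implicitly flattened array.
    c = np.ravel_multi_index((np.arange(A.shape[0]), C), A.shape)
    return np.insert(A, c + 1, B.ravel()[c]).reshape(A.shape[0], A.shape[1] + 1)

# Both approaches should produce the same array.
same = np.array_equal(by_rows(A, B, C), by_flat_index(A, B, C))

t_rows = timeit.timeit(lambda: by_rows(A, B, C), number=1000)
t_flat = timeit.timeit(lambda: by_flat_index(A, B, C), number=1000)
print(f"rows: {t_rows:.4f}s  flat: {t_flat:.4f}s  same result: {same}")
```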
