
Reduce Memory Usage When Running Numpy Array Operations

I have a fairly large NumPy array that I need to perform an operation on, but when I do, my ~2 GB array requires ~30 GB of RAM to complete the operation. I've read that Num

Solution 1:

By default, operations like multiplication, addition, and many others allocate a new array for their result. Instead, you can call the ufunc forms such as numpy.multiply and numpy.add with the out parameter, so the result is written into an existing array. That significantly reduces memory usage. See the demo below and translate your code to use those functions instead.

import numpy as np

arr = np.random.rand(100)
arr2 = np.random.rand(100)

arr3 = np.subtract(arr, 100, out=arr)  # result written into arr's buffer
arr4 = arr + 100                       # allocates a new array
arr5 = np.add(arr, arr2, out=arr2)     # result written into arr2's buffer
arr6 = arr + arr2                      # allocates a new array

print(arr is arr3)   # True
print(arr is arr4)   # False
print(arr2 is arr5)  # True
print(arr2 is arr6)  # False
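A minimal sketch of the same idea, showing that out= reuses the input's buffer and that augmented assignment operators (+=, *=) are equivalent in-place shortcuts (the names a and b are illustrative):

```python
import numpy as np

a = np.full(5, 3.0)
b = np.full(5, 2.0)

# Writing the product into a's existing buffer; no new array is allocated.
res = np.multiply(a, b, out=a)
print(res is a)  # True: res is the same object as a

# Augmented assignment is also performed in place for NumPy arrays.
a += 1  # equivalent to np.add(a, 1, out=a)
```

For a 2 GB array this is the difference between one extra 2 GB allocation per operation and none at all.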

Solution 2:

You could use, e.g., Numba or Cython to reduce memory usage. A plain Python loop would of course also work, but it would be very slow.

With an allocated output array:

import numpy as np
import numba as nb

@nb.njit()
def optimise(data):
    data_scaled_offset = np.empty_like(data)
    # Inversely apply the scale and offset for this product
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            data_scaled_offset[i, j] = np.round_(((data[i, j] - 1000) * (1 / 1)) + 1, 0)

    return data_scaled_offset

In place:

@nb.njit()
def optimise_in_place(data):
    # Inversely apply the scale and offset for this product
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            data[i, j] = np.round_(((data[i, j] - 1000) * (1 / 1)) + 1, 0)

    return data
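For comparison, the same scale/offset transform can be written with chained in-place NumPy ufuncs, avoiding both Numba and large temporaries. This is a sketch assuming a scale of 1 and an offset of 1000, as in the loops above; the function name is illustrative:

```python
import numpy as np

def optimise_ufuncs_in_place(data):
    # Inversely apply the scale (1) and offset (1000), reusing data's buffer:
    np.subtract(data, 1000, out=data)  # data -= 1000, no temporary
    np.add(data, 1, out=data)          # data += 1, no temporary
    np.round(data, 0, out=data)        # round in place
    return data
```

Each step writes back into data's buffer, so peak memory stays at one array instead of one array plus a temporary per operation.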
