Division Of Sparse Matrix
I have a scipy.sparse matrix with 45671x45671 elements. In this matrix, some rows contain only '0' value. My question is, how to divide each row values by the row sum. Obviously, w
Solution 1:
I have an M
hanging around:
In [241]: M
Out[241]:
<6x3 sparse matrix of type'<class 'numpy.uint8'>'
with 6 stored elements in Compressed Sparse Row format>
In [242]: M.A
Out[242]:
array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1],
[0, 1, 0],
[0, 0, 1],
[1, 0, 0]], dtype=uint8)
In [243]: M.sum(1) # dense matrix
Out[243]:
matrix([[1],
[1],
[1],
[1],
[1],
[1]], dtype=uint32)
In [244]: M/M.sum(1) # dense matrix - full size of M
Out[244]:
matrix([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.],
[ 0., 1., 0.],
[ 0., 0., 1.],
[ 1., 0., 0.]])
That will explain the memory error - if M
is so large that M.A
produces a memory error.
In [262]: S = sparse.csr_matrix(M.sum(1))
In [263]: S.shape
Out[263]: (6, 1)
In [264]: M.shape
Out[264]: (6, 3)
In [265]: M/S
....
ValueError: inconsistent shapes
I'm not entirely sure what is going on here.
Element wise multiplication works
In [266]: M.multiply(S)
Out[266]:
<6x3 sparse matrix of type'<class 'numpy.uint32'>'with6 stored elements in Compressed Sparse Row format>
So it should work if I construct S
as S = sparse.csr_matrix(1/M.sum(1))
If some of the rows sum to zero, you have a division by zero problem.
If I modify M
to have 0 row
In [283]: M.A
Out[283]:
array([[1, 0, 0],
[0, 1, 0],
[0, 0, 0],
[0, 1, 0],
[0, 0, 1],
[1, 0, 0]], dtype=uint8)
In [284]: S = sparse.csr_matrix(1/M.sum(1))
/usr/local/bin/ipython3:1: RuntimeWarning: divide by zero encountered in true_divide
#!/usr/bin/python3
In [285]: S.A
Out[285]:
array([[ 1.],
[ 1.],
[ inf],
[ 1.],
[ 1.],
[ 1.]])
In [286]: M.multiply(S)
Out[286]:
<6x3 sparse matrix of type'<class 'numpy.float64'>'
with 5 stored elements in Compressed Sparse Row format>
In [287]: _.A
Out[287]:
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.],
[ 1., 0., 0.]])
This isn't the best M
to demonstrate this on, but it suggests a useful approach. The row sum will be dense, so you can clean up its inverse using the usual dense array approaches.
Post a Comment for "Division Of Sparse Matrix"