I have a sparse matrix stored on disk in coordinate (triplet) format.
I would like to read chunks of the matrix into memory using scipy.sparse. However, when I do this, scipy always assumes a matrix indexed from (0, 0), regardless of the chunk.
This means, for example, that scipy interprets the last chunk of the sparse matrix as a huge matrix that only has values in the bottom-right corner.
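To illustrate the behaviour, here is a minimal sketch with made-up triplets for the last chunk of a hypothetical 1000 x 1000 matrix. The indices, values, and matrix size are illustrative assumptions, not the real data.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Made-up triplets for the last chunk of a hypothetical 1000 x 1000 matrix:
# every entry falls in rows/columns 990..999 of the full matrix.
rows = np.array([990, 995, 999])
cols = np.array([991, 996, 999])
vals = np.array([1.0, 2.0, 3.0], dtype=np.float32)

# With no explicit shape, scipy infers it from the largest row/column index,
# so the chunk is treated as a full-size matrix anchored at (0, 0).
chunk = coo_matrix((vals, (rows, cols)))

# Densifying allocates the whole inferred extent, not just the chunk's window.
dense = chunk.toarray()
print(chunk.shape, dense.shape)
```

Only the bottom-right 10 x 10 corner of `dense` holds data; everything else is zero padding that still has to be allocated.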
How can I handle the chunks correctly, so that calling toarray to create a dense matrix produces only the subset corresponding to that chunk?
The reason for doing this is that, even in sparse form, the matrix is too large for memory (approximately 600 million 32-bit floating-point values). To display it on screen (the matrix represents a geospatial raster), I need to convert it to a dense matrix and store it in a geospatial format (e.g. GeoTIFF).
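For reference, one possible way to get a chunk-sized dense array is sketched below. It assumes the chunk's window in the full matrix is known (its top-left corner and its height and width). The helper `chunk_to_dense` and its parameter names are hypothetical and not part of scipy. The idea is to shift the global indices into local coordinates and pass an explicit `shape` to `coo_matrix`.

```python
import numpy as np
from scipy.sparse import coo_matrix

def chunk_to_dense(rows, cols, vals, row0, col0, n_rows, n_cols):
    """Densify only the window [row0, row0 + n_rows) x [col0, col0 + n_cols).

    Hypothetical helper: rows/cols are global indices read from disk.
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    vals = np.asarray(vals)

    # Keep only the triplets that fall inside the window.
    keep = (
        (rows >= row0) & (rows < row0 + n_rows)
        & (cols >= col0) & (cols < col0 + n_cols)
    )

    # Shift global indices to window-local ones and fix the shape explicitly,
    # so scipy does not infer a full-size matrix anchored at (0, 0).
    local = coo_matrix(
        (vals[keep], (rows[keep] - row0, cols[keep] - col0)),
        shape=(n_rows, n_cols),
    )
    return local.toarray()

# Made-up triplets in the bottom-right 10 x 10 corner of a 1000 x 1000 matrix.
tile = chunk_to_dense(
    [990, 995, 999], [991, 996, 999],
    np.array([1.0, 2.0, 3.0], dtype=np.float32),
    row0=990, col0=990, n_rows=10, n_cols=10,
)
print(tile.shape)
```

Note that `toarray` on a COO matrix sums duplicate (row, col) entries, which may or may not be what is wanted for a raster. Each dense tile could then be written to the matching window of the output file, assuming the geospatial library in use supports windowed writes.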