I have a large sparse matrix that I need to correlate, but I haven't been able to because:
- I can't convert the sparse matrix to a dense matrix due to R's memory limitations.
- I tried the packages bigstats and bigmemory, but R froze (on a Windows 10 laptop with 8 GB of RAM).
- There's no correlation function in R's Matrix package.
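For illustration, here is a minimal sketch of how column correlations can be computed from a sparse matrix without first densifying the input, using the identity cov = (X'X - n * mu mu') / (n - 1). The helper name sparse_cor and the toy matrix are assumptions made for this example, not part of the Matrix package. Note that the result is still a dense p x p matrix, so this only helps when the number of columns is small enough for that result to fit in memory.

```r
library(Matrix)

# Hypothetical helper (not part of Matrix): Pearson correlation between the
# columns of a sparse matrix, computed without converting the input to dense.
sparse_cor <- function(x) {
  n  <- nrow(x)
  mu <- colMeans(x)                                  # column means
  # Covariance via (X'X - n * mu mu') / (n - 1); only the p x p result is dense.
  cov_mat <- (as.matrix(crossprod(x)) - n * outer(mu, mu)) / (n - 1)
  sdev <- sqrt(diag(cov_mat))                        # column standard deviations
  cov_mat / outer(sdev, sdev)
}

set.seed(1)
x   <- rsparsematrix(200, 7, density = 0.3)          # small toy sparse matrix
res <- sparse_cor(x)
```

On the toy matrix, res should agree with cor(as.matrix(x)) up to floating-point error.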
N.B.: I am trying to correlate a sparse matrix in the format 'M'X + X'M - M'M', which isn't possible directly. That is why I am trying to split the sparse matrix into two or three parts, convert each part to a dense matrix using as.matrix(), correlate it using cor(), and then combine the correlated results into one using cbind().
Now: I want to ask whether it's possible to split a sparse matrix into two or three parts, convert each part to a dense matrix, correlate each dense matrix, cbind the results into one matrix, and then export that matrix to a text file.
What function can I use to split a sparse matrix into two or three parts, bearing in mind that both the i and p parts of the sparse matrix are equal in size, with the same dim?
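As a rough sketch of the split, convert, correlate, and combine idea, a sparse matrix from the Matrix package can be split by ordinary column indexing with `[`. The toy matrix, the block count, the helper block_cor, and the output file name below are all assumptions for illustration. One caveat: running cor() on each block alone only gives the within-block correlations, so this sketch correlates every pair of blocks and assembles the strips with cbind() and rbind().

```r
library(Matrix)

set.seed(1)
x <- rsparsematrix(200, 7, density = 0.3)   # toy stand-in for the real matrix

# Split the column indices into 3 consecutive, roughly equal blocks.
n_blocks <- 3
col_idx  <- seq_len(ncol(x))
blocks   <- split(col_idx, cut(col_idx, n_blocks, labels = FALSE))

# Hypothetical helper: densify two column blocks and correlate them.
block_cor <- function(i, j) {
  cor(as.matrix(x[, blocks[[i]], drop = FALSE]),
      as.matrix(x[, blocks[[j]], drop = FALSE]))
}

# cbind() the pieces for one row strip, then rbind() the strips together.
full <- do.call(rbind, lapply(seq_len(n_blocks), function(i) {
  do.call(cbind, lapply(seq_len(n_blocks), function(j) block_cor(i, j)))
}))

# Export to a text file (hypothetical path in the session's temp directory).
out_file <- file.path(tempdir(), "correlation.txt")
write.table(full, file = out_file)
```

Only two column blocks are dense at any one time, but the assembled result is still a full dense matrix, so for a very large number of columns each strip would have to be written to disk instead of being held in memory.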
Formal class 'dgCMatrix' [package "Matrix"] with 7 slots
..@ i : int [1:73075722] ...
..@ p : int [1:73075722] 0 0 1 1 1 1 1 2 2 2 ...
..@ Dim : int [1:2] 500232 500232
..@ Dimnames:List of 2
.. ..$ : NULL
.. ..$ : NULL
..@ x : num [1:73075722] ...
..@ uplo : chr "L"
..@ factors : list()
The correlation output will be in this format:
[,1] [,2] [,3] [,4]
[1,] 1.00000000 -0.8343860 0.3612926 0.09678096
[2,] -0.83438600 1.0000000 -0.8154071 0.24611830
[3,] 0.36129256 -0.8154071 1.0000000 -0.51801346
[4,] 0.09678096 0.2461183 -0.5180135 1.00000000
[5,] 0.67411584 -0.3560782 -0.1056124 0.60987601
[6,] 0.23071712 -0.4457467 0.5117711 0.21848068
[7,] 0.49200080 -0.4246502 0.2016633 0.46971736
[,5] [,6] [,7]
[1,] 0.6741158 0.2307171 0.4920008
[2,] -0.3560782 -0.4457467 -0.4246502
[3,] -0.1056124 0.5117711 0.2016633
[4,] 0.6098760 0.2184807 0.4697174
[5,] 1.0000000 0.2007979 0.7198228
[6,] 0.2007979 1.0000000 0.6965899
[7,] 0.7198228 0.6965899 1.0000000