0
votes

The correlation matrix is a symmetric matrix, meaning that its upper diagonal and lower diagonal elements are mirror images of each other, together called off-diagonal elements (as opposed to the diagonal elements, which are all equal to 1 in any correlation matrix since any variable's correlation with itself is just 1).

The off-diagonal elements of a correlation matrix are the same wherever the i'th row number and j'th column number in the lower diagonal are swapped in the upper diagonal, i.e. correlation of variables 1 and 2 (row 1, column 2) are the same for variables 2 and 1 (row 2, column 1). Therefore, we only need to re-calculate the lower-diagonal elements, and copy them to corresponding positions in the matrix's upper-diagonal after

import numpy as np
from numpy.random import randn
X = randn(20,3)
Rho = np.corrcoef(X.T) #correlation matrix
print(np.tril(Rho)) #lower off-diagonal of matrix Rho to re-calculate, then copy to other side

shows

array([[ 1.        ,  0.        ,  0.        ],
       [-0.03003281,  1.        ,  0.        ],
       [-0.02602238,  0.06137713,  1.        ]])

What is the most efficient way to code a "i not-equal-to j" loop for the following sequence of steps:

  1. re-calculate the lower off-diagonal elements of the symmetric matrix according to some apply function (to make it simple, we will just add +2 to each of these elements)
  2. flip those same calculations onto its mirror image (the corresponding upper off-diagonals)
  3. Also, replace the diagonal elements of the symmetric matrix with a vector filled with 10's (instead of 1's as found in the correlation matrix)

The aim is to generate a new matrix that is a re-calculation of the original.

1

1 Answers

1
votes

Let us generate Rho first (note that I'm initializing the pseudo-random number generator in order to obtain the same Rho in different runs of the code):

In [526]: import numpy as np

In [527]: np.random.seed(0)
     ...: n = 3
     ...: X = np.random.randn(20, n)
     ...: Rho = np.corrcoef(X.T)

In [528]: Rho
Out[528]: 
array([[1.        , 0.03224462, 0.05021998],
       [0.03224462, 1.        , 0.15140358],
       [0.05021998, 0.15140358, 1.        ]])

Then you can use NumPy's tril_indices_from and advanced indexing to generate the new matrix:

In [548]: result = np.zeros_like(Rho)

In [549]: lrows, lcols = np.tril_indices_from(Rho, k=-1)

In [550]: result[lrows, lcols] = Rho[lrows, lcols] + 2

In [551]: result
Out[551]: 
array([[0.        , 0.        , 0.        ],
       [2.03224462, 0.        , 0.        ],
       [2.05021998, 2.15140358, 0.        ]])

In [552]: result[lcols, lrows] = result[lrows, lcols]

In [553]: result
Out[553]: 
array([[0.        , 2.03224462, 2.05021998],
       [2.03224462, 0.        , 2.15140358],
       [2.05021998, 2.15140358, 0.        ]])

In [554]: result[np.arange(n), np.arange(n)] = 10

In [555]: result
Out[555]: 
array([[10.        ,  2.03224462,  2.05021998],
       [ 2.03224462, 10.        ,  2.15140358],
       [ 2.05021998,  2.15140358, 10.        ]])