I am following a recommended approach for building a plain correlation matrix with rcorr, using the mtcars dataset in R. I would like the correlation of each column with every other column (mpg to cyl, mpg to disp, mpg to hp, and similarly for all other columns; multi sampling) for each of the cars listed as row names. I understand this would produce a large result, but for each correlation I would like to know the row name. My current code looks like this:

require(ggpubr)
require(tidyverse)
require(Hmisc)
require(corrplot)
data(mtcars)

flattenCorrMatrix <- function(cormat, pmat) {
  ut <- upper.tri(cormat)
  data.frame(
    row = rownames(cormat)[row(cormat)[ut]],
    column = rownames(cormat)[col(cormat)[ut]],
    cor = cormat[ut],
    p = pmat[ut]
  )
}
tt <- mtcars

head(tt)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

dm = data.matrix(tt)
cc = rcorr(dm, type="pearson")
rcc = flattenCorrMatrix(cc$r, cc$P)
rc = data.frame(rcc)
head(rc)

The result is

head(rc)
   row column     cor                 p
   mpg    cyl -0.8522 0.000000000611269
   mpg   disp -0.8476 0.000000000938033
   cyl   disp  0.9020 0.000000000001803
   mpg     hp -0.7762 0.000000178783525
   cyl     hp  0.8324 0.000000003477861
  disp     hp  0.7909 0.000000071426787

However, I would like to know which car each correlation belongs to, i.e. add a "car model" column to the data frame above. In this case, the car model is the row name from mtcars (tt above).

Any help to resolve this is appreciated.

1 Answer

What you're asking isn't actually possible, because each correlation listed above is computed from the data for multiple cars. For example, let's look at the first row:

   row column     cor                 p
   mpg    cyl -0.8522 0.000000000611269

This is the correlation between all values in the mpg column of your dataset and all values in the cyl column, so each row of your results actually takes every car in the mtcars dataset into account.
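
As a quick check of the point above, here is a minimal base R sketch (not part of the original code; the names r_all, n_cars and r_without_first are made up for illustration). It recomputes the mpg/cyl coefficient directly and shows that the value depends on the whole set of cars rather than on any single one:

```r
data(mtcars)

# Pearson correlation between mpg and cyl, computed over every row (car)
r_all  <- cor(mtcars$mpg, mtcars$cyl)
n_cars <- nrow(mtcars)          # number of cars that went into the coefficient

print(round(r_all, 4))          # -0.8522, the same value as the first row above
print(n_cars)                   # 32

# Dropping a single car (here the first row) changes the coefficient,
# which shows that it summarises all cars jointly
r_without_first <- cor(mtcars$mpg[-1], mtcars$cyl[-1])
print(r_without_first)
```

A single car contributes only one (mpg, cyl) pair, and a correlation cannot be computed from one pair, so there is no per-car value to attach to each row of the flattened result.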