Create a correlation matrix based on multiple column values with p-values in R

Question

I'm new to R and I'm trying to create a correlation matrix that will also include p-values.

The main issue I'm having is with computing correlations for specific numeric variables depending on the identity of three factors.

My data looks something like this

    df <- data.frame(
      cond = c("low", "medium", "high"),
      group = c("gr1", "gr2", "gr3"),
      rand = c("yes", "no"),
      trial1 = rnorm(30),
      trial2 = rnorm(30))

I want to correlate trial1 and trial2 for each unique value in cond, group, and rand. Essentially, for each level of those factors, I would like to get an r- and p-value, and save them in a matrix.

I tried it the long way: extracting the observations that I want to correlate by using three logical tests, like this: (df$cond == "low") & (df$group == 'gr1') & (df$rand == 'yes'). This gave me what I needed, but the code is very long and doesn't save the values in a matrix.

I've never tried for-loops before so I'd appreciate it if anyone knew either how to do that or another efficient way of doing it.

Thank you!

user2974951 · Accepted Answer · 2019-08-22T11:15:13

I don't really understand what you are trying to do, but here is how you would estimate a correlation matrix with p-values for each possible combination of the first three variables:

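    # For each cond/group/rand combination, return the 2x2 correlation
    # matrix and the p-value of the trial1-trial2 correlation test
    by(df[, c("trial1", "trial2")], list(df$cond, df$group, df$rand), function(x) {
      list(cor(x), cor.test(x[, 1], x[, 2])$p.value)
    })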
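
If you would rather have the r- and p-values in one table than in a nested list, one possible variation is to split the data by the three factors and collect one row per combination. This is only a sketch: the balanced toy data, the seed, and the names cells, res, and res_mat are assumptions for illustration and do not come from the question.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible

# Assumption: a balanced toy design (3 x 3 x 2 cells, 10 rows each),
# not the asker's real data.
df <- expand.grid(
  cond  = c("low", "medium", "high"),
  group = c("gr1", "gr2", "gr3"),
  rand  = c("yes", "no"),
  rep   = 1:10,
  stringsAsFactors = FALSE
)
df$trial1 <- rnorm(nrow(df))
df$trial2 <- rnorm(nrow(df))

# Split into one data frame per cond/group/rand combination;
# drop = TRUE skips combinations that have no rows.
cells <- split(df, list(df$cond, df$group, df$rand), drop = TRUE)

# One row per combination with the Pearson r and its p-value.
res <- do.call(rbind, lapply(cells, function(d) {
  ct <- cor.test(d$trial1, d$trial2)
  data.frame(
    cond = d$cond[1], group = d$group[1], rand = d$rand[1],
    n = nrow(d),
    r = unname(ct$estimate),
    p = ct$p.value
  )
}))
rownames(res) <- NULL

# Numeric matrix of r and p, with one named row per combination.
res_mat <- as.matrix(res[, c("r", "p")])
rownames(res_mat) <- paste(res$cond, res$group, res$rand, sep = ".")
```

Here res keeps the factor levels as columns, which is usually easier to filter, while res_mat is the plain numeric matrix. Note that cor.test() needs at least three complete pairs per combination, so very small cells will raise an error.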
