Correlate a variable with multiple variables in R

Question

I need to correlate one gene with 47,000 other genes to find the 10 best correlations. Generally, my data frames have the gene names in the first column and the patients' data in the next columns, with gene names in the first row. Do I need to transpose the data frame to run the correlation tests? If I transpose, it works, but I believe there is a simpler way to do it. Can somebody help me?

library(tidyverse)

pancreas_final <- read_delim("path", delim = "\t")

# Transpose so that genes become columns (cor() correlates columns)
pancreas_final_t <- t(pancreas_final[,-1])
pancreas_final_t <- as.data.frame(pancreas_final_t)
names(pancreas_final_t) <- pancreas_final$X1
class(pancreas_final_t)
View(pancreas_final_t)

# Correlate CAMP with every gene; the result is a 1 x n matrix
vec_cor <- cor(pancreas_final_t$CAMP, pancreas_final_t)
df_cor <- tibble(gene = colnames(vec_cor), cor = c(vec_cor))
str(df_cor)

# Lowest correlations first
df_cor %>%
  arrange(cor)

# 10 highest correlations (CAMP itself appears first with cor = 1)
df_cor %>%
  arrange(desc(cor)) %>%
  head(n = 10)

Seif Seif · Accepted Answer · 2019-11-19T13:32:35

You need to transpose your data frame if you want to calculate the correlation between genes (the rows in your data frame). Try this for the correlation between genes:

correlation_btw_genes = cor(pancreas_final_t)

If you don't transpose your data frame, the cor() function will calculate the correlation between your patients, because it correlates columns.
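
To illustrate the column-wise behaviour, here is a minimal sketch on a made-up matrix. The object `expr`, the gene names other than CAMP, and all values are hypothetical and not taken from the question's data. It also shows one possible way to correlate a single gene against all others without the data frame round trip, by passing the target gene as `x` and the transposed matrix as `y`.

```r
# Toy expression matrix: 4 genes (rows) x 5 patients (columns); values are made up.
expr <- rbind(
  CAMP  = c(1, 2, 3, 4, 5),
  GENE2 = c(2, 4, 6, 8, 10),
  GENE3 = c(5, 4, 3, 2, 1),
  GENE4 = c(1, 3, 2, 5, 4)
)
colnames(expr) <- paste0("patient", 1:5)

# cor() works column-wise, so the untransposed matrix gives
# patient-vs-patient correlations.
dim(cor(expr))     # 5 x 5

# Transposing puts genes in columns, giving gene-vs-gene correlations.
dim(cor(t(expr)))  # 4 x 4

# One gene against all genes: the result is a 1 x n matrix, so take its
# first row as a named vector. This avoids the full gene-by-gene matrix.
target <- "CAMP"
cors <- cor(expr[target, ], t(expr))[1, ]

# Drop the gene's correlation with itself and keep the strongest positive ones.
cors <- cors[names(cors) != target]
top <- head(sort(cors, decreasing = TRUE), n = 2)
top
```

With data like the question's, `expr` would presumably be built with `as.matrix(pancreas_final[, -1])` and row names taken from the first column, and `n = 10` would replace `n = 2`. A transpose is still involved, but only of a numeric matrix. The single-gene form also matters for size: a full `cor()` over 47,000 genes would produce a 47,000 x 47,000 matrix, on the order of 17 GB of doubles.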
