Hello, I would like to create a heatmap showing the co-frequency of several variables. Here is some code:
a <- c(1, 1, 1, 1)
b <- c(1, 1, 1, 0)
c <- c(1, 1, 0, 0)
d <- c(1, 0, 0, 0)
df <- cbind(a, b, c, d)
df
     a b c d
[1,] 1 1 1 1
[2,] 1 1 1 0
[3,] 1 1 0 0
[4,] 1 0 0 0
'1' means the phenomenon occurred; '0' means it did not occur.
The co-frequency of a and b is 75%, the co-frequency of a and c is 50%, and so on.
In the end, I would like a 4x4 matrix with the column names on the x- and y-axes and the co-frequency percentage in each tile: a vs. a = 100%, a vs. b = 75%, etc.
May I ask for a little help?
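For reference, the co-frequency described above can be sketched in base R with a matrix cross-product. This is only a sketch under one assumption: co-frequency means the share of rows in which both columns equal 1. The name cofreq is made up for illustration. Under that definition the diagonal holds each variable's own frequency, so a vs. a is 100% but b vs. b is 75%.

```r
a <- c(1, 1, 1, 1)
b <- c(1, 1, 1, 0)
c <- c(1, 1, 0, 0)
d <- c(1, 0, 0, 0)
df <- cbind(a, b, c, d)

# crossprod(df) is t(df) %*% df: entry [i, j] counts the rows in which
# column i and column j are both 1 (this relies on the data being 0/1).
# Dividing by the number of rows and scaling by 100 gives percentages.
# Assumption: "co-frequency" = share of rows where both variables are 1.
cofreq <- crossprod(df) / nrow(df) * 100
cofreq
```

The result keeps the column names of df as both row and column names, so no separate dimnames() call is needed.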
The solutions from the comments produce the following:
library(tidyr)
library(ggplot2)
a <- c(1, 1, 1, 1)
b <- c(1, 1, 1, 0)
c <- c(1, 1, 0, 0)
d <- c(1, 0, 0, 0)
df <- cbind(a, b, c, d)
calc_freq <- function(x, y) {
  mean(df[, x] == df[, y] & df[, x] == 1 & df[, y] == 1)
}
mat <- outer(colnames(df), colnames(df), Vectorize(calc_freq))
mat
dimnames(mat) <- list(colnames(df), colnames(df))
mat %>% as_tibble() %>% gather %>% ggplot() + aes(key, value) + geom_tile()
I would rather have the percentages from mat as the fill, and dimnames(mat) on the x-axis and y-axis.
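One possible way to get that layout, offered as a sketch rather than a definitive answer: reshape the matrix to long form with base R's as.data.frame(as.table(...)), which keeps both dimnames as columns (Var1 and Var2), and map the percentage to fill. The names m, long, pct and p, the colours, and the legend title are all illustrative choices, and the percentages again assume that co-frequency means the share of rows where both variables are 1.

```r
library(ggplot2)

a <- c(1, 1, 1, 1)
b <- c(1, 1, 1, 0)
c <- c(1, 1, 0, 0)
d <- c(1, 0, 0, 0)
df <- cbind(a, b, c, d)

# Percentage of rows in which both columns are 1 (assumed definition).
m <- crossprod(df) / nrow(df) * 100

# Long format: one row per tile, with columns Var1, Var2 and pct.
long <- as.data.frame(as.table(m), responseName = "pct")

p <- ggplot(long, aes(x = Var1, y = Var2, fill = pct)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = paste0(pct, "%"))) +  # print the value in each tile
  scale_fill_gradient(low = "white", high = "steelblue") +
  labs(x = NULL, y = NULL, fill = "Co-frequency (%)")
print(p)
```

Because Var1 and Var2 are factors built from the dimnames, the axes show a, b, c, d in the original column order.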