9
votes

I want to create a correlation matrix plot, i.e. a plot where each variable is plotted in a scatterplot against each other variable like with pairs() or splom(). I want to do this with ggplot2. See here for examples. The link mentions some code someone wrote for doing this in ggplot2, however, it is outdated and no longer works (even after you swap out the deprecated parts).

One could do this with a loop in a loop and then multiplot(), but there must be a better way. I tried melting the dataset to long, and copying the value and variable variables and then using facets. This almost gives you something correct.

d = data.frame(x1=rnorm(100),
               x2=rnorm(100),
               x3=rnorm(100),
               x4=rnorm(100),
               x5=rnorm(100))
library(reshape2)
d = melt(d)
d$value2 = d$value
d$variable2 = d$variable

library(ggplot2)
ggplot(data=d, aes(x=value, y=value2)) +
  geom_point() +
  facet_grid(variable ~ variable2)

enter image description here

This gets the general structure right, but only works for the plotting each variable against itself. Is there some more clever way of doing this without resorting to 2 loops?

2

2 Answers

22
votes
library(GGally)

set.seed(42)
d = data.frame(x1=rnorm(100),
               x2=rnorm(100),
               x3=rnorm(100),
               x4=rnorm(100),
               x5=rnorm(100))

# estimated density in diagonal
ggpairs(d)

enter image description here

# blank
ggpairs(d, diag = list("continuous"="blank")

enter image description here

0
votes

Using PerformanceAnalytics library :

library("PerformanceAnalytics")
chart.Correlation(df, histogram = T, pch= 19)

enter image description here