Ellipses for groups on PCA from DESeq2

Question

I'd like to add ellipses around my three groups (defined by the variable "outcome") on the following plot. Note that vsd is a DESeq2 object with the factors outcome and batch:

pcaData <- plotPCA(vsd, intgroup=c("outcome", "batch"), returnData=TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))
ggplot(pcaData, aes(PC1, PC2, color=outcome, shape=batch)) +
  geom_point(size=3) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) +
  ylab(paste0("PC2: ",percentVar[2],"% variance")) + 
  geom_text(aes(label=rownames(coldata_WM_D56C)),hjust=.5, vjust=-.8, size=3) +
  geom_density2d(alpha=.5) +
  coord_fixed()

I tried adding an ellipse, expecting it to inherit the aesthetics from the top-level ggplot() call, but it tried to draw an ellipse for each point.

stat_ellipse() +

Too few points to calculate an ellipse

geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?

Computation failed in stat_density2d(): missing value where TRUE/FALSE needed

Suggestions? Thanks in advance.

> dput(pcaData)
structure(list(PC1 = c(-15.646673151638, -4.21111051849254, 13.1215703467274, 
-6.5477433859415, -3.22129766721873, 4.59321517871152, 1.84089686598042, 
37.8415172383233, 40.9996810499267, 37.6089348653721, -24.5520575763498, 
-46.5840253031228, -4.01498554781508, -31.227922394463), PC2 = c(31.2712754127142, 
5.89621557021357, -10.2425538634254, -3.44497747426626, 2.21504480008043, 
0.315695833259479, -4.66467589267529, -4.27504355920903, -1.08666029542243, 
-2.69753368235982, 5.89767436709778, -24.2836532766506, 4.43980653642228, 
0.659385524221137), group = structure(c(4L, 5L, 6L, 7L, 8L, 5L, 
8L, 1L, 2L, 3L, 6L, 9L, 9L, 9L), .Label = c("ctrl : 1", "ctrl : 2", 
"ctrl : 3", "non : 1", "non : 2", "non : 3", "preg : 1", "preg : 2", 
"preg : 3"), class = "factor"), outcome = structure(c(2L, 2L, 
2L, 1L, 1L, 2L, 1L, 3L, 3L, 3L, 2L, 1L, 1L, 1L), .Label = c("preg", 
"non", "ctrl"), class = "factor"), batch = structure(c(1L, 2L, 
3L, 1L, 2L, 2L, 2L, 1L, 2L, 3L, 3L, 3L, 3L, 3L), .Label = c("1", 
"2", "3"), class = "factor"), name = structure(1:14, .Label = c("D5-R-N-1", 
"D5-R-N-2", "D5-R-N-3", "D5-R-P-1", "D5-R-P-2", "D5-Z-N-1", "D5-Z-P-1", 
"D6-C-T-1", "D6-C-T-2", "D6-C-T-3", "D6-Z-N-1", "D6-Z-P-1", "D6-Z-P-2", 
"D6-Z-P-3"), class = "factor")), .Names = c("PC1", "PC2", "group", 
"outcome", "batch", "name"), row.names = c("D5-R-N-1", "D5-R-N-2", 
"D5-R-N-3", "D5-R-P-1", "D5-R-P-2", "D5-Z-N-1", "D5-Z-P-1", "D6-C-T-1", 
"D6-C-T-2", "D6-C-T-3", "D6-Z-N-1", "D6-Z-P-1", "D6-Z-P-2", "D6-Z-P-3"
), class = "data.frame", percentVar = c(0.47709343625754, 0.0990361123451665
))

Following Maurits Evers's suggestion, I added a group aesthetic, but ellipses were drawn for only 2 of the 3 outcome types.

Your example is still not reproducible and self-consistent: coldata_WM_D56C is not defined anywhere. Either way, the plot based on my solution is as expected. You can't calculate or draw a confidence ellipse from only 3 points. Not sure what you expect. You can find details in ?stat_ellipse, which links to car::ellipse and Fox and Weisberg's "An R Companion to Applied Regression". - Maurits Evers
I didn't realize you need more than 3 points to calculate an ellipse, so thank you for that information and the help. - Stewart Russell
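To see which groups fall under that threshold, it can help to count the samples per outcome before plotting. Below is a minimal sketch that rebuilds only the outcome column from the dput(pcaData) output above; the variable name outcome_counts is illustrative, and the cutoff of 4 points is my reading of how stat_ellipse() behaves, consistent with the comments above.

```r
# Outcome labels reconstructed from the integer codes and levels
# in the dput(pcaData) output (levels: preg, non, ctrl)
outcome <- factor(
  c("non", "non", "non", "preg", "preg", "non", "preg",
    "ctrl", "ctrl", "ctrl", "non", "preg", "preg", "preg"),
  levels = c("preg", "non", "ctrl")
)

# Number of samples in each outcome group
outcome_counts <- table(outcome)
print(outcome_counts)

# Groups with 3 or fewer points, for which no ellipse is drawn
# (assumption: stat_ellipse() needs more than 3 points per group)
too_small <- names(outcome_counts)[outcome_counts < 4]
print(too_small)
```

With this data only the ctrl group has 3 samples, which matches ellipses appearing for 2 of the 3 outcome types.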

Maurits Evers · Accepted Answer · 2017-11-24T00:03:14

Since you don't provide any sample data, here is an example using the faithful data.

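The key is to add a group aesthetic. Without one, ggplot2 groups by the interaction of all discrete variables mapped in the plot (here color and shape), so stat_ellipse() sees only a few points per group.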

library(ggplot2)

# Generate sample data: first 10 rows of faithful plus an artificial batch factor
df <- faithful[1:10, ]
df$batch <- as.factor(rep(1:5, each = 2))

# This will throw a similar error/warning to yours, because the implicit
# grouping is color x shape
# ggplot(df, aes(waiting, eruptions, color = eruptions > 3, shape = batch)) +
#   geom_point() + stat_ellipse()

# Add a group aesthetic and it works
ggplot(df, aes(waiting, eruptions, color = eruptions > 3, shape = batch,
               group = eruptions > 3)) +
  geom_point() +
  stat_ellipse()

So in your case, try adding aes(..., group = outcome).
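Applied to the PCA data from the question, this could look roughly like the sketch below. It is only an illustration under a few assumptions: the data frame pca_df is rebuilt by hand from the dput(pcaData) output with coordinates rounded to two decimals, the axis percentages are the rounded percentVar values (48% and 10%), labels come from the name column instead of the undefined coldata_WM_D56C, and geom_density2d() is left out because it would run into the same small-group problem.

```r
library(ggplot2)

# Hand-built copy of pcaData (values rounded from the dput output)
pca_df <- data.frame(
  PC1 = c(-15.65, -4.21, 13.12, -6.55, -3.22, 4.59, 1.84,
          37.84, 41.00, 37.61, -24.55, -46.58, -4.01, -31.23),
  PC2 = c(31.27, 5.90, -10.24, -3.44, 2.22, 0.32, -4.66,
          -4.28, -1.09, -2.70, 5.90, -24.28, 4.44, 0.66),
  outcome = factor(
    c("non", "non", "non", "preg", "preg", "non", "preg",
      "ctrl", "ctrl", "ctrl", "non", "preg", "preg", "preg"),
    levels = c("preg", "non", "ctrl")
  ),
  batch = factor(c(1, 2, 3, 1, 2, 2, 2, 1, 2, 3, 3, 3, 3, 3)),
  name = c("D5-R-N-1", "D5-R-N-2", "D5-R-N-3", "D5-R-P-1", "D5-R-P-2",
           "D5-Z-N-1", "D5-Z-P-1", "D6-C-T-1", "D6-C-T-2", "D6-C-T-3",
           "D6-Z-N-1", "D6-Z-P-1", "D6-Z-P-2", "D6-Z-P-3")
)

# Map shape only in geom_point() and group the ellipses by outcome alone,
# so each ellipse is computed from all samples of one outcome
p <- ggplot(pca_df, aes(PC1, PC2, color = outcome)) +
  geom_point(aes(shape = batch), size = 3) +
  geom_text(aes(label = name), hjust = 0.5, vjust = -0.8, size = 3,
            show.legend = FALSE) +
  stat_ellipse(aes(group = outcome)) +
  xlab("PC1: 48% variance") +
  ylab("PC2: 10% variance") +
  coord_fixed()

print(p)
```

Expect ellipses for preg and non only. The ctrl group has just 3 samples, so stat_ellipse() should report that there are too few points and skip it.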
