I am trying to create a dendrogram using the package dendextend. It creates really nice ggplot2 dendrograms, but unfortunately when you turn one into a "circle", the labels do not rotate with it. I'll provide an example below.
My distance object is here: http://speedy.sh/JRVBS/mydist.RDS
library(dendextend)
library(ggplot2)
#library(devtools) ; install_github('kassambara/factoextra')
library(factoextra)
clus <- hcut(mydist, k = 6, hc_func = 'hclust',
             hc_method = 'ward.D2', graph = FALSE, isdiss = TRUE)
dend <- as.dendrogram(clus)
labels(dend) <- paste0(paste0(rep(' ', 3), collapse = ''), labels(dend))
dend <- sort(dend, decreasing = FALSE)
ggd1 <- ggplot(dend %>%
                 set('branches_k_color', k = 6) %>%
                 set('branches_lwd', 0.6) %>%
                 set('labels_colors', k = 6) %>%
                 set('labels_cex', 0.6),
               theme = theme_minimal(),
               horiz = TRUE)
ggd1 <- ggd1 + theme(panel.grid.major = element_blank(),
                     axis.text = element_blank(),
                     axis.title = element_blank())
ggd1 <- ggd1 + ylim(max(get_branches_heights(dend)), -3)
This gives me the image below, which is great. However, I want to turn it into a circle, so I use:
ggd1 + coord_polar(theta = 'x')
And I get this graph below. This is close to exactly what I want, but I just need to rotate the labels.
Any help is appreciated. I know that under the hood dendextend is basically creating a few data.frames and then calling geom_segment() and geom_text() on them to create the dendrogram and labels. I believe I can expose the associated data.frame as follows:
back.df1 <- dendextend::as.ggdend(dend)
back.df2 <- dendextend::prepare.ggdend(back.df1)
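For reference, here is a minimal sketch of inspecting that object. It uses the built-in USArrests data as a stand-in because mydist is not reproduced here, and demo_dend and demo_gg are illustrative names only. As far as I can tell, as.ggdend() returns a list whose labels element is a data frame with one row per leaf, which is the natural place to add an angle column.

```r
library(dendextend)

# Stand-in data (assumption): USArrests ships with R; mydist is not available here
demo_dend <- as.dendrogram(hclust(dist(USArrests[1:10, ]), method = "ward.D2"))

# Convert to the list of data frames that the ggplot method draws from
demo_gg <- as.ggdend(demo_dend)

names(demo_gg)       # components of the ggdend object
str(demo_gg$labels)  # per-leaf label data: position, text, colour, size
```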
Another tactic might be to use ggplot(labels = FALSE...) when plotting, and then add geom_text() manually in some way that preserves the coloring but allows me to use geom_text(angle = ).
I also suspect some combination of ggplot wizardry would allow me to take back.df2 and recreate the first and second plots while also controlling the angle of the labels. However, I do not know how to do any of this. I have already built out a lot using the dendextend package and would ideally like to avoid switching to a new package for creating dendrogram objects, because apart from the labels I really like it!
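To make the geom_text(angle = ) idea concrete, here is a standalone sketch that uses only ggplot2 and a made-up label data frame (lab, with leaves assumed to sit at x = 1..n). It is not dendextend's internal code. It shows the usual trick for polar plots: compute each label's clockwise position in degrees, turn that into a radial text angle, and flip the labels on the left half by 180 degrees (with hjust = 1) so none of them ends up upside down.

```r
library(ggplot2)

# Hypothetical stand-in for the label data frame: one row per leaf
n <- 12
lab <- data.frame(x = seq_len(n), y = 0, label = paste("leaf", seq_len(n)))

# Clockwise position of each leaf in degrees, measured from 12 o'clock,
# given that the x axis below spans exactly 0.5 .. n + 0.5
theta <- 360 * (lab$x - 0.5) / n

# Radial text angle; labels on the left half are flipped by 180 degrees
lab$angle <- ifelse(theta > 180, 270 - theta, 90 - theta)
lab$hjust <- ifelse(theta > 180, 1, 0)

p <- ggplot(lab, aes(x, y, label = label, angle = angle, hjust = hjust)) +
  geom_text(size = 3) +
  scale_x_continuous(limits = c(0.5, n + 0.5), expand = c(0, 0)) +
  ylim(-5, 2) +
  coord_polar(theta = "x") +
  theme_void()
p
```

The same angle and hjust columns could in principle be attached to the labels data frame of a ggdend object, provided the x limits used in the angle formula match the ones the plot actually uses.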
SOLUTION
I based this on the solution from Richard Telford below. I first created an edited version of ggplot.ggdend(), identical to the one provided in the answer below. I next created a function that automatically builds the angle and hjust vectors so that the label rotation switches from 6 o'clock to 12 o'clock to improve readability.
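createAngleHJustCols <- function(labeldf) {
  # One angle/hjust pair per label (labeldf has one row per leaf)
  nn <- length(labeldf$y)
  halfn <- floor(nn/2)
  # Angles spread over the full circle; the second set is the first flipped by 180 degrees
  firsthalf <- rev(90 + seq(0,360, length.out = nn))
  secondhalf <- rev(-90 + seq(0,360, length.out = nn))
  angle <- numeric(nn)
  angle[1:halfn] <- firsthalf[1:halfn]
  angle[(halfn+1):nn] <- secondhalf[(halfn+1):nn]
  # Flipped labels are right-justified (hjust = 1), the others left-justified (hjust = 0)
  hjust <- numeric(nn)
  hjust[1:halfn] <- 0
  hjust[(halfn+1):nn] <- 1
  return(list(angle = angle, hjust = hjust))
}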
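As a sanity check on what this function computes, here is a compact vectorised sketch of the same calculation. labelAngles is a hypothetical name and takes only the number of labels; it is meant as an illustration of the logic, not as a replacement for the function above.

```r
# Hypothetical helper: same angle/hjust logic, written with ifelse()
labelAngles <- function(n) {
  base <- rev(seq(0, 360, length.out = n))  # reversed sweep around the circle
  first <- seq_len(n) <= floor(n / 2)       # TRUE for the first half of the labels
  list(angle = ifelse(first, base + 90, base - 90),
       hjust = ifelse(first, 0, 1))
}

res <- labelAngles(4)
res$angle  # 450 330 30 -90
res$hjust  # 0 0 1 1
```

The first half of the labels keeps the +90 offset with hjust = 0, and the second half is turned by 180 degrees and right-justified, which is what keeps the text readable on both sides of the circle.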
I then produced the plot using the following code:
gdend <- dendextend::as.ggdend(dend %>%
                                 set('branches_k_color', k = 6) %>%
                                 set('branches_lwd', 0.6) %>%
                                 set('labels_colors', k = 6) %>%
                                 set('labels_cex', 0.6))
# horiz and polarplot are logical flags assumed to be defined earlier (they are not set in this snippet)
gdend$labels$angle <- ifelse(horiz, 0, 90)
gdend$labels$hjust <- 0
gdend$labels$vjust <- 0.5
# if polar, change the angle and hjust so that the labels rotate
if(polarplot) {
  newvalues <- createAngleHJustCols(gdend$labels)
  gdend$labels$angle <- newvalues[['angle']]
  gdend$labels$hjust <- newvalues[['hjust']]
}
ggresult <- newggplot.ggdend(gdend, horiz = TRUE, offset_labels = -2)
ggresult <- ggresult + ggtitle(plottitle)
ggresult <- ggresult + theme(plot.margin = margin(2, 2, 2, 2),
                             axis.text = element_blank(),
                             plot.title = element_text(margin = margin(10, 2, 2, 2)))
ggresult <- ggresult + ylim(max(get_branches_heights(dend)), -5)
ggresult <- ggresult + coord_polar(theta = 'x', direction = 1)
And that ultimately produced this final plot!
(I changed a couple things in the data so some of the order may appear different in the plot)
load("mydist.RDS") Error: bad restore file magic number (file may be corrupted) -- no data loaded. May be better to use dput and include in your question (unless object is huge) – Richard Telford
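A short sketch of the two sharing options mentioned in this comment, using a tiny made-up distance object (d) in place of mydist. One thing worth knowing: a file written with saveRDS() has to be read back with readRDS(), while load() expects a file written by save(). Whether that is what went wrong with the linked file is an assumption based only on its extension.

```r
# Hypothetical small distance object standing in for mydist
d <- dist(matrix(c(1, 2, 4, 7), ncol = 1))

# Round trip through an .RDS file: saveRDS() pairs with readRDS()
path <- tempfile(fileext = ".RDS")
saveRDS(d, path)
d2 <- readRDS(path)
all.equal(d, d2)

# Text representation that can be pasted straight into a question
dput(d)
```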