I am using the sparcl package (https://cran.r-project.org/web/packages/sparcl/sparcl.pdf) to plot dendrograms in R. In my specific problem, I cluster the groups according to one criterion and want to color the dendrogram by another, to show whether the clusters coincide with that second characteristic. With sparcl I have been able to highlight the nodes that I want to emphasize:

library(sparcl)

df <- read.delim("the_data_matrix.txt")
d <- dist(as.matrix(df))
hc <- hclust(d)
y <- rep("black", nrow(df))    # one color per observation, black by default
y[list_of_nodes$V1] <- "red"   # color only certain branches red, leaving the others black

If I plot with the standard plot function, I can control parameters such as label placement and text size with hang and cex, but I cannot color any branches (this is "Dendrogram 1" in the picture):

plot(hc,hang=-10,cex=.1)

On the other hand, if I plot with sparcl's ColorDendrogram function, I get a colored dendrogram but lose the formatting options (this is "Dendrogram 2" in the picture):

ColorDendrogram(hc, y = y, branchlength = 4)

ColorDendrogram gave me errors when I used hang and cex to control text size and placement.

[Image: Dendrogram 1 and Dendrogram 2]

My Question

Does anyone know how to fix this, either within the sparcl package or with another one? I would like the color flexibility of ColorDendrogram without losing the formatting capabilities.

Check out the ggtree package. It'll take a bit of learning, but once you get it, trees are much easier to plot. - jeremycg

1 Answer

Try the dendextend package (vignette), which should give you all the flexibility you need:

library(dendextend)

# Two example dendrograms from mtcars, built with different distance and linkage settings
d1 <- mtcars %>% dist %>% hclust %>% as.dendrogram
d2 <- mtcars %>% dist(method = "minkowski") %>% hclust(method = "single") %>% as.dendrogram
vals <- grep("Merc", rownames(mtcars), value = TRUE)  # highlight branches leading to "Merc..."

par(mfrow = c(2, 1))
# Color the selected branches, then adjust leaf hang and label size before plotting
d1 %>% set("by_labels_branches_col", value = vals) %>% set("hang_leaves", -10) %>% set("labels_cex", .1) %>% plot
d2 %>% set("by_labels_branches_col", value = vals) %>% plot
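
If you would rather not add a dependency, something along these lines should also work in base R. It is only a sketch: mtcars stands in for your data, the "Merc" selection is the same one used above, and color_leaf is a hypothetical helper name. Unlike by_labels_branches_col, it colors only the final edge into each selected leaf, not the whole path up the tree.

```r
# Base R sketch: color the edge into each selected leaf and keep control
# over label size. mtcars stands in for the real data matrix (assumption).
hc <- hclust(dist(mtcars))
dend <- as.dendrogram(hc, hang = -1)  # hang = -1 lines all leaves up at the bottom

# Labels to highlight (the "Merc..." rows, as in the dendextend example)
vals <- grep("Merc", rownames(mtcars), value = TRUE)

# color_leaf is a hypothetical helper; it changes leaf nodes only
color_leaf <- function(node) {
  if (is.leaf(node)) {
    col <- if (attr(node, "label") %in% vals) "red" else "black"
    attr(node, "edgePar") <- list(col = col)  # edge leading into this leaf
    attr(node, "nodePar") <- list(pch = NA, lab.cex = 0.6, lab.col = col)  # label size and color
  }
  node
}

dend_col <- dendrapply(dend, color_leaf)
plot(dend_col)
```

Here lab.cex plays the role of cex for the leaf labels, and the hang argument of as.dendrogram takes the place of hang in plot(hc, ...).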
