I am trying to plot the result of agglomerative clustering (UPGMA with Agnes) in the same 'style' as when plotting a tree using the package 'ape'. A simple example I include in the figure below
The key issue is that I want to be able to color the leaves of the dendrogram based on the a pattern in the labels of the leaves. I tried two approaches: either I used hc2Newick
or I used the code by Joris Meys as proposed in an answer to Change Dendrogram leaves . Both did not give a satisfactory output. It might be that I do not fully understand the way the dendrograms are constructed either. An ASCII save of the abundance.agnes.ave
object (stored from running agnes) can be found on https://www.dropbox.com/s/gke9qnvwptltkky/abundance.agnes.ave .
When I use the first option (with hc2Newick
from bioconductor's ctc
package) I get the following figure when using this code:
write(hc2Newick(as.hclust(abundance.agnes.ave)),file="all_samples_euclidean.tre")
eucltree<-read.tree(file="all_samples_euclidean.tre")
eucltree.laz<-ladderize(eucltree,FALSE)
tiplabs<-eucltree$tip.label
numbertiplabs<-length(tiplabs)
colourtips<-rep("green",numbertiplabs)
colourtips[grep("II",tiplabs)]<-"red"
plot(eucltree.laz,tip.color=colourtips,adj=1,cex=0.6,use.edge.length=F)
add.scale.bar()
This is obviously not ideal, the 'alignment' of the plot is not as I wanted. I supose this has to do with the branch length calculation however I do not have the foggiest idea how to solve this issue. Certainly when compared to the results of the colLab function, which look more like the dendrogram style I'd like to report. Also, using use.edge.length=T
in the code above does give a clustering that is not 'aligned' properly:
The second approach using Joris Meys' colLab function with the following code gives the next figure
clusDendro<-as.dendrogram(as.hclust(abundance.agnes.ave))
labelColors<-c("red","green")
clusMember<-rep(1,length(rownames(abundance.x)))
clusMember[grep("II",rownames(abundance.x))]<-2
names(clusMember)<-rownames(abundance.x)
colLab <- function(n)
{
if(is.leaf(n)) {
a <- attributes(n)
# clusMember - a vector designating leaf grouping
# labelColors - a vector of colors for the above grouping
labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
attr(n, "nodePar") <- c(a$nodePar, lab.col = labCol)
}
n
}
clusDendro<-dendrapply(clusDendro, colLab)
plot(clusDendro,horiz=T,axes=F)
This plot is getting closer to what I want, however I do not know why the open circles appear at the leaves and how to remove them.
Any help is much appreciated.
Kind regards,
FM