3
votes

From a data frame data.main, I can generate an hclust dendrogram as follows:

aa1<- c(2,4,6,8)
bb1<- c(1,3,7,11)
aa2<-c(3,6,9,12)
bb2<-c(3,5,7,9)
data.main<- data.frame(aa1,bb1,aa2,bb2)
d1<-dist(t(data.main))
hcl1<- hclust(d1)
plot(hcl1)

I know there are ways to color the branches or leaves using a tree cutoff. However, is it possible to color them based on partial column names or column numbers (e.g. I want the branches corresponding to aa1 and aa2 to be red and those for bb1 and bb2 to be blue)?

I have checked the R package dendextend but am still not able to find a direct/easy way to get the desired result.

[Figure: dendrogram with aa2 and bb2 clustered most closely. Then bb1 is next closest, followed by aa1. The labels and branches are colored based on the label. Those starting with "aa" are red and those starting with "bb" are blue.]

3
Please include a reproducible example with sample input data and describe what you would like the output to look like for that specific data. This will make it much easier to help you. – MrFlick
I have edited the question and hope that it is clearer now. – Polar.Ice
@MrFlick, sorry for the confusion. In the earlier edit, although I mentioned that "I want that branch corresponding to aa1, aa2 be red and bb1 and bb2 be blue", I didn't provide the right figure. – Polar.Ice

3 Answers

3
votes

It's easier to change colors for a dendrogram than an hclust object, but it's pretty straightforward to convert. You can do

drg1 <- dendrapply(as.dendrogram(hcl1, hang = 0.1), function(n) {
  if (is.leaf(n)) {
    # pick the color from the first letter of the leaf label
    labelCol <- c(a = "red", b = "blue")[substr(attr(n, "label"), 1, 1)]
    attr(n, "nodePar") <- list(pch = NA, lab.col = labelCol)
    attr(n, "edgePar") <- list(col = labelCol) # to color branch as well
  }
  n
})
plot(drg1)

which draws the dendrogram with the labels and leaf branches of aa1 and aa2 in red and those of bb1 and bb2 in blue.
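The question also asks about coloring by column number rather than by name. The following is a minimal sketch of that variant using the same dendrapply approach; `col_by_position` and `color_leaves` are hypothetical names introduced here for illustration, and the color vector is assumed to list one color per column in column order.

```r
# Sample data from the question
data.main <- data.frame(
  aa1 = c(2, 4, 6, 8),
  bb1 = c(1, 3, 7, 11),
  aa2 = c(3, 6, 9, 12),
  bb2 = c(3, 5, 7, 9)
)
hcl1 <- hclust(dist(t(data.main)))

# Assumed lookup: one color per column, in column order.
# Columns 1 and 3 (aa1, aa2) are red; columns 2 and 4 (bb1, bb2) are blue.
col_by_position <- c("red", "blue", "red", "blue")
names(col_by_position) <- colnames(data.main)

# Hypothetical helper: color each leaf label and its branch by looking up
# the leaf's label in a named color vector.
color_leaves <- function(dend, colors) {
  dendrapply(dend, function(n) {
    if (is.leaf(n)) {
      leaf_col <- unname(colors[attr(n, "label")])
      attr(n, "nodePar") <- list(pch = NA, lab.col = leaf_col)
      attr(n, "edgePar") <- list(col = leaf_col)
    }
    n
  })
}

drg2 <- color_leaves(as.dendrogram(hcl1, hang = 0.1), col_by_position)
plot(drg2)
```

Because the lookup is keyed by column name after being built by position, it does not matter that the dendrogram reorders the leaves.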

0
votes

UPDATE

I'm only leaving my answer because it is valid and someone might find OOMPA useful. However, after seeing the solution of using dendrapply as suggested by MrFlick, I recommend it instead. You might find other features of the OOMPA package useful, but I wouldn't install it just for functionality that already exists in core R.


Original Answer

Install OOMPA (Object-Oriented Microarray and Proteomics Analysis package):

source("http://silicovore.com/OOMPA/oompaLite.R")
oompaLite()

Then use the plotColoredClusters function from the library ClassDiscovery:

library(ClassDiscovery)
aa1<- c(2,4,6,8)
bb1<- c(1,3,7,11)
aa2<-c(3,6,9,12)
bb2<-c(3,5,7,9)
data.main<- data.frame(aa1,bb1,aa2,bb2)
d1<-dist(t(data.main))
hcl1<- hclust(d1)

# identify the labels
labels <- hcl1$labels

# Choose which ones are in the "aa" group
aa_present <- grepl("aa", labels)

colors <- ifelse(aa_present, "red", "blue")

plotColoredClusters(hcl1,labs=labels,cols=colors)

Result:

Cluster diagram with aa2 and aa1 both colored red while bb1 and bb2 are colored blue

0
votes

Polar.Ice, the dendextend package lets you do this with the assign_values_to_leaves_edgePar function.

Here is how to use it:

library(dendextend)

aa1 <- c(2,4,6,8)
bb1 <- c(1,3,7,11)
aa2 <- c(3,6,9,12)
bb2 <- c(3,5,7,9)
data.main <- data.frame(aa1,bb1,aa2,bb2)
d1 <- dist(t(data.main))
hcl1 <- hclust(d1)
# plot(hcl1)

dend <- as.dendrogram(hcl1)
col_aa_red <- ifelse(grepl("aa", labels(dend)), "red", "blue")
dend2 <- assign_values_to_leaves_edgePar(dend=dend, value = col_aa_red, edgePar = "col")
plot(dend2)

Result: a dendrogram in which the leaf branches of aa1 and aa2 are red and those of bb1 and bb2 are blue.
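The call above colors only the leaf branches. If the label text should match as well, as in the figure described in the question, one possible extension is dendextend's `labels_colors<-` setter. This is a sketch under the assumption that the colors are supplied in the order the leaves appear in the dendrogram, which is what `labels(dend)` returns; `leaf_cols` is a name introduced here for illustration.

```r
library(dendextend)

# Sample data from the question
data.main <- data.frame(
  aa1 = c(2, 4, 6, 8),
  bb1 = c(1, 3, 7, 11),
  aa2 = c(3, 6, 9, 12),
  bb2 = c(3, 5, 7, 9)
)
dend <- as.dendrogram(hclust(dist(t(data.main))))

# Colors in dendrogram leaf order: "aa*" labels red, everything else blue
leaf_cols <- ifelse(grepl("^aa", labels(dend)), "red", "blue")

# Color the leaf branches, then the label text to match
dend2 <- assign_values_to_leaves_edgePar(dend = dend, value = leaf_cols, edgePar = "col")
labels_colors(dend2) <- leaf_cols
plot(dend2)
```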