I'm using the ggtree package from Bioconductor to plot two phylogenetic trees. It works essentially like ggplot2, and I want to modify the aesthetics of the tip labels to match classes set by an external CSV file.
I have a multiPhylo object that contains two different clusterings of the same 50 genes (we'll pretend there are only 6 for this example). When I evaluate multitree[[1]]$tip.label and multitree[[2]]$tip.label, they both give me the same list in the same order, so I know that while the plots are displayed differently, the genes are still stored in the same order.
library(ggtree)
library(ape)
mat <- as.dist(matrix(data = rexp(36, rate = 10), nrow = 6, ncol = 6)) ### 36 values fill the 6 x 6 matrix exactly
nj.tree <- nj(mat) ### Package ape
hclust.tree <- as.phylo(hclust(mat))
multitree <- c(nj.tree, hclust.tree)
I want to plot these trees and then annotate them with external data, based on which of 5 classes (A, B, C, D, and E) each gene belongs to according to existing literature.
write.csv(multitree[[1]]$tip.label, "Genes.csv")
I used this command to create a CSV file listing each of the genes in the right order (not sure if that's relevant). I then manually entered the corresponding class letter in the column adjacent to each gene. It looks something like this:
Gene Class
1 A
2 A
3 D
4 C
5 B
6 E
And so on.
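To make the link between the CSV and the tree explicit, here is a minimal base-R sketch of the table I have in mind. It assumes the Gene column holds exactly the same strings as tip.label; the tip_labels vector, the tempfile path, and the names classes, gene_classes, and tip_class are stand-ins for illustration, not my real data.

```r
# Stand-in for multitree[[1]]$tip.label (assumption: the labels are "1" to "6")
tip_labels <- as.character(1:6)

# Hypothetical class table mirroring the hand-edited Genes.csv
classes <- data.frame(
  Gene  = tip_labels,
  Class = c("A", "A", "D", "C", "B", "E"),
  stringsAsFactors = FALSE
)

# Round-trip through a temporary CSV to mimic the external file
csv_path <- tempfile(fileext = ".csv")
write.csv(classes, csv_path, row.names = FALSE)
gene_classes <- read.csv(csv_path, colClasses = "character")

# Look up each tip's class by label rather than by row position
tip_class <- gene_classes$Class[match(tip_labels, gene_classes$Gene)]

# Every tip label should have found a class
stopifnot(!anyNA(tip_class))
```

Matching by label with match() means the lookup should still work even if the CSV rows end up in a different order than the tips.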
I want to annotate the tip label colors on my tree to correspond to the classes defined in my external CSV table. I know it would look something like geom_tiplab(aes(color=something something something)), but I don't know how to make it read the data inside my CSV and not the data within the multitree. Here's what my ggtree command looks like:
for (i in seq_along(multitree)) {
  myTree <- ggtree(multitree[[i]], aes(x, y)) +
    ggtitle(names(multitree)[i]) +
    geom_tiplab() + ### What I want to annotate with color
    theme_tree2() +
    coord_fixed(ratio = 0.5)
  print(myTree) ### Inside a for loop, print() forces the ggplot output to display
}
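For reference, one approach that might apply here is ggtree's %<+% operator, which attaches a data frame to a tree plot by matching the data frame's first column against the tip labels. The sketch below is a guess at how that could look for a single tree, not a confirmed solution: gene_classes is a hypothetical in-memory stand-in for the CSV, and the class letters are made up for the 6-gene example.

```r
library(ape)
library(ggtree)
library(ggplot2)

# Small reproducible tree, built the same way as in the example above
set.seed(1)
mat <- as.dist(matrix(rexp(36, rate = 10), nrow = 6, ncol = 6))
nj.tree <- nj(mat)

# Hypothetical stand-in for the CSV: the first column must match tip.label
gene_classes <- data.frame(
  Gene  = nj.tree$tip.label,
  Class = c("A", "A", "D", "C", "B", "E"),
  stringsAsFactors = FALSE
)

# %<+% joins the external table onto the plot data, so Class becomes
# available as an aesthetic in later layers
p <- ggtree(nj.tree) %<+% gene_classes +
  geom_tiplab(aes(color = Class)) +
  theme_tree2()

print(p)
```

With the real file, gene_classes would presumably come from read.csv("Genes.csv"), provided its first column contains the same strings as multitree[[i]]$tip.label.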
multitree[[1]]$tip.label? There are no corresponding values to match between them. – eipi10
multitree[[1]]$tip.label has values 1 through 50. There are no such values in your CSV example, so how does one figure out which tip.label corresponds to which row in the CSV file? Also, it would be helpful if you created a much smaller example, say, 5 or 6 tip labels, then created a data frame (analogous to your CSV file) that matches tip.label with Class (which is what it seems like you're trying to do). – eipi10