2
votes

I imported a Newick tree into R with ape (read.tree). The problem is that when I plot the tree the labels overlap because there are 1000 tips. I am not copying the tree in here because it is a very long expression. Is there a way to see which individual is at which tip of the tree?

I don't assume the 1000 specimens are placed at the tips sequentially from 1 to 1000; I expect the tree rearranges them, so I need the new sequence of tip labels or something similar.

The tree looks like this:

![enter image description here][1]

Just ignore the colours. I need the sequence of tips as drawn: for example, the first tip from the bottom might be tip 79, the one above it tip 82, then 87, and so on.

Consider a tree like this:

((((penHA34a,penHA34b,penHA32b,penHA32a,penSH30b,penSH30a,penSH28b,penSH28a,penIT13b,penIT13a,penIT12a,firSA26b,firGU7b,firGU8b,firSP18b,firSP20b,firSP36b,firSP39b,penSH31a,penSH31b),(firSP19b,(firSP17b,penIT12b))),firSA24a,firSA24b,firSA25a,firSA26a,firGU7a,firGU8a,firSP17a,firSP18a,firSP19a,firSP20a,firSP36a,firSP39a,(firSA25b,firSP40b),firSP40a,penIT11b,penIT11a),(ovi47a,ovi47b));

Also, how can I obtain a vector with just the tip labels from that?

Can you add a screenshot of what your tree looks like? - Jan Vladimir Mostert
They look like this: dropbox.com/s/uis0worcafcwluw/tree_pos.pdf?dl=0 (just ignore the colours on the tips). - janbrei
Consider posting a small reproducible example and applying the answer to your bigger case. - lukeA
@lukeA Did that, have a look. - janbrei
Try tree <- ape::read.tree(file = textConnection("((((penHA34a,penHA34b,penHA32b,penHA32a,penSH30b,penSH30a,penSH28b,penSH28a,penIT13b,penIT13a,penIT12a,firSA26b,firGU7b,firGU8b,firSP18b,firSP20b,firSP36b,firSP39b,penSH31a,penSH31b),(firSP19b,(firSP17b,penIT12b))),firSA24a,firSA24b,firSA25a,firSA26a,firGU7a,firGU8a,firSP17a,firSP18a,firSP19a,firSP20a,firSP36a,firSP39a,(firSA25b,firSP40b),firSP40a,penIT11b,penIT11a),(ovi47a,ovi47b));")); tree$tip.label - lukeA

2 Answers

3
votes

It is going to be tough to make the tips of a 1000-taxon phylogeny readable, but three things would help: 1) draw the tree as a fan, 2) decrease the size of the tip labels, and 3) decrease the figure margins.

# Load ape
library(ape)

# Generate 1000 taxa tree
tree <- rcoal(1000)

# Reduce figure margins to 0
par(mar=c(0,0,0,0))

# Plot fan tree with reduced tip label size
plot(tree, type="fan", cex=0.2)

I found the resulting tree to be readable, if you get real close and squint.
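
On the other half of the question (which individual sits at which tip), here is a minimal sketch using only ape. It is not part of the original answer. It assumes the tree is in the default "cladewise" order that read.tree returns and that it is plotted with the default type and direction; the small tree and its labels A to E are made-up stand-ins for the real data.

```r
library(ape)

# Small stand-in tree (hypothetical labels); replace with your own read.tree() call
tree <- read.tree(text = "((A,B),(C,(D,E)));")

# All tip labels as a plain character vector, in the order stored in the object
tree$tip.label

# Rows of the edge matrix that end in a tip; tips are numbered 1..Ntip(tree)
is_tip <- tree$edge[, 2] <= Ntip(tree)

# Tip labels in the order plot() draws them, bottom to top
# (assumes a cladewise tree and the default rightwards phylogram)
plot_order <- tree$tip.label[tree$edge[is_tip, 2]]
plot_order

# Position of one individual in the plot, counted from the bottom
which(plot_order == "D")
```

If the tree has been reordered (for example with ladderize), tree$tip.label keeps its original order while the edge matrix changes, which is why the lookup goes through tree$edge rather than reading tip.label directly.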

0
votes

data.tree is your friend. For example:

newick <- '((((penHA34a,penHA34b,penHA32b,penHA32a,penSH30b,penSH30a,penSH28b,penSH28a,penIT13b,penIT13a,penIT12a,firSA26b,firGU7b,firGU8b,firSP18b,firSP20b,firSP36b,firSP39b,penSH31a,penSH31b),(firSP19b,(firSP17b,penIT12b))),firSA24a,firSA24b,firSA25a,firSA26a,firGU7a,firGU8a,firSP17a,firSP18a,firSP19a,firSP20a,firSP36a,firSP39a,(firSA25b,firSP40b),firSP40a,penIT11b,penIT11a),(ovi47a,ovi47b));'
library(data.tree)
library(ape)
phylo <- read.tree(text = newick)
tree <- as.Node(phylo)

#find a specific individual:
tree$FindNode('firSA24b')$path

This will give you:

[1] "43/44/firSA24b"

You can print the entire tree or output it to a file to look for a particular node:

print(tree)
#print a sub-tree
tree$FindNode('45')
#print only part of the tree:
print(tree, pruneMethod = "dist", limit = 25)
#slightly more sophisticated:
print(tree, pruneFun = function(node) !node$isLeaf || node$position <= 5)
#or:
print(tree, pruneFun = function(node) !node$isLeaf || substr(node$name, 1, 4) == 'firS')

The last statement outputs this:

                      levelName
1  43                          
2   ¦--44                      
3   ¦   ¦--45                  
4   ¦   ¦   ¦--46              
5   ¦   ¦   ¦   ¦--firSA26b    
6   ¦   ¦   ¦   ¦--firSP18b    
7   ¦   ¦   ¦   ¦--firSP20b    
8   ¦   ¦   ¦   ¦--firSP36b    
9   ¦   ¦   ¦   °--firSP39b    
10  ¦   ¦   °--47              
11  ¦   ¦       ¦--firSP19b    
12  ¦   ¦       °--48          
13  ¦   ¦           °--firSP17b
14  ¦   ¦--firSA24a            
15  ¦   ¦--firSA24b            
16  ¦   ¦--firSA25a            
17  ¦   ¦--firSA26a            
18  ¦   ¦--firSP17a            
19  ¦   ¦--firSP18a            
20  ¦   ¦--firSP19a            
21  ¦   ¦--firSP20a            
22  ¦   ¦--firSP36a            
23  ¦   ¦--firSP39a            
24  ¦   ¦--49                  
25  ¦   ¦   ¦--firSA25b        
26  ¦   ¦   °--firSP40b        
27  ¦   °--firSP40a            
28  °--50    
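
For comparison, a similar prefix filter can be sketched directly on the ape object, without converting to data.tree. This is an illustration only: the Newick string below is a shortened stand-in built from a few labels in the question, and it returns just the matching tip labels, not the surrounding tree structure.

```r
library(ape)

# Shortened stand-in tree using a few labels from the question
phylo <- read.tree(text = "((firSA24a,firSP17a),(penIT11a,(ovi47a,ovi47b)));")

# Tip labels that start with "firS"
firS_tips <- grep("^firS", phylo$tip.label, value = TRUE)
firS_tips

# Tip numbers of those labels, as used in phylo$edge
firS_index <- which(phylo$tip.label %in% firS_tips)
firS_index
```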

Finding tips of the entire tree is also easy:

Get(tree$leaves, "name")

This will yield:

  penHA34a   penHA34b   penHA32b   penHA32a   penSH30b   penSH30a   penSH28b   penSH28a   penIT13b   penIT13a   penIT12a   firSA26b    firGU7b 
"penHA34a" "penHA34b" "penHA32b" "penHA32a" "penSH30b" "penSH30a" "penSH28b" "penSH28a" "penIT13b" "penIT13a" "penIT12a" "firSA26b"  "firGU7b" 
   firGU8b   firSP18b   firSP20b   firSP36b   firSP39b   penSH31a   penSH31b   firSP19b   firSP17b   penIT12b   firSA24a   firSA24b   firSA25a 
 "firGU8b" "firSP18b" "firSP20b" "firSP36b" "firSP39b" "penSH31a" "penSH31b" "firSP19b" "firSP17b" "penIT12b" "firSA24a" "firSA24b" "firSA25a" 
  firSA26a    firGU7a    firGU8a   firSP17a   firSP18a   firSP19a   firSP20a   firSP36a   firSP39a   firSA25b   firSP40b   firSP40a   penIT11b 
"firSA26a"  "firGU7a"  "firGU8a" "firSP17a" "firSP18a" "firSP19a" "firSP20a" "firSP36a" "firSP39a" "firSA25b" "firSP40b" "firSP40a" "penIT11b" 
  penIT11a     ovi47a     ovi47b 
"penIT11a"   "ovi47a"   "ovi47b" 

Or, you can do the same thing for a specific path:

Get(tree$Climb(44, 45, 47, 48)$children, "name")