2. Getting a piece of the Synthetic Open Tree of Life
Overview
Teaching: 5 min
Exercises: 5 min
Questions
What is the synthetic Open Tree of Life?
How do I interact with it?
Why is my taxon not in the tree?
Objectives
Get an induced subtree
Get a subtree
The synthetic Open Tree of Life (synthetic OTOL from now on) summarizes information from 1216 trees from 1162 peer-reviewed, published studies that have been uploaded to the OTOL database through a curator system.
Functions from the rotl package that interact with the synthetic OTOL start with “tol_”.
To access general information about the current synthetic OTOL, we can use the function tol_about(). This function requires no argument.
rotl::tol_about()
OpenTree Synthetic Tree of Life.
Tree version: opentree12.3
Taxonomy version: 3.2draft9
Constructed on: 2019-12-23 11:41:23
Number of terminal taxa: 2391916
Number of source trees: 1216
Number of source studies: 1162
Source list present: false
Root taxon: cellular organisms
Root ott_id: 93302
Root node_id: ott93302
This is nice!
As you can see, this version of the synthetic OTOL was constructed on 2019-12-23 11:41:23.
The output also tells us that there are more than 2 million tips (2391916 terminal taxa) on the synthetic OTOL.
It is indeed a large tree. So, what if we just want a small piece of the whole synthetic OTOL?
Well, now that we have some interesting taxon OTT ids, we can easily do this.
Getting an induced subtree
The function tol_induced_subtree() allows us to get a tree of taxa from different taxonomic ranks.
my_tree <- rotl::tol_induced_subtree(resolved_names$ott_id)
Warning in collapse_singles(tr, show_progress): Dropping singleton nodes
with labels: Mammalia ott244265, Theria (subclass in Deuterostomia)
ott229558, Eutheria (in Deuterostomia) ott683263, Boreoeutheria ott5334778,
Laurasiatheria ott392223, mrcaott1548ott6790, mrcaott1548ott3607484,
mrcaott1548ott4942380, mrcaott1548ott4942547, mrcaott1548ott3021, Artiodactyla
ott622916, mrcaott1548ott21987, mrcaott1548ott5256, mrcaott5256ott4944931,
Whippomorpha ott7655791, Cetacea ott698424, mrcaott5256ott3615450,
mrcaott5256ott44568, Odontoceti ott698417, mrcaott5256ott5269,
mrcaott5269ott6470, mrcaott5269ott47843, mrcaott47843ott194312,
mrcaott4697ott263949, Carnivora ott44565, Caniformia ott827263,
Canidae ott770319, mrcaott47497ott3612617, mrcaott47497ott3612529,
mrcaott47497ott3612596, mrcaott47497ott3612516, mrcaott47497ott3612589,
mrcaott47497ott3612591, mrcaott47497ott3612592, mrcaott47497ott77889,
Feliformia ott827259, mrcaott6940ott19397, mrcaott19397ott194349, Felidae
ott563159, mrcaott54737ott660452, mrcaott54737ott86170, mrcaott54737ott86175,
mrcaott54737ott442049, mrcaott54737ott86162, mrcaott54737ott86166, Sauropsida
ott639642, Sauria ott329823, mrcaott246ott4128455, mrcaott246ott4127082,
mrcaott246ott4129629, mrcaott246ott4142716, mrcaott246ott4126667,
mrcaott246ott1662, mrcaott246ott2982, mrcaott246ott31216, mrcaott246ott4947920,
mrcaott246ott4127428, mrcaott246ott4126230, mrcaott246ott4127421,
mrcaott246ott664349, mrcaott246ott4126505, mrcaott246ott4127015,
mrcaott246ott4129653, mrcaott246ott4127541, mrcaott246ott4946623,
mrcaott246ott4126482, mrcaott246ott4128105, mrcaott246ott4127288,
mrcaott246ott4132146, mrcaott246ott3602822, mrcaott246ott4143599,
mrcaott246ott3600976, mrcaott246ott4132107, Aves ott81461, Neognathae
ott241846, mrcaott246ott5481, mrcaott246ott5021, mrcaott246ott7145,
mrcaott246ott5272, mrcaott5272ott9830, mrcaott9830ott86672, mrcaott9830ott90560,
mrcaott9830ott18206, mrcaott18206ott60413, Sphenisciformes ott494366
Note: What does this warning mean?
This warning has to do with the way the synthetic OTOL is generated. You can look at the overview of the synthesis algorithm for more information.
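To see what a singleton node is, here is a minimal offline sketch. It assumes the ape package is installed and uses a hand-written toy tree (the labels A, B, C, X, S and R are made up for illustration). rotl does its own collapsing when it reads the induced subtree; ape::collapse.singles() is used here only to illustrate the same idea, not to reproduce rotl's internals.

```r
library(ape)

# Toy Newick string: node S has a single child (X), so S is a "singleton" node.
toy <- read.tree(text = "(((A,B)X)S,C)R;")

has.singles(toy)   # does any node have exactly one descendant?
Nnode(toy)         # number of internal nodes before collapsing

# Remove nodes that have a single descendant; the tips stay the same.
toy_collapsed <- collapse.singles(toy)

has.singles(toy_collapsed)
Nnode(toy_collapsed)
Ntip(toy_collapsed)
```

In the induced subtree above, nodes such as Mammalia or Aves end up with a single child once the tree is reduced to our five taxa, which is why they are listed in the warning and dropped.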
Let’s look at the output of tol_induced_subtree().
my_tree
Phylogenetic tree with 5 tips and 4 internal nodes.
Tip labels:
[1] "Delphinidae_ott698406" "mrcaott47497ott110766" "Felis_ott563165"
[4] "Spheniscidae_ott494367" "Amphibia_ott544595"
Node labels:
[1] "Tetrapoda ott229562" "Amniota ott229560" "mrcaott1548ott4697"
[4] "mrcaott4697ott6940"
Rooted; no branch lengths.
R is telling us that we have a rooted tree with no branch lengths and 5 tips. If we check the class of the output, we can verify that it is a ‘phylo’ object.
class(my_tree)
[1] "phylo"
A ‘phylo’ object is a data structure that stores the necessary information to build a tree.
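To make this more concrete, the sketch below builds a small ‘phylo’ object offline and looks inside it. It assumes the ape package is installed. The Newick string is typed in by hand to loosely mirror the topology of my_tree (with plain names instead of OTT labels); it is an illustration, not output from the Open Tree of Life.

```r
library(ape)

# Hand-written toy tree; in the lesson, the tree comes from tol_induced_subtree().
toy_tree <- read.tree(text = "(Amphibia,(Spheniscidae,(Delphinidae,(Canis,Felis))));")

class(toy_tree)      # the same class as my_tree
names(toy_tree)      # the components stored in the object
toy_tree$tip.label   # character vector of tip names
toy_tree$Nnode       # number of internal nodes
toy_tree$edge        # two-column matrix: parent node, child node (one row per branch)
```

A ‘phylo’ object is essentially a list: the edge matrix describes which node connects to which, and tip.label gives names to the tips. Branch lengths, when present, are stored in an additional edge.length component.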
There are several functions from different packages to plot trees or ‘phylo’ objects in R (e.g., phytools). For now, we will use the one from the legendary ape package, plot.phylo():
ape::plot.phylo(my_tree, cex = 2) # or just plot(my_tree, cex = 2)
This is cool!
But, why oh why did my Canis disappear? 😢
Well, it did not actually disappear, it was replaced by the label “mrcaott47497ott110766”.
We will explain why this happens in the next section.
Now, what if you want a piece of the synthetic OTOL containing all descendants of your taxa of interest?
Getting a subtree of one taxon
We can extract a subtree of all descendants of one taxon at a time using the function tol_subtree() and an OTT id of your choosing.
Let’s extract a subtree of all amphibians.
First, get its OTT id. It is already stored in our resolved_names object:
amphibia_ott_id <- resolved_names["Amphibia",]$ott_id
Or, you can run the function tnrs_match_names() again if you want.
amphibia_ott_id <- rotl::tnrs_match_names("amphibians")$ott_id
Now, extract the subtree from the synthetic OTOL using tol_subtree().
amphibia_subtree <- rotl::tol_subtree(ott_id = resolved_names["Amphibia",]$ott_id)
Let’s look at the output:
amphibia_subtree
Phylogenetic tree with 10012 tips and 3100 internal nodes.
Tip labels:
Odorrana_geminata_ott114, Odorrana_supranarina_ott14375, Odorrana_narina_ott14379, Odorrana_amamiensis_ott14384, Odorrana_utsunomiyaorum_ott14377, Odorrana_swinhoana_ott14392, ...
Node labels:
Amphibia ott544595, Batrachia ott471197, Anura ott991547, , , , ...
Unrooted; no branch lengths.
This is a large tree! We will have a hard time plotting it.
Now, let’s extract a subtree for the genus Canis. It should be way smaller!
subtree <- rotl::tol_subtree(resolved_names["Canis",]$ott_id)
Error: HTTP failure: 400
list(contesting_trees = list(`ot_278@tree1` = list(attachment_points = list(list(children_from_taxon = list("node242"), parent = "node241"), list(children_from_taxon = list("node244"), parent = "node243"), list(children_from_taxon = list("node262"), parent = "node255"), list(children_from_taxon = list("node270"), parent = "node267"))), `ot_328@tree1` = list(attachment_points = list(list(children_from_taxon = list("node519"), parent = "node518"), list(children_from_taxon = list("node523"), parent = "node522")))),
mrca = "mrcaott47497ott110766")[/v3/tree_of_life/subtree] Error: node_id was not found (broken taxon).
😱 😱 😱
What does this error mean??
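A “broken” taxon error usually happens when phylogenetic information does not match taxonomic information.
For example, extinct lineages are sometimes phylogenetically included within a taxon but are taxonomically excluded, which makes the taxon appear paraphyletic.
The error output hints at this: it lists the contesting source trees (ot_278@tree1 and ot_328@tree1) and an mrca node, mrcaott47497ott110766, which is the same label that replaced Canis in our induced subtree.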
On the Open Tree of Life browser, we can still get to the subtree.
From R, we will need to do something else first. We will get to that in the next episode.
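In the meantime, one defensive option is to catch the error so that a script can keep running. The sketch below is an illustration under stated assumptions, not part of rotl: safe_subtree() is a hypothetical helper name, and its fetch argument exists only so that the example can be run offline with a stand-in function.

```r
# Hypothetical helper: try to get a subtree, return NULL instead of stopping on error.
# `fetch` defaults to rotl::tol_subtree, but any function that takes `ott_id` works.
safe_subtree <- function(ott_id, fetch = rotl::tol_subtree) {
  tryCatch(
    fetch(ott_id = ott_id),
    error = function(e) {
      message("Could not get a subtree for OTT id ", ott_id, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Offline stand-in that always fails, mimicking the "broken taxon" error above.
always_broken <- function(ott_id) stop("HTTP failure: 400 (broken taxon)")

# 1 is a placeholder id for this demonstration, not a real taxon of interest.
result <- safe_subtree(1, fetch = always_broken)
is.null(result)
```

With rotl installed and a network connection, calling safe_subtree(resolved_names["Canis",]$ott_id) would be expected to print the message and return NULL rather than halt, though that depends on the state of the synthetic tree at the time.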
Key Points
OTT ids and node ids allow us to interact with the synthetic OTOL.
Portions of the synthetic OTOL can be extracted from a single OTT id or from a bunch of them.
It is not possible to get a subtree from an OTT id that is not in the synthetic tree.