Open Access Repository

Distinguishing convergence on phylogenetic networks



Mitchell, JD 2016, 'Distinguishing convergence on phylogenetic networks', PhD thesis, University of Tasmania.

PDF (Whole thesis)
Available under University of Tasmania Standard License.



In phylogenetics, the evolutionary history of a group of taxa, for example, groups of species,
genera or subspecies, can be modelled using a phylogenetic tree. Alternatively, we can model
evolutionary history with a phylogenetic network. On phylogenetic networks, edges that have
previously evolved independently from a common ancestor may subsequently converge for a
period of time. Examples of processes in biology that are better represented by networks than
trees are hybridisation, horizontal gene transfer and recombination.
Molecular phylogenetics uses information in biological sequences, for example, sequences of
DNA nucleotides, to infer a phylogenetic tree or network. This requires models of character
substitution. One class of such models is the Abelian group-based models. The rate matrices
of the Abelian group-based models can be diagonalised by a process often referred to in the
literature as Hadamard conjugation. The time-dependent probability distributions representing the
probabilities of each combination of states across all taxa at any site in the sequence are referred
to as phylogenetic tensors. The phylogenetic tensors representing a given tree or network can be
expressed in the diagonalised basis, which may allow them to be analysed more easily. We look at
the diagonalising matrices of various Abelian group-based models in this thesis.
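
As a small illustration of the diagonalisation described above, the following sketch works through the simplest Abelian group-based model, the binary symmetric model. It is not taken from the thesis: the rate convention (a single substitution rate `rate` in each direction) and all function names are assumptions made for this example. It conjugates the 2x2 rate matrix by the Hadamard matrix and rebuilds the transition matrix from the diagonal form.

```python
import math

# 2x2 Hadamard matrix and its inverse (H / 2).
H = [[1, 1], [1, -1]]
H_INV = [[0.5, 0.5], [0.5, -0.5]]


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def binary_symmetric_rate_matrix(rate):
    """Rate matrix of the binary symmetric model (assumed convention)."""
    return [[-rate, rate], [rate, -rate]]


def diagonalise(q):
    """Return H^-1 Q H, which is diagonal for the binary symmetric model."""
    return matmul(matmul(H_INV, q), H)


def transition_matrix(rate, t):
    """Compute exp(Q t) through the diagonal basis: H diag(1, e^{-2 rate t}) H^-1."""
    diag = [[1.0, 0.0], [0.0, math.exp(-2.0 * rate * t)]]
    return matmul(matmul(H, diag), H_INV)


Q = binary_symmetric_rate_matrix(0.3)
D = diagonalise(Q)          # eigenvalues 0 and -2 * rate on the diagonal
P = transition_matrix(0.3, 1.0)
```

In the diagonal basis the matrix exponential reduces to exponentiating each eigenvalue separately, which is the practical benefit the abstract alludes to.
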
We compare the phylogenetic tensors for various trees and networks for two, three and four
taxa. If the probability spaces between one tree or network and another are not identical then
there will be phylogenetic tensors that could have arisen on one but not the other. We call these
two trees or networks distinguishable from each other. We show that for the binary symmetric
model there are no two-taxon trees and networks that are distinguishable from each other;
however, there are three-taxon trees and networks that are distinguishable from each other.
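
To make the notion of a phylogenetic tensor concrete, here is a minimal sketch of the two-taxon tensor on a tree under the binary symmetric model. It covers only the tree case and does not reproduce the network parametrisation used in the thesis; the uniform root distribution, the rate convention and the function names are assumptions made for this example. It shows that the tensor depends on the two branch lengths only through their sum, which gives some intuition for how little information two taxa carry.

```python
import math


def transition_matrix(rate, t):
    """Binary symmetric transition matrix for elapsed time t (assumed convention)."""
    decay = math.exp(-2.0 * rate * t)
    same, diff = (1.0 + decay) / 2.0, (1.0 - decay) / 2.0
    return [[same, diff], [diff, same]]


def two_taxon_tensor(rate, t1, t2):
    """Joint distribution of the states at two leaves of a two-taxon tree.

    The root state is drawn uniformly from {0, 1}, then evolves independently
    along branches of length t1 and t2.
    """
    m1 = transition_matrix(rate, t1)
    m2 = transition_matrix(rate, t2)
    root = [0.5, 0.5]
    return [
        [sum(root[r] * m1[r][i] * m2[r][j] for r in range(2)) for j in range(2)]
        for i in range(2)
    ]


# Two different splits of the same total branch length.
tensor_a = two_taxon_tensor(0.3, 0.1, 0.4)
tensor_b = two_taxon_tensor(0.3, 0.25, 0.25)
```

Both calls give the same 2x2 tensor, because the probability that the two leaves differ is `(1 - exp(-2 * rate * (t1 + t2))) / 2`.
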
We compare the time parameters of the phylogenetic tensors for various permutations of the
taxon labels on a given tree or network. If the time parameters for one taxon label permutation,
expressed in terms of those for the other taxon label permutation, are all non-negative, then we
say that the two taxon label permutations are not network identifiable from each other. We show that some taxon label
permutations are network identifiable from each other.
We show that some four-taxon networks do not satisfy the four-point condition, while others
do. There are two “structures” of four-taxon rooted trees. One of these structures is defined by
the cluster {b, c, d}, where the taxa are labelled alphabetically from left to right, starting with a.
The network with this structure and convergence between the two taxa with the root as their
most recent common ancestor satisfies the four-point condition.
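
For readers unfamiliar with the four-point condition, the sketch below checks its standard form on a table of pairwise distances: of the three ways of pairing up four taxa, the two largest sums of paired distances must be equal. How distances are obtained from the phylogenetic tensors in the thesis is not reproduced here; the distance tables, the tolerance and the function names are hypothetical and chosen only for illustration.

```python
def make_distance(table):
    """Wrap a dict keyed by unordered taxon pairs as a symmetric distance function."""
    lookup = {frozenset(pair): value for pair, value in table.items()}
    return lambda x, y: lookup[frozenset((x, y))]


def satisfies_four_point(dist, taxa, tol=1e-9):
    """Return True if the two largest of the three pairing sums are equal."""
    a, b, c, d = taxa
    sums = sorted([
        dist(a, b) + dist(c, d),
        dist(a, c) + dist(b, d),
        dist(a, d) + dist(b, c),
    ])
    return abs(sums[2] - sums[1]) <= tol


# Distances read off a tree ((a,b),(c,d)) with pendant edges 1 and internal edge 2.
tree_like = make_distance({
    ("a", "b"): 2, ("c", "d"): 2,
    ("a", "c"): 4, ("a", "d"): 4, ("b", "c"): 4, ("b", "d"): 4,
})

# A perturbed table in which a-d and b-c are shortened, so no tree fits exactly.
not_tree_like = make_distance({
    ("a", "b"): 2, ("c", "d"): 2,
    ("a", "c"): 4, ("b", "d"): 4, ("a", "d"): 3, ("b", "c"): 3,
})

taxa = ("a", "b", "c", "d")
tree_ok = satisfies_four_point(tree_like, taxa)
perturbed_ok = satisfies_four_point(not_tree_like, taxa)
```

The first table gives pairing sums 4, 8 and 8 and passes; the second gives 4, 6 and 8 and fails.
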
The phylogenetic tensors contain polynomial equations that cannot be easily solved for four-taxon
or higher trees or networks. We show how methods from algebraic geometry, such as
Gröbner bases, can be used to solve the polynomial equations. We show that some four-taxon
trees and networks can be distinguished from each other.

Item Type: Thesis - PhD
Authors/Creators: Mitchell, JD
Keywords: phylogenetics, phylogenetic networks, convergence, identifiability, Markov models, algebraic geometry, abelian groups, phylogenetic tensors
Copyright Information:

Copyright 2016 the Author
