Open Access Repository
Distinguishing convergence on phylogenetic networks
PDF (Whole thesis): Mitchell_whole_...pdf (1MB). Available under University of Tasmania Standard License.
Abstract
In phylogenetics, the evolutionary history of a group of taxa, for example, groups of species,
genera or subspecies, can be modelled using a phylogenetic tree. Alternatively, we can model
evolutionary history with a phylogenetic network. On phylogenetic networks, edges that have
previously evolved independently from a common ancestor may subsequently converge for a
period of time. Examples of processes in biology that are better represented by networks than
trees are hybridisation, horizontal gene transfer and recombination.
Molecular phylogenetics uses information in biological sequences, for example, sequences of
DNA nucleotides, to infer a phylogenetic tree or network. This requires models of character
substitution. One class of these models is the Abelian group-based models. The rate matrices
of the Abelian group-based models can be diagonalised in a process often referred to in the
literature as Hadamard conjugation. The time-dependent probability distributions representing the
probabilities of each combination of states across all taxa at any site in the sequence are referred
to as phylogenetic tensors. The phylogenetic tensors representing a given tree or network can be
expressed in the diagonalised basis, which may allow them to be analysed more easily. We look at
the diagonalising matrices of various Abelian group-based models in this thesis.
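As an illustration of the diagonalisation described above, here is a minimal sketch for the simplest Abelian group-based model, the binary symmetric model. It assumes a single substitution rate `alpha` and the 2x2 Hadamard matrix as the diagonalising matrix; the function names and the numerical value of `alpha` are chosen for illustration and are not taken from the thesis.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# 2x2 Hadamard matrix and its inverse (H^{-1} = H / 2).
H = [[1, 1], [1, -1]]
H_INV = [[0.5, 0.5], [0.5, -0.5]]

def rate_matrix(alpha):
    """Rate matrix of the binary symmetric model (assumed rate alpha)."""
    return [[-alpha, alpha], [alpha, -alpha]]

def hadamard_conjugate(q):
    """Return H q H^{-1}, which is diagonal for the binary symmetric model."""
    return matmul(matmul(H, q), H_INV)

def transition_matrix(alpha, t):
    """exp(Qt), computed in the diagonal basis: H^{-1} diag(1, e^{-2 alpha t}) H."""
    d = [[1.0, 0.0], [0.0, math.exp(-2 * alpha * t)]]
    return matmul(matmul(H_INV, d), H)

diag_q = hadamard_conjugate(rate_matrix(0.3))   # eigenvalues 0 and -2*alpha on the diagonal
p = transition_matrix(0.3, 1.0)                 # rows sum to 1
```

In the diagonal basis the matrix exponential reduces to exponentiating the two eigenvalues, which is why working in that basis can simplify the analysis.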
We compare the phylogenetic tensors for various trees and networks for two, three and four
taxa. If the probability spaces of two trees or networks are not identical, then
there are phylogenetic tensors that could have arisen on one but not the other. We call two
such trees or networks distinguishable from each other. We show that for the binary symmetric
model no two-taxon trees and networks are distinguishable from each other,
whereas some three-taxon trees and networks are distinguishable from each other.
We compare the time parameters for the phylogenetic tensors for various taxon label permutations
on a given tree or network. If the time parameters for one taxon label permutation, expressed in
terms of those for the other taxon label permutation, are all non-negative, then we say that the two taxon
label permutations are not network identifiable from each other. We show that some taxon label
permutations are network identifiable from each other.
We show that some four-taxon networks do not satisfy the four-point condition, while others
do. There are two “structures” of four-taxon rooted trees. One of these structures is defined by
the cluster {b,c,d}, where the taxa are labelled alphabetically from left to right, starting with a.
The network with this structure and convergence between the two taxa with the root as their
most recent common ancestor satisfies the four-point condition.
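For readers unfamiliar with the four-point condition mentioned above, the following sketch checks it in its standard form: for every four taxa, the two largest of the three pairwise distance sums must be equal. The distance values and the function name are hypothetical and only illustrate the check; they are not distances derived in the thesis.

```python
from itertools import combinations

def satisfies_four_point(dist, taxa, tol=1e-9):
    """Check the four-point condition for a symmetric distance table.

    dist maps frozenset({x, y}) to the distance between taxa x and y.
    """
    for a, b, c, d in combinations(taxa, 4):
        sums = sorted([
            dist[frozenset((a, b))] + dist[frozenset((c, d))],
            dist[frozenset((a, c))] + dist[frozenset((b, d))],
            dist[frozenset((a, d))] + dist[frozenset((b, c))],
        ])
        # The two largest sums must coincide (up to tolerance).
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

def make_dist(pairs):
    """Build the distance table from {(x, y): value} entries."""
    return {frozenset(k): v for k, v in pairs.items()}

taxa = ["a", "b", "c", "d"]

# Hypothetical tree-like distances: tree ((a,b),(c,d)) with pendant
# edges of length 1 and an internal edge of length 2.
tree_like = make_dist({("a", "b"): 2, ("c", "d"): 2,
                       ("a", "c"): 4, ("a", "d"): 4,
                       ("b", "c"): 4, ("b", "d"): 4})

# Hypothetical distances that form a metric but fit no tree.
not_tree_like = make_dist({("a", "b"): 2, ("c", "d"): 2,
                           ("a", "c"): 1.5, ("a", "d"): 1.5,
                           ("b", "c"): 1.5, ("b", "d"): 1.5})

tree_ok = satisfies_four_point(tree_like, taxa)
non_tree_ok = satisfies_four_point(not_tree_like, taxa)
```

With the first table the three sums are 4, 8 and 8, so the condition holds; with the second they are 4, 3 and 3, so it fails.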
The phylogenetic tensors contain polynomial equations that cannot be easily solved for trees or
networks with four or more taxa. We show how methods from algebraic geometry, such as
Gröbner bases, can be used to solve the polynomial equations. We show that some four-taxon
trees and networks can be distinguished from each other.
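To indicate how a Gröbner basis helps solve a polynomial system, here is a small worked example. The system is a toy one chosen for illustration and is not one of the systems that arise from the phylogenetic tensors in the thesis.

```latex
% Toy system in two unknowns, lexicographic order with x > y:
\begin{align*}
f_1 &= x + y - 3, \\
f_2 &= xy - 2.
\end{align*}
% Reducing f_2 by f_1 eliminates x:
\begin{align*}
f_2 - y\,f_1 &= xy - 2 - y(x + y - 3) = -(y^2 - 3y + 2).
\end{align*}
% The leading terms x and y^2 are coprime, so the S-polynomial of
% f_1 and y^2 - 3y + 2 reduces to zero, and the reduced Groebner basis is
\begin{align*}
G &= \{\, x + y - 3,\; y^2 - 3y + 2 \,\}.
\end{align*}
% The second element involves y alone: y = 1 or y = 2.
% Back-substitution into the first gives (x, y) = (2, 1) or (1, 2).
```

The basis has a triangular shape: one element involves only the last variable, so the system can be solved one variable at a time.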
Item Type:  Thesis  PhD 

Authors/Creators:  Mitchell, JD 
Keywords:  phylogenetics, phylogenetic networks, convergence, identifiability, Markov models, algebraic geometry, abelian groups, phylogenetic tensors 
Copyright Information:  Copyright 2016 the Author 