Open Access Repository

Understanding the molecular epidemiology of Mycobacterium tuberculosis infection from whole-genome analyses



Gautam, S (ORCID: 0000-0003-0187-6429) 2019, 'Understanding the molecular epidemiology of Mycobacterium tuberculosis infection from whole-genome analyses', PhD thesis, University of Tasmania.



Tuberculosis (TB) is a major cause of mortality worldwide: species of the Mycobacterium tuberculosis complex caused 1.6 million deaths in 2017 alone. Although Australia is a low TB burden country, its current annual incidence must drop by approximately 98% to achieve the World Health Organization’s TB elimination target by 2050. From a public health perspective, challenges remain in tracing the source of TB and identifying the factors underlying outbreaks of the disease. The availability of whole-genome sequencing (WGS) has enabled the exploration of the functional and evolutionary genomics of Mycobacterium tuberculosis, the detection of mutations that confer drug resistance, and the investigation of epidemiological clusters.
In this thesis, genomic analysis was applied to a dominant outbreak strain of M. tuberculosis, the “Rangipo” strain from New Zealand. Whole-genome sequencing of nine isolates representing the Rangipo genotype was performed on an Illumina MiSeq platform. The sequence data of each isolate were mapped to the reference M. tuberculosis strain H37Rv to generate a consensus genome. Compared with H37Rv, 700 single locus variants were present across all of the outbreak isolates. These included polymorphisms in 12 loci involved in the virulence of M. tuberculosis. For example, a non-synonymous polymorphism was found in the two-component response regulator phoR gene, which is essential for the growth of M. tuberculosis in macrophages and mice. Furthermore, de novo assembly was performed on the sequence reads that did not map to the reference genome H37Rv. This detected five additional virulence-related genes in the outbreak strain that were absent in H37Rv. These included the transcriptional regulator EmbR2 and molybdopterin cofactor (MoCo) biosynthesis proteins A and B. MoCo is the cofactor for the narGHI-encoded nitrate reductase, which is involved in the adaptation of M. tuberculosis to hypoxic conditions and its persistence in guinea pig lungs. These results highlight the presence of additional virulence-related genes in the Rangipo outbreak strain that are not present in the reference genome, H37Rv.
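The step of setting aside reads that did not map to H37Rv, before de novo assembly, can be illustrated with a minimal sketch. It assumes the alignments are available as SAM-format text, where bit 0x4 of the FLAG field marks an unmapped read. The function name and the toy records are hypothetical and are not taken from the thesis, which does not state the tools used for this step.

```python
def unmapped_read_names(sam_lines):
    """Return the names of reads whose SAM FLAG has the 0x4 (unmapped) bit set."""
    UNMAPPED = 0x4  # SAM specification: "segment unmapped"
    names = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & UNMAPPED:
            names.append(fields[0])
    return names


# Toy SAM records (hypothetical, for illustration only).
example_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "@SQ\tSN:H37Rv\tLN:4411532",
    "read1\t0\tH37Rv\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCGA\tIIIIIIII",
    "read3\t16\tH37Rv\t500\t60\t8M\t*\t0\t0\tGGCATCGA\tIIIIIIII",
    "read4\t77\t*\t0\t0\t*\t*\t0\t0\tCCGATTAG\tIIIIIIII",
]

print(unmapped_read_names(example_sam))
```

In practice the selected reads would be written to FASTQ and passed to an assembler; the sketch only shows how unmapped reads are identified.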
Having successfully applied genomic analysis to a New Zealand outbreak strain, attention was then turned to TB in a local setting, Tasmania, where published information on the molecular epidemiology of the disease was limited. I performed whole-genome sequencing of M. tuberculosis isolated in Tasmania and analysed the genomic data together with public health surveillance records. A high proportion (>80%) of TB cases in Tasmania from 2014 to 2016 occurred in overseas-born individuals. The whole-genome sequencing data showed that the East-African Indian lineage 3 of M. tuberculosis predominated among TB cases in Tasmania, followed by the Euro-American lineage 4, the Indo-Oceanic lineage 1 and the East-Asian lineage 2.
Among the lineage 3 isolates, a possible cluster of TB was identified because four of the isolates differed from one another by ≤5 single nucleotide polymorphisms (SNPs). Further investigation of the epidemiological data showed that this possible cluster consisted of pulmonary TB cases reported in 2015 in patients originating from Nepal. In silico spoligotypes were generated for the clustered isolates and exactly matched the spoligotype of CAS1_Delhi, the dominant lineage 3 genotype in Nepal. This indicates that the strain responsible for the lineage 3 TB cluster in Tasmania probably originated in the Nepal region.
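The ≤5 SNP criterion can be sketched as follows. This is a minimal illustration and not the thesis's pipeline. It assumes equal-length, reference-aligned consensus sequences, counts only positions where both isolates have an unambiguous base, and groups isolates by single linkage. The grouping rule, the function names and the toy sequences are all assumptions introduced here.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ, ignoring N and gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a != b and a in valid and b in valid
    )


def cluster_by_threshold(sequences, threshold=5):
    """Single-linkage grouping: link isolates whose SNP distance is <= threshold."""
    parent = {name: name for name in sequences}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(sequences, 2):
        if snp_distance(sequences[a], sequences[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sequences:
        groups.setdefault(find(name), []).append(name)
    # Report only groups with at least two isolates.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy 20 bp "genomes" (hypothetical; real comparisons span the whole genome).
toy_isolates = {
    "iso_A": "ACGTACGTACGTACGTACGT",
    "iso_B": "TCGTACGTACGTACGTACGT",
    "iso_C": "ATTTACGTACGTACGTACGT",
    "iso_D": "TGCATGCATGCAACGTACGT",
}

print(cluster_by_threshold(toy_isolates, threshold=5))
```

With single linkage, two isolates can fall in the same group through an intermediate isolate even if their own distance exceeds the threshold; a stricter rule would require every pair in a group to meet it.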
One of the lineage 2 isolates was collected from a case of extrapulmonary TB in a 37-year-old male originally from Vietnam. The patient had tested positive in an interferon-gamma release assay earlier in 2016 but did not exhibit the clinical signs and symptoms of pulmonary TB. Following an episode of colitis later in the year, culture of a colon tissue biopsy specimen detected M. tuberculosis. The isolate was found to be resistant to isoniazid, rifampicin, pyrazinamide and ethambutol and therefore represented the first confirmed case of multidrug-resistant (MDR) TB in Tasmania (TASMDR1). Epidemiological data revealed that a household contact of the Tasmanian MDR-TB patient had been diagnosed with pulmonary MDR-TB in Vietnam in 2012. Both the 2016 Tasmanian and 2012 Vietnamese isolates were acquired and, upon genome sequencing, were found to possess identical high-confidence mutations conferring resistance to isoniazid, rifampicin, pyrazinamide, ethambutol and also streptomycin. In addition, the two isolates differed by fewer than 5 SNPs, which strongly indicates that the two patients were part of the same transmission network. It is highly likely that the Tasmanian case contracted the MDR-TB strain from his household contact in Vietnam and that the infection remained latent before reactivating in extrapulmonary form in Tasmania in 2016.
In conclusion, this study highlights that differences in the genome content of TB outbreak strains may go undetected when M. tuberculosis sequence data are mapped to a single reference strain. Furthermore, I conclude that the epidemiology of TB in the low-prevalence setting of Tasmania has features that resemble TB in other jurisdictions, for example, the presence of clustered cases and drug resistance.

Item Type: Thesis - PhD
Authors/Creators: Gautam, S
Keywords: tuberculosis, whole-genome sequencing, epidemiology, genotyping, transmission
DOI / ID Number: 10.25959/100.00031720
Copyright Information: Copyright 2018 the author

Additional Information: Chapter 2 appears to be the equivalent of a post-print version of an article published as: Gautam, S. S., Rajendra, K. C., Leong, K. W. C., Mac Aogáin, M., O’Toole, R. F., 2019. A step-by-step beginner’s protocol for whole genome sequencing of human bacterial pathogens, Journal of biological methods, 6(1) e110. The article is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) License.
