# Uncovering genetic risk factors in multiple sclerosis using a family-based approach

Chen, M ORCID: 0000-0002-0955-7601 2022 , 'Uncovering genetic risk factors in multiple sclerosis using a family-based approach', PhD thesis, University of Tasmania.


## Abstract

Multiple sclerosis (MS) is one of the most common diseases of the brain and spinal cord, characterised by demyelination and neurodegeneration. Family and twin studies established the significant roles of genetic factors in the aetiology of MS. Genome-wide association studies have identified over 200 common variants associated with MS risk. The human leukocyte antigen (HLA) haplotype DRB1*15:01 remains the largest risk variant with an average odds ratio of 3.08 in Europeans. However, these variants collectively explain only 20 – 48% of the overall heritability, leaving a major proportion of heritability unknown. A recent study of a cohort of 32,367 MS cases and 36,012 controls demonstrated that an additional 5% of the heritability was explained by low-frequency variation, highlighting the need to identify rare variant classes influencing the pathogenesis of MS. Given the relatively low prevalence of MS (approximately 1 per 1,000 Australians) compared to other complex diseases, sporadic clustering within families is unlikely, yet familial clustering of the disease does occur, suggesting underlying shared genetic risk. Families allow for enrichment of rare or private variants, by transmission from founders to multiple offspring, powering the investigation of these variant classes. This dissertation explores the analysis of families enriched for MS to expand the understanding of the genetic underpinnings of MS.
The first aim of this study was to assess the burden of the known common risk variants in our three MS enriched families (Family 1, 2 and 3). To do this, we used the common risk variants from the recent International Multiple Sclerosis Genetics Consortium study to construct a weighted polygenic risk score (wPRS). A cohort of 3,252 MS cases and 5,725 healthy controls from Australia and New Zealand was used to compute the baseline wPRS. Twenty-six out of 32 known HLA variants and 155 out of 200 non-HLA variants were typed in the cohort. The cohort wPRS based on these variants exhibited a reasonable capacity to discriminate between sporadic MS cases and healthy controls (area under the receiver operating characteristic curve = 0.76, 95% confidence interval: 0.75 – 0.77). The wPRS was significantly higher in sporadic MS cases (mean ± standard deviation [SD] = 25.71 ± 1.27) than in healthy controls (24.48 ± 1.21; P = $2.2 \times 10^{-16}$). We used the same set of common variants to construct the familial wPRS. There was no significant difference in wPRS between the familial and sporadic MS cases. However, we observed a significant difference between the familial unaffected individuals (25.82 ± 1.14) and the healthy controls (24.48 ± 1.21; P = 0.01). This was mainly due to the enriched copies of the high-risk HLA-DRB1*15:01 alleles in Family 1. These results demonstrate that common variants are not sufficient to explain the aggregation of MS in our families. Therefore, genetic factors other than common variants are likely to influence the disease clustering.
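
The abstract does not state how the wPRS was weighted. As a minimal sketch, assuming the common construction in which each risk-allele dosage (0, 1 or 2) is weighted by the natural log of its odds ratio, the score and the area under the ROC curve could be computed as below. The function names and the variant labels other than HLA-DRB1*15:01 are hypothetical, and only the odds ratio of 3.08 comes from the text above.

```python
import math


def weighted_prs(dosages, odds_ratios):
    """Sum risk-allele dosages weighted by ln(odds ratio).

    Assumed weighting scheme; the thesis may scale or weight differently.
    Both arguments are dicts keyed by variant identifier.
    """
    if dosages.keys() != odds_ratios.keys():
        raise ValueError("dosages and odds_ratios must cover the same variants")
    return sum(dosages[v] * math.log(odds_ratios[v]) for v in odds_ratios)


def auc(case_scores, control_scores):
    """Area under the ROC curve as the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count half)."""
    wins = sum(
        (c > h) + 0.5 * (c == h)
        for c in case_scores
        for h in control_scores
    )
    return wins / (len(case_scores) * len(control_scores))


# Illustrative inputs: "variant_a" and its odds ratio are made up.
odds_ratios = {"HLA-DRB1*15:01": 3.08, "variant_a": 1.10}
person = {"HLA-DRB1*15:01": 2, "variant_a": 1}
score = weighted_prs(person, odds_ratios)
```

Under this weighting, two copies of HLA-DRB1*15:01 contribute far more to the score than any single small-effect variant, which is consistent with the observation that enriched HLA-DRB1*15:01 alleles drove the elevated wPRS in Family 1.
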
The second aim of this study was to identify rare or novel variants that may predispose some family members to MS. To do this, whole-genome sequencing was conducted on all available family members (19 individuals from three MS enriched families). A full penetrance model and a reduced penetrance model were used to identify rare or novel coding and non-coding variants in each family. Multiple in silico algorithms and knowledge-driven prioritisation were used to predict the potential impact of the variants on the target genes and proteins. As a result, we identified eight candidate variants for Family 1, eight for Family 2 and 14 for Family 3. We then applied diverse and complementary approaches to further assess the corresponding candidate genes and/or pathways using validation datasets. Three large cohorts were used to determine the enrichment of variants in the genes (and/or pathways) in MS patients compared with healthy controls. These cohort data include: 3,252 MS patients and 5,725 healthy controls that were genotyped on mixed arrays (cohort 1); 3,318 MS patients and 2,891 healthy controls that underwent whole-exome sequencing (cohort 2); and GWAS summary statistics of 14,802 MS cases and 26,703 healthy controls (cohort 3). A publicly available single-nucleus RNA sequencing dataset of the white matter areas of post-mortem human brains in five healthy controls and four MS patients was evaluated to investigate the cell-specific differential expression of the candidate genes in MS brain cells compared with healthy brain cells.
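
The two penetrance models are not defined in the abstract. One plausible reading, sketched here as an assumption, is that a full penetrance model keeps variants carried by every affected family member and by no unaffected member, while a reduced penetrance model additionally tolerates some unaffected carriers. The function name, the `max_unaffected_carriers` parameter and the individual labels are all hypothetical.

```python
def segregates(carriers, affected, unaffected, max_unaffected_carriers=0):
    """Return True if a variant segregates with disease in one family.

    carriers: individuals carrying the variant
    affected / unaffected: family members by disease status
    max_unaffected_carriers: 0 models full penetrance (assumed definition);
        a larger value models reduced penetrance.
    """
    carriers = set(carriers)
    # Every affected member must carry the variant.
    if not set(affected) <= carriers:
        return False
    # Limit how many unaffected members may also carry it.
    return len(carriers & set(unaffected)) <= max_unaffected_carriers


# Hypothetical family: three affected and two unaffected members.
affected = {"A1", "A2", "A3"}
unaffected = {"U1", "U2"}
carriers = {"A1", "A2", "A3", "U1"}

full = segregates(carriers, affected, unaffected)
reduced = segregates(carriers, affected, unaffected, max_unaffected_carriers=1)
```

In this example the variant fails the full penetrance filter because one unaffected relative carries it, but passes the reduced penetrance filter.
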
The final candidate variants prioritised by the downstream analyses in Family 1 were GRIK4 c.2070G>A, DTNB c.239C>T and SLMAP c.991A>T. Gene-level enrichment was observed for GRIK4 (P = 0.003) and SLMAP (P = 0.007) in cohort 3, but the enrichment did not survive statistical correction for multiple comparisons. Significant differential expression of DTNB (corrected P = 0.02) and GRIK4 (corrected P = 0.02) between MS brain cells and healthy brain cells was observed in the oligodendrocyte lineage 5 and the oligodendrocyte lineage 6, respectively. The final candidate variants prioritised in Family 2 were RGS14 c.1069C>G and PARD3 c.2321G>T. Gene-level enrichment evidence was observed for RGS14 (corrected P = $5.43 \times 10^{-6}$) in cohort 3. Gene-level enrichment was also observed for PARD3 (P = 0.031) in cohort 2, but it did not survive statistical correction for multiple comparisons. The final candidate variants prioritised in Family 3 were ZNF18 c.1072G>A and KANK1 c.3371G>A. Gene-level enrichment evidence was observed for both genes (ZNF18: P = $3.22 \times 10^{-4}$, KANK1: P = 0.004) in cohort 1, and ZNF18 survived statistical correction for multiple comparisons (P = 0.01). Enrichment in KANK1 was also observed in cohort 2 (P = 0.047) and cohort 3 (P = 0.036), but it did not survive statistical correction for multiple comparisons (P > 0.05).
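
The abstract does not name the method used to correct for multiple comparisons. Purely as an illustration, assuming a Bonferroni correction, the sketch below shows how a nominally significant gene-level P value can fail to survive correction once the number of tests is taken into account. The P values and the function name are hypothetical and are not taken from the thesis.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a list of P values: multiply each by the number
    of tests and cap the result at 1.0 (assumed correction method)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical nominal P values for five candidate genes in one cohort.
nominal = [0.004, 0.02, 0.03, 0.2, 0.6]
adjusted = bonferroni(nominal)

# Genes whose adjusted P value stays below the 0.05 threshold.
surviving = [i for i, p in enumerate(adjusted) if p < 0.05]
```

Here only the first gene remains below 0.05 after adjustment, even though three of the five were nominally significant.
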
In summary, our study demonstrated that the currently known common risk variants of MS are not sufficient to explain the disease clustering in our families. By comparing the genome differences of the affected and unaffected individuals, we identified potentially important rare or novel genetic variants that may influence disease risk in these families. We provided additional supportive evidence for the candidate genes (and pathways) by analysing validation datasets. This study provides the basis for future targeted investigations into the potential role of those candidate genes in MS pathogenesis, which will facilitate better understanding of the genetic aetiology of the disease.

Item Type: Thesis - PhD

Authors: Chen, M

Keywords: Multiple sclerosis, family-based study, common variants, polygenic risk score, rare variants, whole-genome-sequencing

DOI: https://doi.org/10.25959/100.00047556

Copyright 2022 the author