How to make more from exposure data? an integrated machine learning pipeline to predict pathogen exposure

Fountain-Jones, NM, Machado, G, Carver, SS (ORCID: 0000-0002-3579-7588), Packer, C, Recamonde-Mendoza, M and Craft, ME 2019, 'How to make more from exposure data? an integrated machine learning pipeline to predict pathogen exposure', Journal of Animal Ecology, vol. 88, no. 10, pp. 1447-1461, doi: 10.1111/1365-2656.13076.

Full text not available from this repository.

Abstract

Predicting infectious disease dynamics is a central challenge in disease ecology. Models that can assess which individuals are most at risk of being exposed to a pathogen not only provide valuable insights into disease transmission and dynamics but can also guide management interventions. Constructing such models for wild animal populations, however, is particularly challenging; often only serological data are available on a subset of individuals, and nonlinear relationships between variables are common.

Here we provide a guide to the latest advances in statistical machine learning to construct pathogen-risk models that automatically incorporate complex nonlinear relationships, with minimal statistical assumptions, from ecological data sets with missing values. Our approach compares multiple machine learning algorithms in a unified environment to find the model with the best predictive performance and uses game theory to better interpret results. We apply this framework to two major pathogens that infect African lions: canine distemper virus (CDV) and feline parvovirus.

Our modelling approach provided enhanced predictive performance compared to more traditional approaches, as well as new insights into disease risks in a wild population. We were able to efficiently capture and visualize strong nonlinear patterns, as well as model complex interactions between variables in shaping exposure risk from CDV and feline parvovirus. For example, we found that lions were more likely to be exposed to CDV at a young age but only in low rainfall years.

When combined with our data calibration approach, our framework helped us to answer questions about risk of pathogen exposure that are difficult to address with previous methods. Our framework not only has the potential to aid in predicting disease risk in animal populations, but can also be used to build robust predictive models suitable for other ecological applications such as modelling species distribution or diversity patterns.
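The abstract says only that the framework "uses game theory" to interpret model results. One common game-theoretic attribution is the Shapley value, and the sketch below assumes that reading; it is not the authors' pipeline. The risk function, the feature names (`young`, `low_rainfall`, `large_pride`) and all numbers are hypothetical, chosen to mimic the kind of age-by-rainfall interaction the abstract describes.

```python
from itertools import permutations
from typing import Callable, Dict

Features = Dict[str, int]


def toy_exposure_risk(x: Features) -> float:
    """Hypothetical risk score (not fitted to any data).

    Being young raises risk only in low-rainfall years (an interaction);
    pride size adds a small independent effect.
    """
    risk = 0.1
    if x["young"] == 1 and x["low_rainfall"] == 1:
        risk += 0.6
    if x["large_pride"] == 1:
        risk += 0.1
    return risk


def shapley_values(
    model: Callable[[Features], float],
    instance: Features,
    baseline: Features,
) -> Dict[str, float]:
    """Exact Shapley values by averaging over all feature orderings.

    Features not yet "switched on" keep their baseline value. The cost is
    factorial in the number of features, so this only suits tiny examples.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            updated = model(current)
            totals[name] += updated - previous  # marginal contribution
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


# A young lion in a low-rainfall year, compared with an older lion
# in a normal-rainfall year (both hypothetical).
instance = {"young": 1, "low_rainfall": 1, "large_pride": 0}
baseline = {"young": 0, "low_rainfall": 0, "large_pride": 0}
phi = shapley_values(toy_exposure_risk, instance, baseline)

for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the extra risk comes entirely from the interaction, so the attribution splits it equally between `young` and `low_rainfall`, and the contributions sum to the difference between the instance and baseline predictions. Real models with many features would need an approximation instead of full enumeration.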

Item Type: Article
Authors/Creators: Fountain-Jones, NM and Machado, G and Carver, SS and Packer, C and Recamonde-Mendoza, M and Craft, ME
Keywords: boosted regression trees, disease ecology, gradient boosting models, machine learning, model-agnostic methods, random forests, serology, support vector machines
Journal or Publication Title: Journal of Animal Ecology
Publisher: Blackwell Publishing Ltd
ISSN: 0021-8790
DOI / ID Number: 10.1111/1365-2656.13076
Copyright Information: Copyright 2019 The Authors. Journal of Animal Ecology © 2019 British Ecological Society
