Data-driven model discovery and model selection for noisy biological systems.

Bibliographic Details
Main Authors: Xiaojun Wu, MeiLu McDermott, Adam L MacLean
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012762
collection DOAJ
description Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model construction: differential equation models can be learnt from data via model discovery using sparse identification of nonlinear dynamics (SINDy). However, SINDy struggles with realistic levels of biological noise and is limited in its ability to incorporate prior knowledge of the system. We propose a data-driven framework for model discovery and model selection using hybrid dynamical systems: partial models containing missing terms. Neural networks are used to approximate the unknown dynamics of a system, enabling the denoising of the data while simultaneously learning the latent dynamics. Simulations from the fitted neural network are then used to infer models using sparse regression. We show, via model selection, that model discovery using hybrid dynamical systems outperforms alternative approaches. We find it possible to infer models correctly up to high levels of biological noise of different types. We demonstrate the potential to learn models from sparse, noisy data in application to a canonical cell state transition using data derived from single-cell transcriptomics. Overall, this approach provides a practical framework for model discovery in biology in cases where data are noisy and sparse, of particular utility when the underlying biological mechanisms are partially but incompletely known.
id doaj-art-cd2754ab94f142429a92dec6e2fd42e9
institution Kabale University
issn 1553-734X, 1553-7358
citation Xiaojun Wu, MeiLu McDermott, Adam L MacLean. Data-driven model discovery and model selection for noisy biological systems. PLoS Computational Biology 21(1): e1012762 (2025-01-01). https://doi.org/10.1371/journal.pcbi.1012762