Data-driven model discovery and model selection for noisy biological systems.
Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model...
Main Authors: | Xiaojun Wu, MeiLu McDermott, Adam L MacLean |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2025-01-01 |
Series: | PLoS Computational Biology |
Online Access: | https://doi.org/10.1371/journal.pcbi.1012762 |
_version_ | 1832540351714295808 |
---|---|
author | Xiaojun Wu MeiLu McDermott Adam L MacLean |
author_facet | Xiaojun Wu MeiLu McDermott Adam L MacLean |
author_sort | Xiaojun Wu |
collection | DOAJ |
description | Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model construction: differential equation models can be learnt from data via model discovery using sparse identification of nonlinear dynamics (SINDy). However, SINDy struggles with realistic levels of biological noise and is limited in its ability to incorporate prior knowledge of the system. We propose a data-driven framework for model discovery and model selection using hybrid dynamical systems: partial models containing missing terms. Neural networks are used to approximate the unknown dynamics of a system, enabling the denoising of the data while simultaneously learning the latent dynamics. Simulations from the fitted neural network are then used to infer models using sparse regression. We show, via model selection, that model discovery using hybrid dynamical systems outperforms alternative approaches. We find it possible to infer models correctly up to high levels of biological noise of different types. We demonstrate the potential to learn models from sparse, noisy data in application to a canonical cell state transition using data derived from single-cell transcriptomics. Overall, this approach provides a practical framework for model discovery in biology in cases where data are noisy and sparse, of particular utility when the underlying biological mechanisms are partially but incompletely known. |
format | Article |
id | doaj-art-cd2754ab94f142429a92dec6e2fd42e9 |
institution | Kabale University |
issn | 1553-734X 1553-7358 |
language | English |
publishDate | 2025-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Computational Biology |
spelling | doaj-art-cd2754ab94f142429a92dec6e2fd42e9 2025-02-05T05:30:38Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2025-01-01 21 1 e1012762 10.1371/journal.pcbi.1012762 Data-driven model discovery and model selection for noisy biological systems. Xiaojun Wu MeiLu McDermott Adam L MacLean https://doi.org/10.1371/journal.pcbi.1012762 |
title | Data-driven model discovery and model selection for noisy biological systems. |
url | https://doi.org/10.1371/journal.pcbi.1012762 |
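
The description above names sparse regression (as used in SINDy) as the step that turns simulated trajectories into a differential equation model. The following is a minimal sketch of that one step, assuming sequentially thresholded least squares, the optimizer from the original SINDy formulation. It is not the article's implementation: the logistic-growth test system, the polynomial candidate library, the threshold value, and the function name `stlsq` are all hypothetical choices made for illustration, and the neural-network denoising stage is omitted. Derivatives are taken from the known right-hand side so that the result is deterministic; in practice they would have to be estimated from denoised data.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (illustrative sketch).

    Solves theta @ xi ~= dxdt, repeatedly zeroing coefficients whose
    magnitude is below `threshold` and refitting the remaining ones.
    """
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit only the candidate terms that survived thresholding.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi


# Hypothetical test system (not from the article):
# logistic growth, dx/dt = r*x - (r/K)*x**2.
r, K, x0 = 1.0, 2.0, 0.1
t = np.linspace(0.0, 10.0, 200)
x = K / (1.0 + (K / x0 - 1.0) * np.exp(-r * t))  # closed-form solution
dxdt = r * x * (1.0 - x / K)  # exact derivative, assumed known here

# Candidate library of terms: 1, x, x^2, x^3.
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
xi = stlsq(theta, dxdt, threshold=0.1)
print(dict(zip(["1", "x", "x^2", "x^3"], xi)))
```

With noise-free inputs the fit keeps only the `x` and `x^2` terms, with coefficients close to `r` and `-r/K`. With noisy derivative estimates the threshold becomes a sensitive tuning parameter, which is the difficulty the description attributes to SINDy on noisy biological data.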