Path-minimizing latent ODEs for improved extrapolation and inference

Latent ordinary differential equation (ODE) models provide flexible descriptions of dynamic systems, but they can struggle with extrapolation and predicting complicated non-linear dynamics. The latent ODE approach implicitly relies on encoders to identify unknown system parameters and initial conditions (ICs), whereas the evaluation times are known and directly provided to the ODE solver. This dichotomy can be exploited by encouraging time-independent latent representations. By replacing the common variational penalty in latent space with an $\ell_2$ penalty on the path length of each system, the models learn data representations that can easily be distinguished from those of systems with different configurations. This results in faster training, smaller models, more accurate interpolation, and better long-time extrapolation than baseline latent ODE models with gated recurrent unit (GRU), recurrent neural network (RNN), and long short-term memory (LSTM) encoder/decoders on tests with damped harmonic oscillator, self-gravitating fluid, and predator-prey systems. We also demonstrate superior results for simulation-based inference of the Lotka–Volterra parameters and ICs by using the latents as data summaries for a conditional normalizing flow. Our change to the training loss is agnostic to the specific recognition network used by the decoder and can therefore easily be adopted by other latent ODE models.
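As a concrete illustration of the loss change described in the abstract, the following is a minimal sketch (not the authors' implementation) of a discrete $\ell_2$ path-length penalty computed from a latent trajectory evaluated by the ODE solver. The NumPy formulation, the mean-squared reconstruction term, the function names, and the weight lam are illustrative assumptions.

import numpy as np

def path_length_penalty(z: np.ndarray) -> float:
    """Discrete l2 path length of a latent trajectory.

    z : array of shape (T, D) holding the latent states z(t_1), ..., z(t_T)
        returned by the ODE solver at the evaluation times.
    """
    # Sum of Euclidean distances between consecutive latent states;
    # a time-independent (constant) latent path incurs zero penalty.
    return float(np.linalg.norm(np.diff(z, axis=0), axis=1).sum())

def total_loss(x, x_hat, z, lam=1.0) -> float:
    """Reconstruction error plus the path-length regularizer.

    Stands in for the usual variational (KL) penalty of a latent ODE,
    as described in the abstract; exact form is an assumption.
    """
    recon = float(np.mean((x - x_hat) ** 2))  # mean-squared reconstruction error
    return recon + lam * path_length_penalty(z)

# Illustrative usage with random stand-ins for data and model outputs.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))                  # observed trajectory (50 time points, 2 observables)
x_hat = x + 0.1 * rng.normal(size=x.shape)    # decoder reconstruction (stand-in)
z = rng.normal(size=(50, 8))                  # latent trajectory from the ODE solver (stand-in)
print(total_loss(x, x_hat, z, lam=0.1))

In this sketch, a latent path that stays nearly constant over time (i.e. encodes only the system parameters and ICs rather than the time evolution) minimizes the penalty, which is the behavior the abstract describes as encouraging time-independent latent representations.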

Bibliographic Details
Main Authors: Matt L Sampson, Peter Melchior
Format: Article
Language: English
Published: IOP Publishing, 2025-01-01
Series: Machine Learning: Science and Technology
Subjects: latent; ODEs; regularization
Online Access: https://doi.org/10.1088/2632-2153/addc34
ISSN: 2632-2153
Volume/Issue: 6 (2), article 025047
DOI: 10.1088/2632-2153/addc34
Author ORCIDs: Matt L Sampson (0000-0001-5748-5393); Peter Melchior (0000-0002-8873-5065)
Affiliations: Matt L Sampson: Department of Astrophysical Sciences, Princeton University, Princeton, NJ 08544, United States of America; Peter Melchior: Department of Astrophysical Sciences and Center for Statistics and Machine Learning, Princeton University, Princeton, NJ 08544, United States of America