Path-minimizing latent ODEs for improved extrapolation and inference
Latent ordinary differential equation (ODE) models provide flexible descriptions of dynamic systems, but they can struggle with extrapolation and predicting complicated non-linear dynamics. The latent ODE approach implicitly relies on encoders to identify unknown system parameters and initial conditions (ICs), whereas the evaluation times are known and directly provided to the ODE solver. This dichotomy can be exploited by encouraging time-independent latent representations. By replacing the common variational penalty in latent space with an $\ell_2$ penalty on the path length of each system, the models learn data representations that can easily be distinguished from those of systems with different configurations. This results in faster training, smaller models, and more accurate interpolation and long-time extrapolation compared to baseline latent ODE models with gated recurrent unit (GRU), recurrent neural network (RNN), and long short-term memory (LSTM) encoder/decoders on tests with damped harmonic oscillator, self-gravitating fluid, and predator-prey systems. We also demonstrate superior results for simulation-based inference of the Lotka–Volterra parameters and ICs by using the latents as data summaries for a conditional normalizing flow. Our change to the training loss is agnostic to the specific recognition network used by the decoder and can therefore easily be adopted by other latent ODE models.
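The record does not reproduce the paper's formulation, but the abstract describes replacing the variational (KL) penalty with an $\ell_2$ penalty on the path length of each system's latent trajectory, which encourages latents that stay nearly constant over time and therefore encode the static system parameters and ICs. A minimal sketch of such a regularizer, assuming the penalty is the summed Euclidean length of consecutive latent steps (the function name, array shapes, and loss weight below are illustrative, not taken from the paper):

```python
import numpy as np

def path_length_penalty(z):
    """Total l2 path length of one latent trajectory.

    z : array of shape (T, D) -- latent states returned by the ODE
        solver at the T known evaluation times. Assumed formulation:
        the penalty sums the Euclidean lengths of the steps between
        consecutive latent states, replacing the usual KL term.
    """
    steps = np.diff(z, axis=0)                  # (T-1, D) differences between consecutive states
    return np.linalg.norm(steps, axis=1).sum()  # sum of per-step l2 norms

# Illustrative use in a training loss; lambda_path is a hypothetical weight:
# loss = reconstruction_loss + lambda_path * path_length_penalty(z)
```

A trajectory whose latent states barely move incurs near-zero penalty, so minimizing it pushes time-dependent behavior into the ODE solver's known evaluation times and leaves the latents free to distinguish system configurations.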
| Main Authors: | Matt L Sampson, Peter Melchior |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IOP Publishing, 2025-01-01 |
| Series: | Machine Learning: Science and Technology |
| Subjects: | latent ODEs; regularization |
| Online Access: | https://doi.org/10.1088/2632-2153/addc34 |
| _version_ | 1849688951217979392 |
|---|---|
| author | Matt L Sampson; Peter Melchior |
| author_facet | Matt L Sampson; Peter Melchior |
| author_sort | Matt L Sampson |
| collection | DOAJ |
| description | Latent ordinary differential equation (ODE) models provide flexible descriptions of dynamic systems, but they can struggle with extrapolation and predicting complicated non-linear dynamics. The latent ODE approach implicitly relies on encoders to identify unknown system parameters and initial conditions (ICs), whereas the evaluation times are known and directly provided to the ODE solver. This dichotomy can be exploited by encouraging time-independent latent representations. By replacing the common variational penalty in latent space with an $\ell_2$ penalty on the path length of each system, the models learn data representations that can easily be distinguished from those of systems with different configurations. This results in faster training, smaller models, and more accurate interpolation and long-time extrapolation compared to baseline latent ODE models with gated recurrent unit (GRU), recurrent neural network (RNN), and long short-term memory (LSTM) encoder/decoders on tests with damped harmonic oscillator, self-gravitating fluid, and predator-prey systems. We also demonstrate superior results for simulation-based inference of the Lotka–Volterra parameters and ICs by using the latents as data summaries for a conditional normalizing flow. Our change to the training loss is agnostic to the specific recognition network used by the decoder and can therefore easily be adopted by other latent ODE models. |
| format | Article |
| id | doaj-art-dee30643101c468bacb5636b77441ead |
| institution | DOAJ |
| issn | 2632-2153 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IOP Publishing |
| record_format | Article |
| series | Machine Learning: Science and Technology |
| spelling | doaj-art-dee30643101c468bacb5636b77441ead 2025-08-20T03:21:47Z eng. IOP Publishing, Machine Learning: Science and Technology, ISSN 2632-2153, 2025-01-01, vol. 6, no. 2, art. 025047, doi:10.1088/2632-2153/addc34. Path-minimizing latent ODEs for improved extrapolation and inference. Matt L Sampson (https://orcid.org/0000-0001-5748-5393): Department of Astrophysical Sciences, Princeton University, Princeton, NJ 08544, United States of America. Peter Melchior (https://orcid.org/0000-0002-8873-5065): Department of Astrophysical Sciences, Princeton University, Princeton, NJ 08544, United States of America; Center for Statistics and Machine Learning, Princeton University, Princeton, NJ 08544, United States of America. Abstract as in description above. https://doi.org/10.1088/2632-2153/addc34. Subjects: latent ODEs; regularization. |
| spellingShingle | Matt L Sampson; Peter Melchior; Path-minimizing latent ODEs for improved extrapolation and inference; Machine Learning: Science and Technology; latent ODEs; regularization |
| title | Path-minimizing latent ODEs for improved extrapolation and inference |
| title_full | Path-minimizing latent ODEs for improved extrapolation and inference |
| title_fullStr | Path-minimizing latent ODEs for improved extrapolation and inference |
| title_full_unstemmed | Path-minimizing latent ODEs for improved extrapolation and inference |
| title_short | Path-minimizing latent ODEs for improved extrapolation and inference |
| title_sort | path minimizing latent odes for improved extrapolation and inference |
| topic | latent ODEs; regularization |
| url | https://doi.org/10.1088/2632-2153/addc34 |
| work_keys_str_mv | AT mattlsampson pathminimizinglatentodesforimprovedextrapolationandinference AT petermelchior pathminimizinglatentodesforimprovedextrapolationandinference |