The Finite-Time Turnpike Property in Machine Learning

Bibliographic Details
Main Author: Martin Gugat
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Machines
Subjects:
Online Access: https://www.mdpi.com/2075-1702/12/10/705
Description
Summary: The finite-time turnpike property describes the situation in an optimal control problem where an optimal trajectory reaches the desired state before the end of the time interval and remains there. We consider a machine learning problem with a neural ordinary differential equation that can be seen as a homogenization of a deep ResNet. We show that, with an appropriate scaling of the quadratic control cost and the non-smooth tracking term, the optimal control problem has the finite-time turnpike property; that is, the desired state is reached within the time interval and the optimal state remains there until the terminal time T. The time t₀ at which the optimal trajectories reach the desired state can serve as an additional design parameter. Since ResNets can be viewed as discretizations of neural ODEs, the choice of t₀ corresponds to the choice of the number of layers, that is, the depth of the neural network. The choice of t₀ thus allows a compromise between the depth of the network and the size of the optimal system parameters, which we hope will be useful for determining optimal depths of neural network architectures in the future.
ISSN: 2075-1702
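
To make the ResNet/neural ODE correspondence in the summary concrete, here is a minimal sketch (not from the paper; the dynamics, names, and parameter values are assumptions for illustration): a ResNet forward pass written as the explicit Euler discretization of a neural ODE, where a turnpike time t₀ < T caps how many residual blocks, i.e. layers, are actually exercised.

```python
import numpy as np

# Hypothetical sketch: a ResNet forward pass as the explicit Euler
# discretization of the neural ODE  x'(t) = tanh(W(t) x(t) + b(t)).
# With step size dt = T / L, each residual block is one Euler step,
# so a turnpike time t0 < T bounds the number of layers needed.

def resnet_forward(x, weights, biases, dt):
    """Apply residual blocks x <- x + dt * tanh(W x + b), one per Euler step."""
    for W, b in zip(weights, biases):
        x = x + dt * np.tanh(W @ x + b)
    return x

T = 1.0            # time horizon of the neural ODE (assumed value)
t0 = 0.6           # assumed turnpike time: the target is reached by t0
dt = 0.05          # Euler step size
L = int(T / dt)    # layers implied by discretizing the full horizon [0, T]
L0 = int(t0 / dt)  # layers actually needed to reach the desired state by t0

rng = np.random.default_rng(0)
d = 4  # state dimension (assumed)
weights = [0.1 * rng.standard_normal((d, d)) for _ in range(L)]
biases = [np.zeros(d) for _ in range(L)]

x0 = rng.standard_normal(d)
# Running only the first L0 blocks mirrors truncating the ODE at t0.
print(resnet_forward(x0, weights[:L0], biases[:L0], dt))
```

Under this assumed Euler scaling, an earlier t₀ means fewer effective layers, illustrating the trade-off between network depth and the size of the optimal system parameters that the abstract describes.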