Safe Switching Model-Free Value Iteration for General Nonlinear Systems

This paper presents a solution to the well-known challenge of ensuring safety guarantees during the evaluation of controllers tuned using Value Iteration (VI) techniques on real systems. We propose an approach called Safe Switching Model-Free Value Iteration (SSMFVI), which guarantees both stability...

Bibliographic Details
Main Author: Timotei Lala
Format: Article
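Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Adaptive dynamic programming; model-free control; optimal control; Q-learning; safe model-free; switching control
Online Access: https://ieeexplore.ieee.org/document/10988815/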
author Timotei Lala
collection DOAJ
description This paper presents a solution to the well-known challenge of guaranteeing safety when controllers tuned with Value Iteration (VI) techniques are evaluated on real systems. We propose an approach called Safe Switching Model-Free Value Iteration (SSMFVI), which guarantees both stability and safety of a system in closed loop with a controller optimized by Value Iteration in a model-free manner. A state-dependent switching rule alternates between the VI-tuned controller and an initial, known, stabilizing admissible controller. Using the Q-function of the initial controller and the Q-function continuously updated during learning, the stability of the switching mechanism in closed loop with the system is derived within the Multiple Lyapunov Functions (MLF) framework. To guarantee safety during runtime operation, the system's maximum one-step transition is estimated. The switching signal then selects the MFVI controller only in the region of the state space that is both covered by the transitions collected during the exploration phase and farther from the unsafe set than the computed maximum one-step transition. This region of the state space is determined using one-class Support Vector Machine (SVM) classification. The method includes mechanisms for early instability detection and for chattering reduction near switching surfaces. Validation is conducted on a first-order linear system, chosen so that the results can be visualized, and on a real Electric Braking System circuit, demonstrating the effectiveness of the proposed control method.
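The switching condition in the description can be illustrated with a minimal sketch. It is not the paper's implementation: the first-order system `x_next = a*x + b*u`, the parameter values, the interval-shaped safe set, and every function name below are assumptions made for illustration. In particular, the coverage test uses a plain neighbourhood check as a stand-in for the one-class SVM named in the description, so that the example needs only the Python standard library.

```python
def collect_transitions(a=0.9, b=0.1):
    """Hypothetical exploration phase on x_next = a*x + b*u over a fixed grid."""
    transitions = []
    for i in range(21):
        x = -1.0 + i / 10.0          # states from -1.0 to 1.0 in steps of 0.1
        for u in (-1.0, 0.0, 1.0):   # three probing inputs per state
            transitions.append((x, u, a * x + b * u))
    return transitions


def max_one_step_transition(transitions):
    """Largest observed |x_next - x| over the collected (x, u, x_next) tuples."""
    return max(abs(x_next - x) for x, _u, x_next in transitions)


def is_covered(x, visited_states, radius):
    """Stand-in for the one-class SVM: covered if a visited state lies within radius."""
    return any(abs(x - v) <= radius for v in visited_states)


def distance_to_unsafe(x, x_min, x_max):
    """Distance to the unsafe set, assumed here to be everything outside [x_min, x_max]."""
    return min(x - x_min, x_max - x)


def select_controller(x, visited_states, delta_max, x_min, x_max, radius):
    """Return "learned" only when x is covered and farther than delta_max from the unsafe set."""
    has_margin = distance_to_unsafe(x, x_min, x_max) > delta_max
    if has_margin and is_covered(x, visited_states, radius):
        return "learned"
    return "initial"  # fall back to the known stabilizing controller


transitions = collect_transitions()
delta_max = max_one_step_transition(transitions)
visited = [x for x, _u, _x_next in transitions]
print(select_controller(0.5, visited, delta_max, x_min=-2.0, x_max=2.0, radius=0.05))
```

The margin test reflects the reasoning in the description: if no observed transition moves the state by more than `delta_max` in one step, a state farther than `delta_max` from the unsafe set should not reach it in the next step, at least within the explored region. The stability argument, early instability detection, and chattering reduction are not covered by this sketch.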
format Article
id doaj-art-45d177b1d3ab4ab09049e63a3b41d39c
institution DOAJ
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-45d177b1d3ab4ab09049e63a3b41d39c (2025-08-20T03:02:29Z, eng). Timotei Lala (https://orcid.org/0000-0002-6682-5609), Department of Automation and Applied Informatics, Faculty of Automation and Computers, Politehnica University of Timisoara, Timisoara, Romania. Safe Switching Model-Free Value Iteration for General Nonlinear Systems. IEEE Access (IEEE, ISSN 2169-3536), vol. 13, pp. 132170-132193, 2025-01-01. DOI: 10.1109/ACCESS.2025.3567496. Document 10988815. https://ieeexplore.ieee.org/document/10988815/. Keywords: Adaptive dynamic programming; model-free control; optimal control; Q-learning; safe model-free; switching control.
title Safe Switching Model-Free Value Iteration for General Nonlinear Systems
topic Adaptive dynamic programming
model-free control
optimal control
Q-learning
safe model-free
switching control
url https://ieeexplore.ieee.org/document/10988815/