Optimization of fluid control laws through deep reinforcement learning using dynamic mode decomposition as the environment

The optimization of fluid control laws through deep reinforcement learning (DRL) presents a challenge owing to the considerable computational costs associated with trial-and-error processes. In this study, we examine the feasibility of deriving an effective control law using a reduced-order model constructed by dynamic mode decomposition with control (DMDc). DMDc is a method of modal analysis of a flow field that incorporates external inputs, and we utilize it to represent the time development of flow in the DRL environment. We also examine the amount of computation time saved by this method. We adopt the optimization problem of the control law for managing lift fluctuations caused by the Kármán vortex shedding in the flow around a cylinder. The deep deterministic policy gradient (DDPG) is used as the DRL algorithm. The external input for the DMDc model consists of a superposition of a chirp signal, containing various amplitudes and frequencies, and random noise. This combination is used to express random actions during the exploration phase. With DRL in a DMDc environment, a control law that exceeds the performance of conventional mathematical control is derived, although the learning is unstable (not converged). This lack of convergence is also observed with DRL in a computational fluid dynamics (CFD) environment. However, when the number of learning epochs is the same, a superior control law is obtained with DRL in a DMDc environment. This outcome could be attributed to the DMDc representation of the flow field, which tends to smooth out high-frequency fluctuations even when subjected to signals of larger amplitude. In addition, using DMDc results in a computation time savings of up to a factor of 3 compared to using CFD.

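The abstract describes replacing the CFD solver with a DMDc reduced-order model as the DRL environment. As a rough orientation only, the following is a minimal sketch (not the authors' code) of how a DMDc operator pair can be identified from snapshot and input data and then stepped forward in time; in practice the snapshots would typically first be projected onto a truncated SVD/POD basis, which is omitted here, and all variable names are illustrative.

```python
import numpy as np

def fit_dmdc(X, Xp, U):
    """Identify a linear model x_{k+1} ~= A x_k + B u_k from snapshot data.

    X  : (n_states, m) snapshots x_1 .. x_m
    Xp : (n_states, m) time-shifted snapshots x_2 .. x_{m+1}
    U  : (n_inputs, m) control inputs applied between the snapshot pairs
    """
    Omega = np.vstack([X, U])            # stacked state/input data matrix
    G = Xp @ np.linalg.pinv(Omega)       # least-squares fit of [A B]
    n = X.shape[0]
    A, B = G[:, :n], G[:, n:]
    return A, B

def step(A, B, x, u):
    """Advance the reduced-order 'environment' by one time step."""
    return A @ x + B @ u
```

Once A and B are identified, the DRL agent can interact with `step` instead of a full CFD time march, which is where the reported computation-time savings would come from.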

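The abstract also mentions driving the system with a superposition of a chirp signal and random noise so that the identification data resemble the random actions taken during the DRL exploration phase. Below is a minimal sketch of how such an excitation signal could be assembled with scipy.signal.chirp; the amplitudes, frequency band, and noise level are placeholders, not values from the paper.

```python
import numpy as np
from scipy.signal import chirp

# Placeholder time resolution and signal length.
dt, n_steps = 1e-2, 4000
t = np.arange(n_steps) * dt

# Chirp sweeping through a band of frequencies, with a slowly varying
# amplitude envelope so that several amplitudes are covered.
envelope = 0.5 + 0.5 * np.sin(2 * np.pi * 0.05 * t)
sweep = envelope * chirp(t, f0=0.1, f1=2.0, t1=t[-1], method="linear")

# Random noise superposed to mimic the exploratory (random) actions.
rng = np.random.default_rng(0)
u = sweep + 0.1 * rng.standard_normal(n_steps)
```
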
Bibliographic Details
Main Authors: T. Sakamoto, K. Okabayashi
Format: Article
Language: English
Published: AIP Publishing LLC, 2024-11-01
Series: AIP Advances
Online Access: http://dx.doi.org/10.1063/5.0237682
collection DOAJ
id doaj-art-330fb9e9baf046e3b7cf2a020fbdba37
institution OA Journals
issn 2158-3226
Author affiliations: T. Sakamoto and K. Okabayashi, Department of Mechanical Engineering, Osaka University, 2-1 Yamadaoka, Suita, Osaka, Japan
Citation: AIP Advances 14(11), 115204 (2024). doi:10.1063/5.0237682