Enhanced Reward Function Design for Source Term Estimation Based on Deep Reinforcement Learning


Bibliographic Details
Main Authors: Junhee Lee, Hongro Jang, Minkyu Park, Hyondong Oh
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11004010/
Description
Summary: This study investigates the design of reward functions for deep reinforcement learning-based source term estimation (STE). Estimating the properties of an unknown hazardous gas leak using a mobile sensor, known as the STE problem, is challenging due to environmental turbulence and sensor noise. To address this issue, a particle filter is employed to estimate the source term under noisy sensor measurements, and a deep Q-network is used to find the optimal source search policy. In deep reinforcement learning, selecting an appropriate reward function is crucial, as it directly impacts learning performance. Specifically, this paper first reviews existing reward functions based on penalty, distance, concentration, and entropy metrics. To overcome the limitations of existing rewards, we combine their strengths and propose new reward functions, such as the Gaussian mixture model (GMM) variance-based reward and the GMM information gain-based reward. To validate the robustness of the proposed approach, simulations are conducted in two types of environments, basic and turbulent, by adjusting the noise parameters. The simulation results demonstrate that the proposed reward functions outperform existing ones and are particularly robust in noisy environments.
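The abstract names a GMM variance-based reward but does not specify it. A minimal sketch of the general idea, assuming the reward is the reduction in GMM spread over the particle filter's source estimates between steps (the function names, the use of scikit-learn's `GaussianMixture`, and the weighted-trace spread measure are all illustrative assumptions, not the authors' implementation):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_variance(particles, n_components=2, seed=0):
    """Fit a GMM to particle positions and return the weighted sum of
    component covariance traces, used here as a spread (uncertainty) measure."""
    gmm = GaussianMixture(n_components=n_components, random_state=seed)
    gmm.fit(particles)
    return float(sum(w * np.trace(c)
                     for w, c in zip(gmm.weights_, gmm.covariances_)))

def gmm_variance_reward(particles_prev, particles_curr):
    """Hypothetical variance-based reward: positive when the GMM spread
    shrinks, i.e. the particle filter's source estimate concentrates."""
    return gmm_variance(particles_prev) - gmm_variance(particles_curr)

# Toy example: the particle cloud contracts toward the source between steps.
rng = np.random.default_rng(0)
prev = rng.normal(loc=[5.0, 5.0], scale=2.0, size=(200, 2))
curr = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(200, 2))
r = gmm_variance_reward(prev, curr)
```

A reward of this shape would encourage the agent to take sensing actions that make the posterior over the source location more concentrated, which matches the abstract's stated goal of robustness under noisy measurements.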
ISSN:2169-3536