A criterion for selecting the appropriate one from the trained models for model‐based offline policy evaluation

Abstract: Offline policy evaluation, that is, evaluating and selecting complex decision‐making policies using only offline datasets, is important in reinforcement learning. At present, model‐based offline policy evaluation (MBOPE) is widely welcomed because of its ease of implementation and good performa...

Bibliographic Details
Main Authors: Chongchong Li, Yue Wang, Zhi‐Ming Ma, Yuting Liu
Format: Article
Language: English
Published: Wiley 2025-02-01
Series: CAAI Transactions on Intelligence Technology
Online Access: https://doi.org/10.1049/cit2.12376