A criterion for selecting the appropriate one from the trained models for model‐based offline policy evaluation
Abstract: Offline policy evaluation, which evaluates and selects complex policies for decision‐making using only offline datasets, is important in reinforcement learning. At present, model‐based offline policy evaluation (MBOPE) is widely welcomed because of its ease of implementation and good performa...
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2025-02-01 |
| Series: | CAAI Transactions on Intelligence Technology |
| Subjects: | |
| Online Access: | https://doi.org/10.1049/cit2.12376 |