Online Attentive Kernel-Based Off-Policy Temporal Difference Learning
Temporal difference (TD) learning is a powerful framework for value function approximation in reinforcement learning. However, standard TD methods often struggle with feature representation and off-policy learning challenges. In this paper, we propose a novel framework, online attentive kernel-based off-policy temporal difference learning...
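For context on the baseline the abstract refers to, below is a minimal sketch of standard off-policy TD(0) with fixed kernel features and per-step importance sampling. This is an illustration only: the toy chain environment, the RBF feature construction, and all parameter values are assumptions, and the paper's online attentive kernel mechanism is not reproduced here.

```python
import numpy as np

# Illustrative sketch (not the paper's method): off-policy TD(0) with
# per-step importance sampling over fixed Gaussian (RBF) kernel features.
# Environment, features, and hyperparameters are all assumptions.

rng = np.random.default_rng(0)

n_states, n_actions = 10, 2
centers = np.linspace(0.0, 1.0, 5)   # RBF centers on the unit interval
bandwidth = 0.2

def features(s):
    """Gaussian kernel features for a discrete state mapped into [0, 1]."""
    x = s / (n_states - 1)
    return np.exp(-((x - centers) ** 2) / (2 * bandwidth ** 2))

# State-independent behavior and target policies, for simplicity.
behavior = np.array([0.5, 0.5])
target = np.array([0.9, 0.1])

def step(s, a):
    """Toy chain dynamics: action 1 moves right, action 0 moves left."""
    s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r

w = np.zeros(len(centers))           # linear value-function weights
alpha, gamma = 0.05, 0.95
s = 0
for _ in range(20_000):
    a = rng.choice(n_actions, p=behavior)
    s2, r = step(s, a)
    rho = target[a] / behavior[a]    # importance-sampling ratio
    done = s2 == n_states - 1
    v_next = 0.0 if done else features(s2) @ w
    td_error = r + gamma * v_next - features(s) @ w
    w += alpha * rho * td_error * features(s)
    s = 0 if done else s2            # restart the chain after the goal

print("estimated V:", np.round([features(i) @ w for i in range(n_states)], 2))
```

With fixed kernels, the feature representation never adapts to the data; the attentive, online kernel selection proposed in the paper targets exactly that limitation.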
| Main Authors: | Shangdong Yang, Shuaiqiang Zhang, Xingguo Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Applied Sciences |
| Online Access: | https://www.mdpi.com/2076-3417/14/23/11114 |
Similar Items
- A Robust System for Super-Resolution Imaging in Remote Sensing via Attention-Based Residual Learning, by: Rogelio Reyes-Reyes, et al. Published: (2025-07-01)
- Hyperspectral Image Reconstruction Based on Blur–Kernel–Prior and Spatial–Spectral Attention, by: Hongyu Xie, et al. Published: (2025-04-01)
- Reinforcement Learning with Multi-Policy Movement Strategy for Weakly Supervised Temporal Sentence Grounding, by: Shan Jiang, et al. Published: (2024-10-01)
- CDUNeXt: efficient ossification segmentation with large kernel and dual cross gate attention, by: Hailiang Xia, et al. Published: (2024-12-01)
- Z-Score Experience Replay in Off-Policy Deep Reinforcement Learning, by: Yana Yang, et al. Published: (2024-12-01)