Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback
We present a novel technique for learning behaviors from a human-provided feedback signal that is distorted by systematic bias. Our technique, which we refer to as BASIL, models the feedback signal as being separable into a heuristic evaluation of the utility of an action and a bias value that is dr...
| Main Author: | Jonathan I Watson |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | LibraryPress@UF, 2021-04-01 |
| Series: | Proceedings of the International Florida Artificial Intelligence Research Society Conference |
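| Subjects: | interactive machine learning; iml; interactive reinforcement learning; irl; bias; human factors; bayesian; tamer; tetris |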
| Online Access: | https://journals.flvc.org/FLAIRS/article/view/128471 |
| _version_ | 1850271137021296640 |
|---|---|
| author | Jonathan I Watson |
| author_facet | Jonathan I Watson |
| author_sort | Jonathan I Watson |
| collection | DOAJ |
| description | We present a novel technique for learning behaviors from a human-provided feedback signal that is distorted by systematic bias. Our technique, which we refer to as BASIL, models the feedback signal as being separable into a heuristic evaluation of the utility of an action and a bias value that is drawn probabilistically from a parametric distribution defined by unknown parameters. We present the general form of the technique as well as a specific algorithm for integrating the technique with the TAMER algorithm for bias values drawn from a normal distribution. We test our algorithm against standard TAMER in the domain of Tetris using a synthetic oracle that provides feedback under varying levels of distortion. We find our algorithm can learn very quickly under bias distortions that entirely stymie the learning of classic TAMER. |
| format | Article |
| id | doaj-art-777708c61ce94946a7028a593ecae4e6 |
| institution | OA Journals |
| issn | 2334-0754 2334-0762 |
| language | English |
| publishDate | 2021-04-01 |
| publisher | LibraryPress@UF |
| record_format | Article |
| series | Proceedings of the International Florida Artificial Intelligence Research Society Conference |
| spelling | doaj-art-777708c61ce94946a7028a593ecae4e6 2025-08-20T01:52:19Z eng LibraryPress@UF Proceedings of the International Florida Artificial Intelligence Research Society Conference 2334-0754 2334-0762 2021-04-01 34 10.32473/flairs.v34i1.128471 62865 Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback Jonathan I Watson 0 University of Kentucky We present a novel technique for learning behaviors from a human-provided feedback signal that is distorted by systematic bias. Our technique, which we refer to as BASIL, models the feedback signal as being separable into a heuristic evaluation of the utility of an action and a bias value that is drawn probabilistically from a parametric distribution defined by unknown parameters. We present the general form of the technique as well as a specific algorithm for integrating the technique with the TAMER algorithm for bias values drawn from a normal distribution. We test our algorithm against standard TAMER in the domain of Tetris using a synthetic oracle that provides feedback under varying levels of distortion. We find our algorithm can learn very quickly under bias distortions that entirely stymie the learning of classic TAMER. https://journals.flvc.org/FLAIRS/article/view/128471 interactive machine learning iml interactive reinforcement learning irl bias human factors bayesian tamer tetris |
| spellingShingle | Jonathan I Watson Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback Proceedings of the International Florida Artificial Intelligence Research Society Conference interactive machine learning iml interactive reinforcement learning irl bias human factors bayesian tamer tetris |
| title | Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback |
| title_full | Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback |
| title_fullStr | Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback |
| title_full_unstemmed | Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback |
| title_short | Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback |
| title_sort | bias adaptive statistical inference learning agents for learning from human feedback |
| topic | interactive machine learning iml interactive reinforcement learning irl bias human factors bayesian tamer tetris |
| url | https://journals.flvc.org/FLAIRS/article/view/128471 |
| work_keys_str_mv | AT jonathaniwatson biasadaptivestatisticalinferencelearningagentsforlearningfromhumanfeedback |