Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback

We present a novel technique for learning behaviors from a human-provided feedback signal that is distorted by systematic bias. Our technique, which we refer to as BASIL, models the feedback signal as being separable into a heuristic evaluation of the utility of an action and a bias value that is dr...

Bibliographic Details
Main Author:	Jonathan I Watson
Format:	Article
Language:	English
Published:	LibraryPress@UF 2021-04-01
Series:	Proceedings of the International Florida Artificial Intelligence Research Society Conference
Subjects:	interactive machine learning; iml; interactive reinforcement learning; irl; bias; human factors; bayesian; tamer; tetris
Online Access:	https://journals.flvc.org/FLAIRS/article/view/128471

_version_	1850271137021296640
author	Jonathan I Watson
author_facet	Jonathan I Watson
author_sort	Jonathan I Watson
collection	DOAJ
description	We present a novel technique for learning behaviors from a human-provided feedback signal that is distorted by systematic bias. Our technique, which we refer to as BASIL, models the feedback signal as being separable into a heuristic evaluation of the utility of an action and a bias value that is drawn probabilistically from a parametric distribution, where the distribution is defined by unknown parameters. We present the general form of the technique as well as a specific algorithm for integrating the technique with the TAMER algorithm for bias values drawn from a normal distribution. We test our algorithm against standard TAMER in the domain of Tetris using a synthetic oracle that provides feedback under varying levels of distortion. We find our algorithm can learn very quickly under bias distortions that entirely stymie the learning of classic TAMER.
format	Article
id	doaj-art-777708c61ce94946a7028a593ecae4e6
institution	OA Journals
issn	2334-0754 2334-0762
language	English
publishDate	2021-04-01
publisher	LibraryPress@UF
record_format	Article
series	Proceedings of the International Florida Artificial Intelligence Research Society Conference
spelling	doaj-art-777708c61ce94946a7028a593ecae4e62025-08-20T01:52:19ZengLibraryPress@UFProceedings of the International Florida Artificial Intelligence Research Society Conference2334-07542334-07622021-04-013410.32473/flairs.v34i1.12847162865Bias Adaptive Statistical Inference Learning Agents for Learning from Human FeedbackJonathan I Watson0University of KentuckyWe present a novel technique for learning behaviors from a human-provided feedback signal that is distorted by systematic bias. Our technique, which we refer to as BASIL, models the feedback signal as being separable into a heuristic evaluation of the utility of an action and a bias value that is drawn probabilistically from a parametric distribution, where the distribution is defined by unknown parameters. We present the general form of the technique as well as a specific algorithm for integrating the technique with the TAMER algorithm for bias values drawn from a normal distribution. We test our algorithm against standard TAMER in the domain of Tetris using a synthetic oracle that provides feedback under varying levels of distortion. We find our algorithm can learn very quickly under bias distortions that entirely stymie the learning of classic TAMER.https://journals.flvc.org/FLAIRS/article/view/128471interactive machine learningimlinteractive reinforcement learningirlbiashuman factorsbayesiantamertetris
spellingShingle	Jonathan I Watson Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback Proceedings of the International Florida Artificial Intelligence Research Society Conference interactive machine learning iml interactive reinforcement learning irl bias human factors bayesian tamer tetris
title	Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback
title_full	Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback
title_fullStr	Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback
title_full_unstemmed	Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback
title_short	Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback
title_sort	bias adaptive statistical inference learning agents for learning from human feedback
topic	interactive machine learning iml interactive reinforcement learning irl bias human factors bayesian tamer tetris
url	https://journals.flvc.org/FLAIRS/article/view/128471
work_keys_str_mv	AT jonathaniwatson biasadaptivestatisticalinferencelearningagentsforlearningfromhumanfeedback
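
The description in this record models feedback as a utility evaluation plus a bias value drawn from a normal distribution with unknown parameters. The sketch below illustrates only that separability idea on a toy problem; it is not the BASIL or TAMER algorithm from the paper. The one-dimensional linear utility, the function names, and all numeric settings are assumptions made for illustration.

```python
import random
import statistics


def simulate_feedback(xs, true_weight, bias_mean, bias_std, rng):
    """Hypothetical oracle: feedback = linear utility + normally distributed bias."""
    return [true_weight * x + rng.gauss(bias_mean, bias_std) for x in xs]


def fit_without_bias(xs, fs):
    """Least squares through the origin: treats all feedback as utility."""
    return sum(x * f for x, f in zip(xs, fs)) / sum(x * x for x in xs)


def fit_with_bias(xs, fs):
    """Least squares with an intercept: the intercept absorbs the bias mean."""
    mean_x = statistics.fmean(xs)
    mean_f = statistics.fmean(fs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxf = sum((x - mean_x) * (f - mean_f) for x, f in zip(xs, fs))
    weight = sxf / sxx
    bias_mean_est = mean_f - weight * mean_x
    return weight, bias_mean_est


# Assumed toy settings: utility weight 2.0, bias drawn from N(5.0, 0.5^2).
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(2000)]
fs = simulate_feedback(xs, true_weight=2.0, bias_mean=5.0, bias_std=0.5, rng=rng)

naive_weight = fit_without_bias(xs, fs)
weight, bias_mean_est = fit_with_bias(xs, fs)

print(f"weight ignoring bias:   {naive_weight:.2f}")
print(f"weight modelling bias:  {weight:.2f}")
print(f"estimated bias mean:    {bias_mean_est:.2f}")
```

In this toy setting, the fit that ignores the bias folds the bias mean into the utility weight, while the fit with a separate bias term recovers a weight close to the true value. Under these assumptions, that is the kind of distortion a bias-aware learner is meant to remove.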

Similar Items