Garbage In, Flowers Out: Noisy Training Data Help Generative Models at Test Time

Despite important progress, conversational systems often generate dialogues that sound unnatural to humans. We conjecture that the reason lies in the different training and testing conditions: agents are trained in a controlled “lab” setting but tested in the “wild”. During training, they learn to utter a sentence given the ground-truth dialogue history produced by human annotators; during testing, by contrast, the agents must interact with each other and hence deal with noisy data. We propose to close this gap between the training and testing environments by training the model with mixed batches containing samples of both human- and machine-generated dialogues. We assess the validity of the proposed method on GuessWhat?!, a visual referential game. Our method improves the linguistic quality of the generated dialogues and leads to higher accuracy on the guessing task, whereas simple perturbations of the ground-truth dialogue history that mimic machine-generated data do not yield a comparable improvement. Finally, we run a human evaluation experiment on a sample of machine-machine dialogues to complement the quantitative analysis. The experiment shows that human annotators, too, can successfully exploit dialogues generated by a model trained with mixed batches to solve the task; hence, mixed-batch training does not cause language drift. Moreover, the new training regime makes human annotators significantly more confident when selecting the target object, showing that the generated dialogues are informative.
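
As a rough, hypothetical sketch of the mixed-batch idea summarised in the abstract above (not the authors' implementation): each training batch combines ground-truth dialogue histories written by annotators with noisy histories produced by letting the agents play the game themselves. All names below (generate_dialogue, make_mixed_batch, machine_ratio) and the toy data are illustrative assumptions.

    import random

    def generate_dialogue(image, target):
        """Placeholder for self-play: in the real setting, the questioner and
        answerer models would interact to produce a (possibly noisy)
        machine-generated dialogue about the image."""
        return [("is it a person?", "no"), ("is it the red object?", "yes")]

    def make_mixed_batch(human_samples, batch_size, machine_ratio=0.5):
        """Build one training batch that mixes ground-truth (human) dialogue
        histories with machine-generated ones, so that training conditions
        resemble the noisy self-play setting encountered at test time."""
        batch = []
        for _ in range(batch_size):
            image, target, human_dialogue = random.choice(human_samples)
            if random.random() < machine_ratio:
                dialogue = generate_dialogue(image, target)  # noisy, model-made history
            else:
                dialogue = human_dialogue  # clean, annotator-made history
            batch.append({"image": image, "target": target, "dialogue": dialogue})
        return batch

    # Toy usage: each sample is (image_id, target_object, human_dialogue).
    corpus = [("img_001", "cat", [("is it an animal?", "yes")]),
              ("img_002", "mug", [("is it on the table?", "yes")])]
    print(make_mixed_batch(corpus, batch_size=4, machine_ratio=0.5))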

Bibliographic Details
Main Authors: Alberto Testoni, Raffaella Bernardi
Format: Article
Language: English
Published: Accademia University Press 2022-07-01
Series: IJCoL
ISSN: 2499-4553
DOI: 10.4000/ijcol.974
Online Access: https://journals.openedition.org/ijcol/974