When Text and Speech are Not Enough: A Multimodal Dataset of Collaboration in a Situated Task

To adequately model the information exchanged in real human-human interactions, speech or text alone is not enough: many critical modalities are left out. The channels that contribute to the “making of sense” in human-human interactions include, but are not limited to, gesture, speech, user-interaction modeling, gaze, joint attention, and involvement/engagement, all of which must be modeled adequately to extract correct and meaningful information automatically. In this paper, we present a multimodal dataset of a novel situated, shared collaborative task, with the above channels annotated to encode these different aspects of the participants’ situated and embodied involvement in the joint activity.

Bibliographic Details
Main Authors: Ibrahim Khebour, Richard Brutti, Indrani Dey, Rachel Dickler, Kelsey Sikes, Kenneth Lai, Mariah Bradford, Brittany Cates, Paige Hansen, Changsoo Jung, Brett Wisniewski, Corbyn Terpstra, Leanne Hirshfield, Sadhana Puntambekar, Nathaniel Blanchard, James Pustejovsky, Nikhil Krishnaswamy
Format: Article
Language: English
Published: Ubiquity Press, 2024-01-01
Series: Journal of Open Humanities Data
ISSN: 2059-481X
DOI: 10.5334/johd.168
Subjects: multimodal interaction; collaboration; problem solving; situated tasks
Online Access: https://account.openhumanitiesdata.metajnl.com/index.php/up-j-johd/article/view/168

Author Details
Ibrahim Khebour (https://orcid.org/0009-0009-4374-7263), Department of Computer Science, Colorado State University, Fort Collins, CO
Richard Brutti (https://orcid.org/0000-0003-0449-4418), Department of Computer Science, Brandeis University, Waltham, MA
Indrani Dey (https://orcid.org/0009-0000-9284-6078), Department of Educational Psychology, University of Wisconsin – Madison, Madison, WI
Rachel Dickler (https://orcid.org/0000-0002-9018-4848), Institute of Cognitive Science, University of Colorado, Boulder, CO
Kelsey Sikes (https://orcid.org/0009-0003-9711-920X), Department of Computer Science, Colorado State University, Fort Collins, CO
Kenneth Lai (https://orcid.org/0000-0003-2870-7019), Department of Computer Science, Brandeis University, Waltham, MA
Mariah Bradford (https://orcid.org/0009-0009-2162-3307), Department of Computer Science, Colorado State University, Fort Collins, CO
Brittany Cates (https://orcid.org/0009-0000-4169-0616), Department of Computer Science, Colorado State University, Fort Collins, CO
Paige Hansen (https://orcid.org/0009-0009-7350-2312), Department of Computer Science, Colorado State University, Fort Collins, CO
Changsoo Jung (https://orcid.org/0000-0002-2232-4300), Department of Computer Science, Colorado State University, Fort Collins, CO
Brett Wisniewski (https://orcid.org/0009-0005-1236-069X), Department of Computer Science, Colorado State University, Fort Collins, CO
Corbyn Terpstra (https://orcid.org/0009-0006-7005-8437), Department of Computer Science, Colorado State University, Fort Collins, CO
Leanne Hirshfield (https://orcid.org/0000-0003-0111-6948), Institute of Cognitive Science, University of Colorado, Boulder, CO
Sadhana Puntambekar (https://orcid.org/0000-0002-7102-0127), Department of Educational Psychology, University of Wisconsin – Madison, Madison, WI
Nathaniel Blanchard (https://orcid.org/0000-0002-2653-0873), Department of Computer Science, Colorado State University, Fort Collins, CO
James Pustejovsky (https://orcid.org/0000-0003-2233-9761), Department of Computer Science, Brandeis University, Waltham, MA
Nikhil Krishnaswamy (https://orcid.org/0000-0001-7878-7227), Department of Computer Science, Colorado State University, Fort Collins, CO