Towards an ‘Everything Corpus’: A Framework and Guidelines for the Curation of More Comprehensive Multimodal Music Data
Music information retrieval (MIR) is increasingly concerned with properly managing the complexity of musical data and the curation of high-quality multimodal datasets for use in a variety of computational tasks. This article presents (1) a conceptual framework for how practitioners interested in MIR...
| Main Authors: | Mark Gotham, Brian Bemman, Igor Vatolkin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Ubiquity Press, 2025-05-01 |
| Series: | Transactions of the International Society for Music Information Retrieval |
| Subjects: | multimodal; music; information retrieval; dataset; evaluation; review |
| Online Access: | https://account.transactions.ismir.net/index.php/up-j-tismir/article/view/228 |
| _version_ | 1849469517458046976 |
|---|---|
| author | Mark Gotham; Brian Bemman; Igor Vatolkin |
| collection | DOAJ |
| description | Music information retrieval (MIR) is increasingly concerned with properly managing the complexity of musical data and the curation of high-quality multimodal datasets for use in a variety of computational tasks. This article presents (1) a conceptual framework for how practitioners interested in MIR, from musicians to scientists, can understand the multitude of modalities that constitute musical data and (2) a set of proposed guidelines for MIR researchers to consider when setting out to curate comprehensive, well-targeted, durable, and ethically sourced multimodal datasets. For (1), we identify 12 themes of musical data, divided into three sequential phases that are further subdivided into five narrow focus areas: (i) ‘before’ the music (leading to), (ii) the ‘actual’ music (itself and around it), and (iii) ‘after’ the music (uses of and responses to). For (2), we identify 17 specific quantitative, qualitative, and ethical criteria, informed by this conceptual framework and practices observed in existing multimodal datasets, for the eventual construction of an ‘Everything Corpus’ for MIR research. |
| format | Article |
| id | doaj-art-8854e97f7c584b1fa02eaa8f6caab593 |
| institution | Kabale University |
| issn | 2514-3298 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Ubiquity Press |
| record_format | Article |
| series | Transactions of the International Society for Music Information Retrieval |
| spelling | doaj-art-8854e97f7c584b1fa02eaa8f6caab593; 2025-08-20T03:25:27Z; eng; Ubiquity Press; Transactions of the International Society for Music Information Retrieval; ISSN 2514-3298; 2025-05-01; vol. 8, no. 1, pp. 70–92; DOI 10.5334/tismir.228; Towards an ‘Everything Corpus’: A Framework and Guidelines for the Curation of More Comprehensive Multimodal Music Data; Mark Gotham (https://orcid.org/0000-0003-0722-3074), King’s College London, London; Brian Bemman (https://orcid.org/0000-0001-7189-7896), Durham University, Durham; Igor Vatolkin (https://orcid.org/0000-0002-9454-9402), RWTH Aachen University, Aachen; https://account.transactions.ismir.net/index.php/up-j-tismir/article/view/228; keywords: multimodal, music, information retrieval, dataset, evaluation, review |
| title | Towards an ‘Everything Corpus’: A Framework and Guidelines for the Curation of More Comprehensive Multimodal Music Data |
| topic | multimodal; music; information retrieval; dataset; evaluation; review |
| url | https://account.transactions.ismir.net/index.php/up-j-tismir/article/view/228 |