Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation.

Reconstruction of perceptual experiences from brain activity offers a unique window into how population neural responses represent sensory information. Although decoding visual content from functional MRI (fMRI) has seen significant success, reconstructing arbitrary sounds remains challenging due to the fine temporal structure of auditory signals and the coarse temporal resolution of fMRI. Drawing on the hierarchical auditory features of deep neural networks (DNNs), which have progressively larger time windows and correspond to neural activity, we introduce a method for sound reconstruction that integrates brain decoding of DNN features with an audio-generative model. DNN features decoded from auditory cortical activity outperformed spectrotemporal and modulation-based features, enabling perceptually plausible reconstructions across diverse sound categories. Behavioral evaluations and objective measures confirmed that these reconstructions preserved short-term spectral and perceptual properties, capturing the characteristic timbre of speech, animal calls, and musical instruments, while the reconstructed sounds did not reproduce longer temporal sequences with fidelity. Leave-category-out analyses indicated that the method generalizes across sound categories. Reconstructions at higher DNN layers and from early auditory regions revealed distinct contributions to decoding performance. Applying the model to a selective auditory attention ("cocktail party") task further showed that reconstructions reflected the attended sound more strongly than the unattended one in some subjects. Despite its inability to reconstruct exact temporal sequences, which may reflect the limited temporal resolution of fMRI, our framework demonstrates the feasibility of mapping brain activity to auditory experiences: a step toward a more comprehensive understanding and reconstruction of internal auditory representations.
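The abstract describes a two-stage pipeline: decoding DNN audio features from fMRI activity, then inverting those features to a waveform with an audio-generative model. A minimal conceptual sketch of the first (decoding) stage, using synthetic data and a plain ridge-regression decoder (the shapes, regularization, and linear model here are illustrative assumptions, not the paper's actual models or data):

```python
import numpy as np

# Synthetic stand-in data (hypothetical shapes): fMRI voxel patterns and
# DNN audio features for a set of training stimuli.
rng = np.random.default_rng(0)
n_train, n_voxels, n_feats = 500, 100, 64
W_true = rng.normal(size=(n_voxels, n_feats))
X_train = rng.normal(size=(n_train, n_voxels))  # brain activity
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, n_feats))  # DNN features

# Stage 1: ridge-regression decoder from voxel patterns to DNN features
# (a common choice for fMRI feature decoding; the paper's exact decoder
# is not reproduced here).
lam = 1.0
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(n_voxels),
                    X_train.T @ Y_train)

# Decode DNN features for held-out brain data.
X_test = rng.normal(size=(10, n_voxels))
Y_pred = X_test @ W

# Stage 2 (not implemented): feed Y_pred to an audio-generative model
# that inverts the decoded DNN features back into a waveform.
print(Y_pred.shape)  # prints (10, 64)
```

The decoded feature matrix `Y_pred`, not the raw voxel data, is what the generative stage would consume; this separation is what lets the same decoder serve sounds from categories unseen during training.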


Bibliographic Details
Main Authors: Jong-Yun Park, Mitsuaki Tsukamoto, Misato Tanaka, Yukiyasu Kamitani
Format: Article
Language: English
Published: Public Library of Science (PLoS), 2025-07-01
Series: PLoS Biology
Online Access: https://doi.org/10.1371/journal.pbio.3003293
ISSN: 1544-9173, 1545-7885