Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation.
Reconstruction of perceptual experiences from brain activity offers a unique window into how population neural responses represent sensory information. Although decoding visual content from functional MRI (fMRI) has seen significant success, reconstructing arbitrary sounds remains challenging due to the fine temporal structure of auditory signals and the coarse temporal resolution of fMRI.
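The core decoding step the abstract describes, predicting DNN-layer features of a heard sound from fMRI voxel patterns with a trained linear model, can be sketched roughly as follows. The array shapes, the synthetic data, and the `fit_ridge` helper are illustrative assumptions, not the authors' actual pipeline; in the study, the decoded features would then condition an audio-generative model.

```python
import numpy as np

# Hypothetical shapes: fMRI responses (trials x voxels) paired with
# DNN-layer features of the sounds heard on those trials (trials x units).
rng = np.random.default_rng(0)
n_train, n_voxels, n_feats = 300, 100, 64
X = rng.standard_normal((n_train, n_voxels))               # fMRI voxel patterns
W_true = rng.standard_normal((n_voxels, n_feats))          # unknown mapping (for synthesis only)
Y = X @ W_true + 0.1 * rng.standard_normal((n_train, n_feats))  # DNN features

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: one regularized linear decoder per DNN unit."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

W = fit_ridge(X, Y)

# Decode DNN features for held-out trials; these decoded features would
# then be fed to a generative model to synthesize the reconstructed sound.
X_test = rng.standard_normal((10, n_voxels))
decoded = X_test @ W
```

On synthetic data like this, the decoded features correlate strongly with the true features; with real fMRI data, decoding accuracy is far lower and layer-dependent, which is part of what the paper evaluates.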
| Main Authors: | Jong-Yun Park, Mitsuaki Tsukamoto, Misato Tanaka, Yukiyasu Kamitani |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-07-01 |
| Series: | PLoS Biology |
| Online Access: | https://doi.org/10.1371/journal.pbio.3003293 |
| _version_ | 1849240424572518400 |
|---|---|
| author | Jong-Yun Park; Mitsuaki Tsukamoto; Misato Tanaka; Yukiyasu Kamitani |
| author_facet | Jong-Yun Park; Mitsuaki Tsukamoto; Misato Tanaka; Yukiyasu Kamitani |
| author_sort | Jong-Yun Park |
| collection | DOAJ |
| description | Reconstruction of perceptual experiences from brain activity offers a unique window into how population neural responses represent sensory information. Although decoding visual content from functional MRI (fMRI) has seen significant success, reconstructing arbitrary sounds remains challenging due to the fine temporal structure of auditory signals and the coarse temporal resolution of fMRI. Drawing on the hierarchical auditory features of deep neural networks (DNNs) with progressively larger time windows and their neural activity correspondence, we introduce a method for sound reconstruction that integrates brain decoding of DNN features and an audio-generative model. DNN features decoded from auditory cortical activity outperformed spectrotemporal and modulation-based features, enabling perceptually plausible reconstructions across diverse sound categories. Behavioral evaluations and objective measures confirmed that these reconstructions preserved short-term spectral and perceptual properties, capturing the characteristic timbre of speech, animal calls, and musical instruments, while the reconstructed sounds did not reproduce longer temporal sequences with fidelity. Leave-category-out analyses indicated that the method generalizes across sound categories. Reconstructions at higher DNN layers and from early auditory regions revealed distinct contributions to decoding performance. Applying the model to a selective auditory attention ("cocktail party") task further showed that reconstructions reflected the attended sound more strongly than the unattended one in some of the subjects. Despite its inability to reconstruct exact temporal sequences, which may reflect the limited temporal resolution of fMRI, our framework demonstrates the feasibility of mapping brain activity to auditory experiences: a step toward more comprehensive understanding and reconstruction of internal auditory representations. |
| format | Article |
| id | doaj-art-4c68a97e22ef488298ef946b3341a0e0 |
| institution | Kabale University |
| issn | 1544-9173; 1545-7885 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS Biology |
| title | Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation. |
| url | https://doi.org/10.1371/journal.pbio.3003293 |