Recurrent neural networks as neuro-computational models of human speech recognition.
Human speech recognition transforms a continuous acoustic signal into categorical linguistic units, by aggregating information that is distributed in time. It has been suggested that this kind of information processing may be understood through the computations of a Recurrent Neural Network (RNN) th...
| Main Authors: | Christian Brodbeck, Thomas Hannagan, James S Magnuson |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-07-01 |
| Series: | PLoS Computational Biology |
| Online Access: | https://doi.org/10.1371/journal.pcbi.1013244 |
| _version_ | 1849228221214621696 |
|---|---|
| author | Christian Brodbeck Thomas Hannagan James S Magnuson |
| author_facet | Christian Brodbeck Thomas Hannagan James S Magnuson |
| author_sort | Christian Brodbeck |
| collection | DOAJ |
| description | Human speech recognition transforms a continuous acoustic signal into categorical linguistic units, by aggregating information that is distributed in time. It has been suggested that this kind of information processing may be understood through the computations of a Recurrent Neural Network (RNN) that receives input frame by frame, linearly in time, but builds an incremental representation of this input through a continually evolving internal state. While RNNs can simulate several key behavioral observations about human speech and language processing, it is unknown whether RNNs also develop computational dynamics that resemble human neural speech processing. Here we show that the internal dynamics of long short-term memory (LSTM) RNNs, trained to recognize speech from auditory spectrograms, predict human neural population responses to the same stimuli, beyond predictions from auditory features. Variations in the RNN architecture motivated by cognitive principles further improved this predictive power. Specifically, modifications that allow more human-like phonetic competition also led to more human-like temporal dynamics. Overall, our results suggest that RNNs provide plausible computational models of the cortical processes supporting human speech recognition. |
| format | Article |
| id | doaj-art-54e6c0b6ab154aeba983a2b4e0af9f4a |
| institution | Kabale University |
| issn | 1553-734X 1553-7358 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS Computational Biology |
| spelling | doaj-art-54e6c0b6ab154aeba983a2b4e0af9f4a 2025-08-23T05:31:13Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2025-07-01 21 7 e1013244 10.1371/journal.pcbi.1013244 Recurrent neural networks as neuro-computational models of human speech recognition. Christian Brodbeck Thomas Hannagan James S Magnuson Human speech recognition transforms a continuous acoustic signal into categorical linguistic units, by aggregating information that is distributed in time. It has been suggested that this kind of information processing may be understood through the computations of a Recurrent Neural Network (RNN) that receives input frame by frame, linearly in time, but builds an incremental representation of this input through a continually evolving internal state. While RNNs can simulate several key behavioral observations about human speech and language processing, it is unknown whether RNNs also develop computational dynamics that resemble human neural speech processing. Here we show that the internal dynamics of long short-term memory (LSTM) RNNs, trained to recognize speech from auditory spectrograms, predict human neural population responses to the same stimuli, beyond predictions from auditory features. Variations in the RNN architecture motivated by cognitive principles further improved this predictive power. Specifically, modifications that allow more human-like phonetic competition also led to more human-like temporal dynamics. Overall, our results suggest that RNNs provide plausible computational models of the cortical processes supporting human speech recognition. https://doi.org/10.1371/journal.pcbi.1013244 |
| spellingShingle | Christian Brodbeck Thomas Hannagan James S Magnuson Recurrent neural networks as neuro-computational models of human speech recognition. PLoS Computational Biology |
| title | Recurrent neural networks as neuro-computational models of human speech recognition. |
| title_full | Recurrent neural networks as neuro-computational models of human speech recognition. |
| title_fullStr | Recurrent neural networks as neuro-computational models of human speech recognition. |
| title_full_unstemmed | Recurrent neural networks as neuro-computational models of human speech recognition. |
| title_short | Recurrent neural networks as neuro-computational models of human speech recognition. |
| title_sort | recurrent neural networks as neuro computational models of human speech recognition |
| url | https://doi.org/10.1371/journal.pcbi.1013244 |
| work_keys_str_mv | AT christianbrodbeck recurrentneuralnetworksasneurocomputationalmodelsofhumanspeechrecognition AT thomashannagan recurrentneuralnetworksasneurocomputationalmodelsofhumanspeechrecognition AT jamessmagnuson recurrentneuralnetworksasneurocomputationalmodelsofhumanspeechrecognition |
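
The description field above characterizes an RNN that receives input frame by frame and builds an incremental representation through a continually evolving internal state. As a rough illustration of that idea only, the following is a minimal sketch of a standard LSTM update applied to a sequence of spectrogram-like frames. It is not the authors' model: the weights are random and untrained, bias terms are omitted for brevity, and all names (`make_weights`, `lstm_step`, `run_lstm`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_weights(n_in, n_hidden, seed=0):
    """Random, untrained weights for the four LSTM gates (illustrative only)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    # Each gate sees the current frame concatenated with the previous hidden state.
    return {gate: matrix(n_hidden, n_in + n_hidden) for gate in ("i", "f", "o", "g")}


def lstm_step(frame, h, c, w):
    """Consume one input frame and return the updated hidden and cell state."""
    x = frame + h  # list concatenation: [frame values, previous hidden state]

    def pre(gate, j):
        return sum(wk * xk for wk, xk in zip(w[gate][j], x))

    new_h, new_c = [], []
    for j in range(len(h)):
        i = sigmoid(pre("i", j))    # input gate
        f = sigmoid(pre("f", j))    # forget gate
        o = sigmoid(pre("o", j))    # output gate
        g = math.tanh(pre("g", j))  # candidate cell update
        cj = f * c[j] + i * g       # cell state carries information across frames
        new_c.append(cj)
        new_h.append(o * math.tanh(cj))
    return new_h, new_c


def run_lstm(frames, n_hidden, seed=0):
    """Process frames strictly in temporal order; return the hidden state per frame."""
    w = make_weights(len(frames[0]), n_hidden, seed)
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    states = []
    for frame in frames:
        h, c = lstm_step(frame, h, c, w)
        states.append(h)
    return states


# Three made-up "spectrogram" frames with three frequency bands each.
frames = [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.9, 0.1, 0.0]]
states = run_lstm(frames, n_hidden=4)
```

Because the state at each frame depends only on the current and earlier frames, changing a later frame leaves earlier states untouched. A time series of hidden states like `states` is the kind of internal dynamics that, per the description, was compared against neural responses; how that comparison was done is not specified in this record.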