Appropriate data segmentation improves speech encoding models: Analysis and simulation of electrophysiological recordings.
Background: In recent decades, studies modeling the neural processing of continuous, naturalistic speech have provided new insights into how speech and language are represented in the brain. However, the linear encoder models commonly used in such studies assume that the underlying da...
| Main Authors: | Ole Bialas, Edmund C Lalor |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0323276 |
| _version_ | 1850124878988967936 |
|---|---|
| author | Ole Bialas, Edmund C Lalor |
| author_facet | Ole Bialas, Edmund C Lalor |
| author_sort | Ole Bialas |
| collection | DOAJ |
| description | Background: In recent decades, studies modeling the neural processing of continuous, naturalistic speech have provided new insights into how speech and language are represented in the brain. However, the linear encoder models commonly used in such studies assume that the underlying data are stationary, varying to a fixed degree around a constant mean. Long, continuous neural recordings may violate this assumption, leading to impaired model performance. We aimed to examine the effect of non-stationary trends in continuous neural recordings on the performance of linear speech encoding models. Methods: We used temporal response functions (TRFs) to predict continuous neural responses to speech while splitting the data into segments of varying length prior to model fitting. Our hypothesis was that, if the data were non-stationary, segmentation should improve model performance by making individual segments approximately stationary. To test this hypothesis under a known ground truth, we simulated and predicted stationary and non-stationary recordings; to test it on actual neural recordings, we predicted the brain activity of participants who listened to a narrated story. Results: Simulations showed that, for stationary data, increasing segmentation steadily decreased model performance. For non-stationary data, however, segmentation initially improved model performance. Modeling of neural recordings yielded similar results: segments of intermediate length (5-15 s) led to improved model performance compared to very short (1-2 s) and very long (30-120 s) segments. Conclusions: We showed that data segmentation improves the performance of encoding models for both simulated and real neural data, and that this can be explained by the fact that shorter segments approximate stationarity more closely. Thus, the common practice of applying encoding models to long continuous segments of data is suboptimal, and recordings should be segmented prior to modeling. |
| format | Article |
| id | doaj-art-df1db4d9da8e40a286e85b2d9a29d365 |
| institution | OA Journals |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| spelling | doaj-art-df1db4d9da8e40a286e85b2d9a29d365 2025-08-20T02:34:13Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2025-01-01 205e0323276 10.1371/journal.pone.0323276 Appropriate data segmentation improves speech encoding models: Analysis and simulation of electrophysiological recordings. Ole Bialas, Edmund C Lalor https://doi.org/10.1371/journal.pone.0323276 |
| spellingShingle | Ole Bialas Edmund C Lalor Appropriate data segmentation improves speech encoding models: Analysis and simulation of electrophysiological recordings. PLoS ONE |
| title | Appropriate data segmentation improves speech encoding models: Analysis and simulation of electrophysiological recordings. |
| title_sort | appropriate data segmentation improves speech encoding models analysis and simulation of electrophysiological recordings |
| url | https://doi.org/10.1371/journal.pone.0323276 |
| work_keys_str_mv | AT olebialas appropriatedatasegmentationimprovesspeechencodingmodelsanalysisandsimulationofelectrophysiologicalrecordings AT edmundclalor appropriatedatasegmentationimprovesspeechencodingmodelsanalysisandsimulationofelectrophysiologicalrecordings |
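
The segmentation idea summarized in the description can be illustrated with a small simulation. The following is a minimal sketch and not the authors' pipeline: it assumes a white-noise stimulus, a made-up 16-lag kernel (`KERNEL`), a random-walk drift as the non-stationary trend, and a plain ridge regression whose covariance matrices are accumulated over segments that are demeaned individually. All function names and parameter values (`simulate`, `fit_trf`, `mean_trf_error`, `drift_step`, `seg_len`) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical ground-truth response function (16 lags); not taken from the paper.
KERNEL = np.exp(-np.arange(16) / 4) * np.sin(np.arange(16) / 2)


def lagged_matrix(x, n_lags):
    """Rows are [x[t], x[t-1], ..., x[t-n_lags+1]] for every t >= n_lags - 1."""
    n = len(x) - n_lags + 1
    return np.column_stack(
        [x[n_lags - 1 - k : n_lags - 1 - k + n] for k in range(n_lags)]
    )


def simulate(n, kernel, drift_step, rng):
    """White-noise stimulus, linear response, unit noise, optional random-walk drift."""
    x = rng.standard_normal(n)
    y = np.convolve(x, kernel)[:n] + rng.standard_normal(n)
    # drift_step = 0 gives a stationary recording; larger values add a slow trend.
    y = y + np.cumsum(drift_step * rng.standard_normal(n))
    return x, y


def fit_trf(x, y, n_lags, seg_len, lam=1e-3):
    """Ridge estimate of the response function from individually demeaned segments."""
    xtx = np.zeros((n_lags, n_lags))
    xty = np.zeros(n_lags)
    for start in range(0, len(x) - seg_len + 1, seg_len):
        X = lagged_matrix(x[start : start + seg_len], n_lags)
        Y = y[start + n_lags - 1 : start + seg_len]
        # Removing each segment's own mean is what makes short segments
        # approximately stationary in this sketch.
        X = X - X.mean(axis=0)
        Y = Y - Y.mean()
        xtx += X.T @ X
        xty += X.T @ Y
    return np.linalg.solve(xtx + lam * np.eye(n_lags), xty)


def mean_trf_error(seg_len, drift_step, n=10_000, n_runs=20, seed=0):
    """Squared error between estimated and true kernel, averaged over simulated runs."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_runs):
        x, y = simulate(n, KERNEL, drift_step, rng)
        w = fit_trf(x, y, len(KERNEL), seg_len)
        errors.append(np.sum((w - KERNEL) ** 2))
    return float(np.mean(errors))


if __name__ == "__main__":
    for drift_step in (0.0, 0.1):
        for seg_len in (10_000, 320, 32):
            err = mean_trf_error(seg_len, drift_step)
            print(f"drift_step={drift_step} seg_len={seg_len} error={err:.5f}")
```

Calling `mean_trf_error(320, 0.1)` and `mean_trf_error(10_000, 0.1)` compares a segmented fit with a single-segment fit on the same drifting data. The sketch measures recovery of a known kernel rather than prediction accuracy on held-out data, so it mirrors the logic of the simulation described above only loosely.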