From pronounced to imagined: improving speech decoding with multi-condition EEG data


Bibliographic Details
Main Authors: Denise Alonso-Vázquez, Omar Mendoza-Montoya, Ricardo Caraza, Hector R. Martinez, Javier M. Antelis
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-06-01
Series: Frontiers in Neuroinformatics
Subjects: imagined speech classification; EEG-based classification; overt speech; EEGNET; brain-computer interfaces
Online Access: https://www.frontiersin.org/articles/10.3389/fninf.2025.1583428/full
author Denise Alonso-Vázquez
Omar Mendoza-Montoya
Ricardo Caraza
Hector R. Martinez
Javier M. Antelis
author_sort Denise Alonso-Vázquez
collection DOAJ
description Introduction: Imagined speech decoding using EEG holds promising applications for individuals with motor neuron diseases, although its performance remains limited due to small dataset sizes and the absence of sensory feedback. Here, we investigated whether incorporating EEG data from overt (pronounced) speech could enhance imagined speech classification.
Methods: Our approach systematically compares four classification scenarios by modifying the training dataset: three intra-subject scenarios (using only imagined speech, combining overt and imagined speech, and using only overt speech) and one multi-subject scenario (combining overt speech data from different participants with the imagined speech of the target participant). We implemented all scenarios using the convolutional neural network EEGNet. To this end, twenty-four healthy participants pronounced and imagined five Spanish words.
Results: In binary word-pair classifications, combining overt and imagined speech data in the intra-subject scenario led to accuracy improvements of 3%–5.17% in four of the 10 word pairs, compared to training with imagined speech only. Although the highest individual accuracy (95%) was achieved with imagined speech alone, the inclusion of overt speech data allowed more participants to surpass 70% accuracy, increasing from 10 (imagined only) to 15 participants. In the intra-subject multi-class scenario, combining overt and imagined speech did not yield statistically significant improvements over using imagined speech exclusively.
Discussion: We observed that features such as word length, phonological complexity, and frequency of use contributed to higher discriminability between certain imagined word pairs. These findings suggest that incorporating overt speech data can improve imagined speech decoding in individualized models, offering a feasible strategy to support the early adoption of brain-computer interfaces before speech deterioration occurs in individuals with motor neuron diseases.
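The Methods section above compares training sets built from imagined trials, overt trials, or their combination. A minimal sketch of that dataset-assembly step is shown below; the array shapes, channel count, trial counts, and the `build_training_set` function are illustrative assumptions, not details from the paper, and the classifier itself (EEGNet) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical EEG trials shaped (n_trials, n_channels, n_samples),
# the layout EEGNet-style models typically consume.
n_channels, n_samples = 8, 128
imagined_X = rng.standard_normal((40, n_channels, n_samples))
imagined_y = rng.integers(0, 2, size=40)   # two imagined words (a word pair)
overt_X = rng.standard_normal((40, n_channels, n_samples))
overt_y = rng.integers(0, 2, size=40)      # same words, pronounced aloud

def build_training_set(scenario, imagined=(imagined_X, imagined_y),
                       overt=(overt_X, overt_y)):
    """Assemble the training set for one intra-subject scenario."""
    if scenario == "imagined_only":
        return imagined
    if scenario == "overt_only":
        return overt
    if scenario == "combined":
        # Pool overt and imagined trials into a single training set.
        X = np.concatenate([imagined[0], overt[0]], axis=0)
        y = np.concatenate([imagined[1], overt[1]], axis=0)
        return X, y
    raise ValueError(f"unknown scenario: {scenario}")

X_train, y_train = build_training_set("combined")
print(X_train.shape)  # (80, 8, 128)
```

The multi-subject scenario described in the abstract would extend the same idea by concatenating overt trials from other participants before adding the target participant's imagined trials; evaluation in all cases is on held-out imagined trials only.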
format Article
id doaj-art-5fec34d1c7924dc2a3cb0b5398a8d4a6
institution Kabale University
issn 1662-5196
language English
publishDate 2025-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Neuroinformatics
spelling doaj-art-5fec34d1c7924dc2a3cb0b5398a8d4a6 2025-08-20T03:24:47Z
Frontiers Media S.A. | Frontiers in Neuroinformatics | ISSN 1662-5196 | 2025-06-01 | Vol. 19 | doi:10.3389/fninf.2025.1583428 | Article 1583428
From pronounced to imagined: improving speech decoding with multi-condition EEG data
Denise Alonso-Vázquez (Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Monterrey, Mexico)
Omar Mendoza-Montoya (Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Monterrey, Mexico)
Ricardo Caraza (Escuela de Medicina y Ciencias de la Salud, Tecnologico de Monterrey, Monterrey, Mexico)
Hector R. Martinez (Escuela de Medicina y Ciencias de la Salud, Tecnologico de Monterrey, Monterrey, Mexico)
Javier M. Antelis (Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Monterrey, Mexico)
https://www.frontiersin.org/articles/10.3389/fninf.2025.1583428/full
Keywords: imagined speech classification; EEG-based classification; overt speech; EEGNET; brain-computer interfaces
title From pronounced to imagined: improving speech decoding with multi-condition EEG data
topic imagined speech classification
EEG-based classification
overt speech
EEGNET
brain-computer interfaces
url https://www.frontiersin.org/articles/10.3389/fninf.2025.1583428/full