An action decoding framework combined with deep neural network for predicting the semantics of human actions in videos from evoked brain activities


Bibliographic Details
Main Authors: Yuanyuan Zhang, Manli Tian, Baolin Liu
Format: Article
Language:English
Published: Frontiers Media S.A. 2025-02-01
Series:Frontiers in Neuroinformatics
Subjects: functional magnetic resonance imaging; decoding; action semantic; three-dimension convolutional neural network; multi-subject model
Online Access:https://www.frontiersin.org/articles/10.3389/fninf.2025.1526259/full
author Yuanyuan Zhang
Manli Tian
Baolin Liu
author_facet Yuanyuan Zhang
Manli Tian
Baolin Liu
author_sort Yuanyuan Zhang
collection DOAJ
description Introduction: Recently, numerous studies have focused on the semantic decoding of perceived images based on functional magnetic resonance imaging (fMRI) activity. However, it remains unclear whether relationships can be established between brain activities and the semantic features of human actions in video stimuli. Here we construct a framework for decoding action semantics by establishing relationships between brain activities and the semantic features of human actions. Methods: To make effective use of the small amount of available brain activity data, our method employs a pre-trained action recognition model based on the expanded 3D (X3D) deep neural network (DNN) framework. To apply brain activities to the action recognition network, we train regression models that learn the relationship between brain activities and deep-layer video features. To improve decoding accuracy, we add a non-local attention module to the X3D model to capture long-range temporal and spatial dependencies, propose a multilayer perceptron (MLP) module with a multi-task loss constraint to build a more accurate regression mapping, and perform data augmentation by linear interpolation to expand the dataset and reduce the impact of the small sample size. Results and discussion: Our findings indicate that the features in the X3D DNN are biologically relevant and capture information useful for perception, enriching the semantic decoding model. We also conducted experiments with data from different subsets of brain regions known to process visual stimuli; the results suggest that semantic information about human actions is widespread across the entire visual cortex.
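The abstract describes two generic ingredients that can be sketched concretely: data augmentation by linear interpolation between fMRI samples, and a regression mapping from voxel responses to deep-layer features. The sketch below is illustrative only, assuming NumPy and synthetic data; the paper's actual mapping is an MLP with a multi-task loss, so the closed-form ridge regression here stands in for the simpler linear-baseline idea, and all function names and dimensions are hypothetical.

```python
import numpy as np

def interpolate_augment(x1, x2, n_new=3):
    """Create n_new synthetic samples by linear interpolation between two
    fMRI response vectors (e.g., two trials of the same action class)."""
    alphas = np.linspace(0.0, 1.0, n_new + 2)[1:-1]  # interior points only
    return np.stack([(1.0 - a) * x1 + a * x2 for a in alphas])

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression mapping voxel responses X
    (n_samples, n_voxels) to deep-layer features Y (n_samples, n_feats)."""
    n_vox = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_vox), X.T @ Y)

# Toy demo: 40 "trials", 20 voxels, 8 feature dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 20))
W_true = rng.standard_normal((20, 8))
Y = X @ W_true                 # pretend DNN features depend linearly on voxels

W = fit_ridge(X, Y, lam=1e-6)  # near-unregularized recovers W_true closely
pred = X @ W                   # predicted deep features, fed to the decoder
```

In the paper's setting, `Y` would be activations from a deep layer of the pre-trained X3D network, and the predicted features would be compared against candidate action classes; this sketch only shows the shape of that pipeline.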
format Article
id doaj-art-6db79ea8334b442394386e6e00f8eee7
institution DOAJ
issn 1662-5196
language English
publishDate 2025-02-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Neuroinformatics
title An action decoding framework combined with deep neural network for predicting the semantics of human actions in videos from evoked brain activities
topic functional magnetic resonance imaging
decoding
action semantic
three-dimension convolutional neural network
multi-subject model
url https://www.frontiersin.org/articles/10.3389/fninf.2025.1526259/full