An action decoding framework combined with deep neural network for predicting the semantics of human actions in videos from evoked brain activities
Introduction: Recently, numerous studies have focused on the semantic decoding of perceived images based on functional magnetic resonance imaging (fMRI) activities. However, it remains unclear whether relationships can be established between brain activities and the semantic features of human actions in video stimuli. Here we construct a framework for decoding action semantics by establishing such relationships.

Methods: To make effective use of the small amount of available brain activity data, the proposed method employs an action recognition network pre-trained on video data and based on the expanding three-dimensional (X3D) deep neural network (DNN) framework. To apply brain activities to the action recognition network, we train regression models that learn the relationship between brain activities and deep-layer features. To improve decoding accuracy, we add a non-local attention module to the X3D model to capture long-range temporal and spatial dependencies, propose a multilayer perceptron (MLP) module with a multi-task loss constraint to build a more accurate regression mapping, and perform data augmentation through linear interpolation to expand the dataset and reduce the impact of the small sample size.

Results and discussion: Our findings indicate that the features in the X3D DNN are biologically relevant and capture information useful for perception. The proposed method enriches semantic decoding models. We also conducted experiments with data from different subsets of brain regions known to process visual stimuli; the results suggest that semantic information about human actions is widespread across the entire visual cortex.
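The abstract describes two reusable ideas: a regression mapping from fMRI activity patterns to deep-layer DNN features, and data augmentation by linearly interpolating between samples. As a rough illustration only (the paper's actual regressor, network layers, and data shapes are not specified in this record; all names, shapes, and the choice of closed-form ridge regression here are assumptions), a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the paper's data (all shapes are hypothetical):
# X: fMRI activity patterns, one row per video clip (n_samples x n_voxels)
# Y: deep-layer features of the same clips from a pretrained video DNN
n_samples, n_voxels, n_features = 40, 200, 64
X = rng.normal(size=(n_samples, n_voxels))
Y = rng.normal(size=(n_samples, n_features))

def augment_by_interpolation(X, Y, alpha=0.5):
    """Augment by linear interpolation between consecutive sample pairs."""
    X_new = alpha * X[:-1] + (1 - alpha) * X[1:]
    Y_new = alpha * Y[:-1] + (1 - alpha) * Y[1:]
    return np.vstack([X, X_new]), np.vstack([Y, Y_new])

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression: W maps brain activity to DNN features."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

X_aug, Y_aug = augment_by_interpolation(X, Y)   # 40 -> 79 training samples
W = fit_ridge(X_aug, Y_aug)                     # (n_voxels x n_features)
Y_pred = X @ W                                  # predicted deep-layer features
```

The predicted features `Y_pred` would then be fed into the downstream action recognition network for semantic decoding; the paper's MLP module with a multi-task loss would replace the simple ridge step here.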
| Main Authors: | Yuanyuan Zhang, Manli Tian, Baolin Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-02-01 |
| Series: | Frontiers in Neuroinformatics |
| Subjects: | functional magnetic resonance imaging; decoding; action semantic; three-dimension convolutional neural network; multi-subject model |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fninf.2025.1526259/full |
| author | Yuanyuan Zhang; Manli Tian; Baolin Liu |
|---|---|
| collection | DOAJ |
| description | Introduction: Recently, numerous studies have focused on the semantic decoding of perceived images based on functional magnetic resonance imaging (fMRI) activities. However, it remains unclear whether relationships can be established between brain activities and the semantic features of human actions in video stimuli. Here we construct a framework for decoding action semantics by establishing such relationships. Methods: To make effective use of the small amount of available brain activity data, the proposed method employs an action recognition network pre-trained on video data and based on the expanding three-dimensional (X3D) deep neural network (DNN) framework. To apply brain activities to the action recognition network, we train regression models that learn the relationship between brain activities and deep-layer features. To improve decoding accuracy, we add a non-local attention module to the X3D model to capture long-range temporal and spatial dependencies, propose a multilayer perceptron (MLP) module with a multi-task loss constraint to build a more accurate regression mapping, and perform data augmentation through linear interpolation to expand the dataset and reduce the impact of the small sample size. Results and discussion: Our findings indicate that the features in the X3D DNN are biologically relevant and capture information useful for perception. The proposed method enriches semantic decoding models. We also conducted experiments with data from different subsets of brain regions known to process visual stimuli; the results suggest that semantic information about human actions is widespread across the entire visual cortex. |
| format | Article |
| id | doaj-art-6db79ea8334b442394386e6e00f8eee7 |
| institution | DOAJ |
| issn | 1662-5196 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Neuroinformatics |
| title | An action decoding framework combined with deep neural network for predicting the semantics of human actions in videos from evoked brain activities |
| topic | functional magnetic resonance imaging; decoding; action semantic; three-dimension convolutional neural network; multi-subject model |
| url | https://www.frontiersin.org/articles/10.3389/fninf.2025.1526259/full |