Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classification
Main Authors: | Qingyu Wang, Duzhen Zhang, Xinyuan Cai, Tielin Zhang, Bo Xu |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-01-01 |
Series: | Frontiers in Neuroscience |
Subjects: | spiking neural network; transformer; Fourier/Wavelet transform; visual classification; computational efficiency |
Online Access: | https://www.frontiersin.org/articles/10.3389/fnins.2024.1516868/full |
author | Qingyu Wang, Duzhen Zhang, Xinyuan Cai, Tielin Zhang, Bo Xu |
collection | DOAJ |
description | The energy-efficient spikformer has been proposed by integrating the biologically plausible spiking neural network (SNN) with the artificial transformer, whereby spiking self-attention (SSA) is used to achieve both higher accuracy and lower computational cost. However, self-attention may not always be necessary, especially under sparse, spike-form computation. In this article, we replace vanilla SSA (which uses dynamic bases calculated from Query and Key) with the spike-form Fourier transform, the wavelet transform, and their combinations (which use fixed trigonometric or wavelet bases), based on the key hypothesis that both rely on a set of basis functions for information transformation. The resulting Fourier-or-Wavelet-based spikformer (FWformer) is proposed and verified on visual classification tasks, covering both static-image and event-based video datasets. Compared to the standard spikformer, the FWformer achieves comparable or even higher accuracy (by 0.4%–1.5%), higher running speed (9%–51% faster in training and 19%–70% faster in inference), reduced theoretical energy consumption (by 20%–25%), and reduced graphics processing unit (GPU) memory usage (by 4%–26%). Our results indicate that the continued refinement of transformers inspired either by biological discovery (spike-form computation) or by information theory (Fourier or wavelet transforms) is promising. |
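The abstract's core idea, that a fixed spectral basis can stand in for the dynamic Query/Key bases of spiking self-attention, can be sketched compactly. Below is a minimal PyTorch illustration, not the authors' released code: the spike tensor is mixed along the token and feature dimensions with a fixed discrete Fourier transform (FNet-style token mixing), and the real part is re-binarized by a thresholded spiking nonlinearity with a surrogate gradient. The names `FourierMixer` and `SpikeFn`, the rectangular surrogate gradient, and the threshold value 0.5 are illustrative assumptions, not details from the paper; a wavelet variant would swap the DFT for a fixed wavelet decomposition.

```python
import torch
import torch.nn as nn


class SpikeFn(torch.autograd.Function):
    """Heaviside spike with a rectangular surrogate gradient (an assumed
    neuron model; the paper's exact spiking dynamics may differ)."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x >= 0.0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Let gradients pass only near the firing threshold.
        return grad_out * (x.abs() < 0.5).float()


class FourierMixer(nn.Module):
    """Stand-in for spiking self-attention: mixes tokens with a fixed 2D DFT
    (FNet-style) instead of dynamic Query/Key bases, then re-binarizes."""

    def __init__(self, threshold: float = 0.5):
        super().__init__()
        self.threshold = threshold  # illustrative firing threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (T, B, N, D) spike tensor: time, batch, tokens, features.
        y = torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real
        return SpikeFn.apply(y - self.threshold)  # back to {0, 1} spikes


# Toy usage: a sparse binary tensor in place of real patch-embedding spikes.
spikes = (torch.rand(4, 2, 16, 32) > 0.8).float()
out = FourierMixer()(spikes)
print(out.shape)  # torch.Size([4, 2, 16, 32]); values stay in {0., 1.}
```

Because the basis is fixed, no Query/Key products are computed, which is consistent with the speed, energy, and GPU-memory savings the abstract reports relative to standard SSA.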
format | Article |
id | doaj-art-1b87288d6bd247bbbf33061fb3404d81 |
institution | Kabale University |
issn | 1662-453X |
language | English |
publishDate | 2025-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Neuroscience |
spelling | Frontiers Media S.A., Frontiers in Neuroscience, ISSN 1662-453X, vol. 18, 2025-01-01, article 1516868, DOI 10.3389/fnins.2024.1516868 (record doaj-art-1b87288d6bd247bbbf33061fb3404d81, indexed 2025-01-29T06:46:15Z). Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classification. Authors and affiliations: Qingyu Wang (Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China); Duzhen Zhang (Institute of Automation, Chinese Academy of Sciences, Beijing, China); Xinyuan Cai (Institute of Automation, Chinese Academy of Sciences, Beijing, China); Tielin Zhang (Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China); Bo Xu (Institute of Automation, Chinese Academy of Sciences, Beijing, China). https://www.frontiersin.org/articles/10.3389/fnins.2024.1516868/full |
title | Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classification |
topic | spiking neural network; transformer; Fourier/Wavelet transform; visual classification; computational efficiency |
url | https://www.frontiersin.org/articles/10.3389/fnins.2024.1516868/full |