Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classification

Energy-efficient spikformer has been proposed by integrating the biologically plausible spiking neural network (SNN) and the artificial Transformer, whereby spiking self-attention (SSA) is used to achieve both higher accuracy and lower computational cost. However, self-attention is not always necessary, especially under sparse, spike-form computation. In this article, we replace vanilla SSA (which uses dynamic bases calculated from Query and Key) with spike-form Fourier transforms, wavelet transforms, and their combinations (which use fixed triangular or wavelet bases), based on the key hypothesis that both use a set of basis functions for information transformation. The resulting Fourier-or-Wavelet-based spikformer (FWformer) is proposed and verified on visual classification tasks, including both static-image and event-based video datasets. Compared to the standard spikformer, the FWformer achieves comparable or even higher accuracy (0.4%–1.5%), higher running speed (9%–51% for training and 19%–70% for inference), reduced theoretical energy consumption (20%–25%), and reduced graphics processing unit (GPU) memory usage (4%–26%). Our results indicate that the continued refinement of new Transformers, inspired either by biological discovery (spike-form) or by information theory (Fourier or wavelet transforms), is promising.
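The abstract contrasts the dynamic Query-Key bases of spiking self-attention with fixed Fourier or wavelet bases. As a rough, dense-tensor illustration of that contrast (a minimal NumPy sketch of Fourier token mixing versus dot-product attention mixing; not the authors' spike-form implementation, and the function names are hypothetical):

```python
import numpy as np

def fourier_mixing(x):
    """Mix tokens with a fixed Fourier basis: apply the 2D FFT along
    the token and feature axes and keep the real part. No Query/Key
    projections are learned, so no tokens-by-tokens attention matrix
    is ever materialized.

    x: (tokens, features) activation matrix.
    """
    return np.real(np.fft.fft2(x))

def attention_mixing(x):
    """Vanilla (non-spiking) self-attention with identity projections,
    shown only to contrast its dynamic, input-dependent bases against
    the fixed Fourier bases above."""
    scores = x @ x.T / np.sqrt(x.shape[1])          # (tokens, tokens)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ x
```

Because the Fourier bases are fixed, the mixing costs roughly O(N log N) instead of the O(N^2) of the attention matrix, which is consistent with the speed and memory savings the abstract reports.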

Bibliographic Details
Main Authors: Qingyu Wang, Duzhen Zhang, Xinyuan Cai, Tielin Zhang, Bo Xu
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Neuroscience
Subjects: spiking neural network; transformer; Fourier/Wavelet transform; visual classification; computational efficiency
Online Access: https://www.frontiersin.org/articles/10.3389/fnins.2024.1516868/full
Collection: DOAJ
Record ID: doaj-art-1b87288d6bd247bbbf33061fb3404d81
Institution: Kabale University
ISSN: 1662-453X
DOI: 10.3389/fnins.2024.1516868
Volume: 18
Affiliations:
Qingyu Wang: Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Duzhen Zhang: Institute of Automation, Chinese Academy of Sciences, Beijing, China
Xinyuan Cai: Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tielin Zhang: Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
Bo Xu: Institute of Automation, Chinese Academy of Sciences, Beijing, China
Keywords: spiking neural network; transformer; Fourier/Wavelet transform; visual classification; computational efficiency