Robustifying Routers Against Input Perturbations for Sparse Mixture-of-Experts Vision Transformers

Mixture of experts with a sparse expert selection rule has been gaining much attention recently because of its scalability without compromising inference time. However, unlike standard neural networks, sparse mixture-of-experts models inherently exhibit discontinuities in the output space, which may...

Bibliographic Details
Main Authors: Masahiro Kada, Ryota Yoshihashi, Satoshi Ikehata, Rei Kawakami, Ikuro Sato
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Open Journal of Signal Processing
Online Access: https://ieeexplore.ieee.org/document/10858379/