Automating multi-task learning on optical neural networks with weight sharing and physical rotation

Bibliographic Details
Main Authors: Shanglin Zhou, Yingjie Li, Weilu Gao, Cunxi Yu, Caiwen Ding
Format: Article
Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-97262-2
Description
Summary: The democratization of AI encourages multi-task learning (MTL), which demands more parameters and processing time. To achieve highly energy-efficient MTL, Diffractive Optical Neural Networks (DONNs) have garnered attention due to their extremely low energy consumption and high computation speed. However, implementing MTL on DONNs requires manually reconfiguring and replacing layers, and rebuilding and duplicating the physical optical systems. To overcome these challenges, we propose LUMEN-PRO, an automated MTL framework using DONNs. First, we automate MTL given an arbitrary backbone DONN and a set of tasks, resulting in a high-accuracy multi-task DONN model with a small memory footprint that surpasses existing MTL approaches. Second, we leverage the rotability of the physical optical system and replace task-specific layers with rotations of the corresponding shared layers. This replacement eliminates the storage requirement of task-specific layers, further reducing the memory footprint. LUMEN-PRO provides flexibility in identifying optimal sharing patterns across diverse datasets, facilitating the search for highly energy-efficient DONNs. Experiments show that LUMEN-PRO provides up to 49.58% higher accuracy and 4× better cost efficiency than single-task and existing DONN approaches. It achieves the memory lower bound of MTL, with memory efficiency matching that of single-task models. Compared to IBM-TrueNorth, LUMEN-PRO achieves an 8.78× energy efficiency gain, while it matches Nanophotonic in efficiency but surpasses it in per-operator efficiency due to its larger system.
ISSN: 2045-2322
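
Illustrative sketch (not from the article): the summary says that task-specific layers are replaced by rotations of the corresponding shared layers, so that no separate task-specific masks need to be stored. The Python sketch below illustrates that idea only. It assumes each layer is a square phase mask held as a NumPy array, that rotations are multiples of 90 degrees, and that every layer has the same size; the function names `rotated_layer` and `stored_parameters` are hypothetical and do not come from LUMEN-PRO.

```python
import numpy as np


def rotated_layer(shared_mask: np.ndarray, quarter_turns: int) -> np.ndarray:
    """Derive a task-specific view of a shared phase mask by rotating it.

    Assumption: rotation is restricted to multiples of 90 degrees, so no
    interpolation is needed and no new parameters are created.
    """
    return np.rot90(shared_mask, k=quarter_turns)


def stored_parameters(n_layers: int, n_tasks: int, mask_size: int,
                      n_task_specific: int, use_rotation: bool) -> int:
    """Count stored phase parameters for a multi-task model.

    Without rotation, each task-specific layer position keeps one mask per
    task. With rotation, those positions reuse a shared mask, so only
    n_layers masks are stored, the same as a single-task model.
    """
    per_mask = mask_size * mask_size
    if use_rotation:
        return n_layers * per_mask
    shared_positions = n_layers - n_task_specific
    return (shared_positions + n_task_specific * n_tasks) * per_mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 4x4 shared phase mask with phases in [0, 2*pi).
    shared = rng.uniform(0.0, 2.0 * np.pi, size=(4, 4))
    task_b = rotated_layer(shared, quarter_turns=1)
    print(task_b.shape)
    print(stored_parameters(5, 2, 200, 2, use_rotation=False))
    print(stored_parameters(5, 2, 200, 2, use_rotation=True))
```

Under these assumptions, the rotated variant stores exactly as many parameters as a single-task model of the same depth, which is consistent with the memory lower bound described in the summary. The layer counts and mask size in the usage lines are made-up values for illustration.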