Intraoperative Surgical Navigation and Instrument Localization Using a Supervised Learning Transformer Network

Accurate tracking of surgical instruments is critical to computer-assisted interventions, where it enhances visual information and reduces the risk of tissue injury. Previous approaches often modeled instrument trajectories rigidly, failing to capture the dynamic nature of surgical procedures...

Bibliographic Details
Main Authors: Junjun Huang, Tianran He, Juan Xu, Weiting Wu, Wei Wu
Format: Article
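Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Surgical instrument detection; instruments detection; medical imaging; vision transformer; memory network
Online Access: https://ieeexplore.ieee.org/document/10967357/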
_version_ 1850132098043609088
author Junjun Huang
Tianran He
Juan Xu
Weiting Wu
Wei Wu
author_sort Junjun Huang
collection DOAJ
description Accurate tracking of surgical instruments is critical to computer-assisted interventions, where it enhances visual information and reduces the risk of tissue injury. Previous approaches often modeled instrument trajectories rigidly, failing to capture the dynamic nature of surgical procedures, particularly when instruments move outside the camera’s field of view or temporarily exit the body. Multi-instrument tracking addresses these challenges by ensuring precise localization and identity preservation of instruments throughout a procedure. However, traditional multi-object tracking (MOT) methods, which decouple detection from re-identification (re-ID), frequently struggle with long-term re-ID under non-Gaussian motion. To mitigate these issues, we propose SurgTrackNet, a unified framework designed specifically for surgical instrument tracking that integrates detection and tracklet association using transformer-based models. SurgTrackNet leverages a spatiotemporal memory network that stores previous observations of tracked instruments, improving the system’s ability to maintain consistent instrument identities and predict their trajectories over time. The framework’s instrument sequence creation network uses a transformer encoder-decoder to generate instrument candidates, while a memory decoder integrates these proposals with track vectors to predict instrument locations and classes, substantially reducing the need for extensive post-processing. The memory encoding-decoding mechanism aggregates instrument features and associates them across frames, enabling consistent and robust detection and identity preservation throughout the procedure.
SurgTrackNet outperforms existing models, including other transformer-based approaches, achieving 78.7 MOTA and 75.5 IDF1 on the CholecTrack20 dataset and 76.3 MOTA and 78.1 IDF1 on the ATLAS Dione dataset.
format Article
id doaj-art-7b4f3c98f20f446bb535ce0fe5dc4580
institution OA Journals
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-7b4f3c98f20f446bb535ce0fe5dc4580
2025-08-20T02:32:16Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
70548
70566
10.1109/ACCESS.2025.3561769
10967357
Intraoperative Surgical Navigation and Instrument Localization Using a Supervised Learning Transformer Network
Junjun Huang 0 https://orcid.org/0000-0001-6407-3530
Tianran He 1 https://orcid.org/0000-0002-4118-0736
Juan Xu 2
Weiting Wu 3 https://orcid.org/0009-0001-6657-7764
Wei Wu 4 https://orcid.org/0009-0001-3584-4129
Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia
School of Electronic and Information Engineering, Tongji University, Shanghai, China
Ningbo Xinwell Medical Technology Company Ltd., Cixi, Zhejiang, China
Ningbo Xinwell Medical Technology Company Ltd., Cixi, Zhejiang, China
The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China
title Intraoperative Surgical Navigation and Instrument Localization Using a Supervised Learning Transformer Network
topic Surgical instrument detection
instruments detection
medical imaging
vision transformer
memory network
url https://ieeexplore.ieee.org/document/10967357/