Intraoperative Surgical Navigation and Instrument Localization Using a Supervised Learning Transformer Network
Accurate tracking of surgical instruments is critical for the effectiveness of computer-assisted interventions to enhance visual information and reduce tissue injury risks. Previous approaches often modeled instrument trajectories rigidly, failing to capture the dynamic nature of surgical procedures...
| Main Authors: | Junjun Huang, Tianran He, Juan Xu, Weiting Wu, Wei Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
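| Subjects: | Surgical instrument detection; instruments detection; medical imaging; vision transformer; memory network |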
| Online Access: | https://ieeexplore.ieee.org/document/10967357/ |
| _version_ | 1850132098043609088 |
|---|---|
| author | Junjun Huang, Tianran He, Juan Xu, Weiting Wu, Wei Wu |
| author_sort | Junjun Huang |
| collection | DOAJ |
| description | Accurate tracking of surgical instruments is critical to computer-assisted interventions, where it enhances visual information and reduces the risk of tissue injury. Previous approaches often modeled instrument trajectories rigidly and failed to capture the dynamic nature of surgical procedures, particularly when instruments move outside the camera’s field of view or temporarily exit the body. Multi-instrument tracking addresses these challenges by ensuring precise localization and identity preservation of instruments throughout a procedure. However, traditional multi-object tracking (MOT) methods, which decouple detection from re-identification (re-ID), frequently struggle with long-term re-ID under non-Gaussian motion. To mitigate these issues, we propose SurgTrackNet, a unified framework designed for surgical instrument tracking that integrates detection and tracklet association using transformer-based models. SurgTrackNet uses a spatiotemporal memory network that stores previous observations of tracked instruments, improving the system’s ability to maintain consistent instrument identities and predict their trajectories over time. The framework’s instrument sequence creation network uses a transformer encoder-decoder to generate instrument candidates, while a memory decoder integrates these proposals with track vectors to predict instrument locations and classes, substantially reducing the need for extensive post-processing. The memory encoding-decoding mechanism aggregates instrument features and associates them across frames, enabling consistent and robust detection and identity preservation throughout the surgical procedure. SurgTrackNet outperforms existing models, including other transformer-based approaches, achieving 78.7 MOTA and 75.5 IDF1 on the CholecTrack20 dataset and 76.3 MOTA and 78.1 IDF1 on the ATLAS Dione dataset. |
| format | Article |
| id | doaj-art-7b4f3c98f20f446bb535ce0fe5dc4580 |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-7b4f3c98f20f446bb535ce0fe5dc4580; 2025-08-20T02:32:16Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 70548; 70566; 10.1109/ACCESS.2025.3561769; 10967357; Intraoperative Surgical Navigation and Instrument Localization Using a Supervised Learning Transformer Network; Junjun Huang (0, https://orcid.org/0000-0001-6407-3530); Tianran He (1, https://orcid.org/0000-0002-4118-0736); Juan Xu (2); Weiting Wu (3, https://orcid.org/0009-0001-6657-7764); Wei Wu (4, https://orcid.org/0009-0001-3584-4129); Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia; School of Electronic and Information Engineering, Tongji University, Shanghai, China; Ningbo Xinwell Medical Technology Company Ltd., Cixi, Zhejiang, China; Ningbo Xinwell Medical Technology Company Ltd., Cixi, Zhejiang, China; The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China; https://ieeexplore.ieee.org/document/10967357/; Surgical instrument detection; instruments detection; medical imaging; vision transformer; memory network |
| title | Intraoperative Surgical Navigation and Instrument Localization Using a Supervised Learning Transformer Network |
| topic | Surgical instrument detection; instruments detection; medical imaging; vision transformer; memory network |
| url | https://ieeexplore.ieee.org/document/10967357/ |
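
The description reports results as MOTA and IDF1 scores. For readers unfamiliar with these metrics, the following is a minimal sketch of their standard definitions from the multi-object tracking literature, not the authors' evaluation code. The function names and the example counts are hypothetical and chosen only for illustration; the record itself does not give the underlying error counts.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multi-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT.

    num_gt is the total number of ground-truth boxes over all frames.
    The value can be negative when errors outnumber ground-truth boxes.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1: 2*IDTP / (2*IDTP + IDFP + IDFN).

    IDTP, IDFP and IDFN are identity-level true positives, false
    positives and false negatives after matching predicted tracks
    to ground-truth identities.
    """
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        raise ValueError("no detections and no ground truth")
    return 2 * idtp / denominator


# Hypothetical counts, for illustration only (not taken from the paper).
example_mota = mota(false_negatives=10, false_positives=5,
                    id_switches=5, num_gt=100)
example_idf1 = idf1(idtp=75, idfp=25, idfn=25)
```

Both metrics are conventionally reported as percentages, so a value of 0.787 from `mota` corresponds to the "78.7 MOTA" style used in the description. MOTA is dominated by detection errors, whereas IDF1 rewards keeping the same identity on an instrument over time, which is why the two are usually quoted together.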