Intraoperative Surgical Navigation and Instrument Localization Using a Supervised Learning Transformer Network
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10967357/ |
| Summary: | Accurate tracking of surgical instruments is critical for computer-assisted interventions to enhance visual information and reduce the risk of tissue injury. Previous approaches often modeled instrument trajectories rigidly, failing to capture the dynamic nature of surgical procedures, particularly when instruments move outside the camera’s field of view or temporarily exit the body. Multi-instrument tracking addresses these challenges by ensuring precise localization and identity preservation of instruments throughout a procedure. However, traditional multi-object tracking (MOT) methods, which decouple detection from re-identification (re-ID), frequently struggle with long-term re-ID under non-Gaussian motion. To mitigate these issues, we propose SurgTrackNet, a unified framework designed specifically for surgical instrument tracking that integrates detection and tracklet association using transformer-based models. SurgTrackNet leverages a spatiotemporal memory network that stores previous observations of tracked surgical instruments, improving the system’s ability to maintain consistent instrument identities and accurately predict their trajectories over time. The framework’s instrument sequence creation network uses a transformer encoder-decoder to generate instrument candidates, while a memory decoder integrates these proposals with track vectors to predict instrument locations and classes, significantly reducing the need for extensive post-processing. The memory encoding-decoding mechanism aggregates instrument features and associates them across frames, enabling consistent and robust detection and identity preservation throughout the surgical procedure. SurgTrackNet outperforms existing models, including other transformer-based approaches, achieving 78.7 MOTA and 75.5 IDF1 on the CholecTrack20 dataset and 76.3 MOTA and 78.1 IDF1 on the ATLAS Dione dataset. |
| ISSN: | 2169-3536 |
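
The summary reports results as MOTA and IDF1 scores. As a reading aid, the sketch below computes both metrics from aggregate counts using their standard definitions from the multi-object tracking literature. It is not taken from the article; the function names and the example counts are hypothetical, and the functions return fractions in [0, 1] (for MOTA, possibly negative) rather than the 0-100 scale used in the summary.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy.

    MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over
    every frame of the sequence. GT is the number of ground-truth boxes.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1 score.

    IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN), where the identity
    true/false positives and false negatives come from the best global
    matching between predicted and ground-truth trajectories.
    """
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        raise ValueError("at least one count must be non-zero")
    return 2 * idtp / denominator


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not results from the article.
    print(f"MOTA: {100 * mota(100, 80, 20, 1000):.1f}")
    print(f"IDF1: {100 * idf1(750, 250, 250):.1f}")
```

MOTA penalizes detection errors and identity switches together, while IDF1 measures how consistently each instrument keeps the same identity over time, which is why the two scores are usually reported side by side.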