ViPeR: Vision-Based Surgical Phase Recognition
| Main Authors: | , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11078244/ |
| Summary: | Surgical phase recognition is a critical yet challenging problem in computer vision, with significant implications for automated surgical training, intraoperative assistance, and workflow optimization. However, the development of robust models is hindered by the scarcity of well-annotated medical datasets and by the complexity of surgical workflows, which exhibit substantial inter- and intra-procedural variation. To address these challenges, we introduce UroSlice, a novel, complex dataset of robot-assisted nephrectomy surgeries covering both radical and partial procedures. For phase recognition in these videos, we propose a novel model named “ViPeR” (Vision-based Surgical Phase Recognition). The model incorporates hierarchical dilated temporal convolution layers and inter-layer residual connections to capture temporal correlations at both fine and coarse granularities. Experimental evaluations show that our approach achieves state-of-the-art performance of 91.7% on Cholec80 and 66.4% on UroSlice, a more challenging dataset due to its irregular phase durations, non-standardized phase order, and smaller sample size. The code and dataset are publicly available at: https://github.com/soumyadeepchandra/ViPeR |
| ISSN: | 2169-3536 |