Kinematic Adaptive Frame Recognition (KAFR): A Novel Framework for Video Segmentation via Frame Similarity and Surgical Tool Tracking

Bibliographic Details
Main Authors: Huu Phong Nguyen, Shekhar Madhav Khairnar, Sofia Garces Palacios, Amr Al-Abbas, Melissa E. Hogg, Amer H. Zureikat, Patricio M. Polanco, Herbert J. Zeh, Ganesh Sankaranarayanan
Format: Article
Language: English
Published: IEEE 2025-01-01
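Series: IEEE Access
Subjects: Adaptive frame recognition; surgical phase segmentation; tool tracking; convolutional neural networks; deep learning
Online Access: https://ieeexplore.ieee.org/document/11014513/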
collection DOAJ
description Interest in using Artificial Intelligence (AI) to automate the analysis of surgical procedures has surged in recent years. Video is one of the primary tools for recording surgical procedures and for subsequent analyses such as performance assessment. However, operative videos tend to be notably longer than videos in other fields, running from thirty minutes to several hours, which makes it difficult for AI models to learn from them effectively. Because the volume of such videos is expected to grow in the near future, new techniques are needed to address this problem. In this article, we propose a novel technique called Kinematic Adaptive Frame Recognition (KAFR) that efficiently eliminates redundant frames to reduce dataset size and computation time while retaining useful frames to improve accuracy. Specifically, we compute the similarity between consecutive frames by tracking the movement of surgical tools. Our approach follows three steps: 1) Tracking phase: a YOLOv8 model detects the tools present in the scene; 2) Similarity phase: similarities between consecutive frames are computed by estimating the variation in the spatial positions and velocities of the tools; 3) Classification phase: an X3D CNN is trained to classify surgical phases. We evaluate the effectiveness of our approach on datasets obtained through retrospective reviews of cases at two referral centers. The newly annotated Gastrojejunostomy (GJ) dataset covers procedures performed between 2017 and 2021, while the previously annotated Pancreaticojejunostomy (PJ) dataset spans 2011 to 2022 at the same centers. In the GJ dataset, each robotic GJ video is segmented into six distinct phases. By adaptively selecting relevant frames, we achieve a tenfold reduction in the number of frames while improving accuracy by a relative 4.32% (from 0.749 to 0.7814) and the F1 score by 0.16%. On the PJ dataset, our approach achieves a fivefold reduction of data and a relative accuracy improvement of 2.05% (from 0.8801 to 0.8982), along with a relative 2.54% increase in F1 score (from 0.8534 to 0.8751). We also compare our approach with state-of-the-art approaches to highlight its competitiveness in terms of performance and efficiency. Although we examined our approach on the GJ and PJ datasets for phase segmentation, it could also be applied to broader, more general surgical datasets. Furthermore, KAFR can supplement existing approaches, enhancing their performance by reducing redundant data while retaining key information, which makes it a valuable addition to other AI models.
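The similarity phase described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the similarity formula, so the function names (`tool_center`, `select_frames`), the single-tool input, the sum of displacement and velocity change as the score, and the `threshold` parameter are all assumptions made for illustration. The sketch takes one bounding box per frame (as a detector such as YOLOv8 might supply) and keeps a frame only when the tool's position or velocity has changed enough since the last kept frame.

```python
import math


def tool_center(box):
    """Return the (x, y) center of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def select_frames(boxes_per_frame, threshold):
    """Return indices of frames kept by a simple kinematic-change test.

    boxes_per_frame: one bounding box per frame for a single tracked tool
        (a hypothetical simplification; real scenes contain several tools).
    threshold: assumed tuning parameter, in pixels.

    A frame is kept when displacement from the last kept frame plus the
    change in velocity relative to the last kept frame exceeds threshold.
    """
    if not boxes_per_frame:
        return []

    kept = [0]  # always keep the first frame as the reference
    ref_center = tool_center(boxes_per_frame[0])
    ref_velocity = (0.0, 0.0)
    prev_center = ref_center

    for i in range(1, len(boxes_per_frame)):
        center = tool_center(boxes_per_frame[i])
        # Velocity is approximated as the per-frame change in position.
        velocity = (center[0] - prev_center[0], center[1] - prev_center[1])

        displacement = math.dist(center, ref_center)
        velocity_change = math.dist(velocity, ref_velocity)

        if displacement + velocity_change > threshold:
            kept.append(i)
            ref_center = center
            ref_velocity = velocity

        prev_center = center

    return kept
```

With a tool that stays still for several frames and then jumps, only the first frame and the frames around the jump are kept, which is the kind of redundancy removal the abstract describes. A larger `threshold` discards more frames; the reduction ratios reported in the abstract come from the authors' own method and data, not from this sketch.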
format Article
id doaj-art-733ea87e464f4cdb8b6fbca1fa6aeb83
institution OA Journals
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-733ea87e464f4cdb8b6fbca1fa6aeb83 2025-08-20T02:36:06Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
Volume 13, pages 101681-101697
10.1109/ACCESS.2025.3573264
11014513
Kinematic Adaptive Frame Recognition (KAFR): A Novel Framework for Video Segmentation via Frame Similarity and Surgical Tool Tracking
Huu Phong Nguyen (https://orcid.org/0000-0002-5022-0226), Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
Shekhar Madhav Khairnar (https://orcid.org/0009-0001-3670-6342), Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
Sofia Garces Palacios (https://orcid.org/0000-0002-2607-8354), Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
Amr Al-Abbas, Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
Melissa E. Hogg, Department of Surgery, NorthShore University HealthSystem, Evanston, IL, USA
Amer H. Zureikat, Department of Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
Patricio M. Polanco, Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
Herbert J. Zeh (https://orcid.org/0000-0003-2017-1934), Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
Ganesh Sankaranarayanan (https://orcid.org/0000-0003-1556-2797), Department of Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
https://ieeexplore.ieee.org/document/11014513/
Adaptive frame recognition
surgical phase segmentation
tool tracking
convolutional neural networks
deep learning
title Kinematic Adaptive Frame Recognition (KAFR): A Novel Framework for Video Segmentation via Frame Similarity and Surgical Tool Tracking
topic Adaptive frame recognition
surgical phase segmentation
tool tracking
convolutional neural networks
deep learning
url https://ieeexplore.ieee.org/document/11014513/