Visual Prompt Selection Framework for Real-Time Object Detection and Interactive Segmentation in Augmented Reality Applications

This study presents a novel visual prompt selection framework for augmented reality (AR) applications that integrates advanced object detection and image segmentation techniques. The framework is designed to enhance user interactions and improve the accuracy of foreground–background separation in AR...

Bibliographic Details
Main Authors: Eungyeol Song, Doeun Oh, Beom-Seok Oh
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Applied Sciences
Subjects: image segmentation; object detection; user-interactive system; augmented reality
Online Access: https://www.mdpi.com/2076-3417/14/22/10502
author Eungyeol Song
Doeun Oh
Beom-Seok Oh
collection DOAJ
description This study presents a novel visual prompt selection framework for augmented reality (AR) applications that integrates object detection with promptable image segmentation. The framework is designed to enhance user interaction and improve the accuracy of foreground–background separation in AR environments, making AR experiences more immersive and precise. We evaluated six state-of-the-art object detectors (DETR, DINO, CoDETR, YOLOv5, YOLOv8, and YOLO-NAS) in combination with a promptable segmentation model, the Segment Anything Model (SAM), on the DAVIS 2017 validation set. The combination of YOLO-NAS-L and SAM achieved the best performance, with a J&F score of 70%, while DINO-scale4-swin had the lowest score, 57.5%. This gap of 12.5 percentage points highlights the significant contribution of user-provided regions of interest (ROIs) to segmentation outcomes, emphasizing the importance of interactive user input in enhancing accuracy. Our framework supports fast prompt processing and accurate mask generation, allowing users to refine digital overlays interactively, thereby improving both the quality of AR experiences and overall user satisfaction. Additionally, the framework enables the automatic detection of moving objects, providing a more efficient alternative to traditional manual selection interfaces in AR devices. This capability is particularly valuable in dynamic AR scenarios, where seamless user interaction is crucial.
format Article
id doaj-art-b4b6de3bf6d242da869371e818a1db2c
institution OA Journals
issn 2076-3417
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-b4b6de3bf6d242da869371e818a1db2c
2025-08-20T02:26:45Z
eng
MDPI AG
Applied Sciences
2076-3417
2024-11-01
Volume 14, Issue 22, Article 10502
10.3390/app142210502
Visual Prompt Selection Framework for Real-Time Object Detection and Interactive Segmentation in Augmented Reality Applications
Eungyeol Song (0), Doeun Oh (1), Beom-Seok Oh (2)
(0) Department of Applied Artificial Intelligence, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
(1) Research and Development Department, Codevision Inc., Seoul 03722, Republic of Korea
(2) Department of Applied Artificial Intelligence, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
https://www.mdpi.com/2076-3417/14/22/10502
image segmentation
object detection
user-interactive system
augmented reality
title Visual Prompt Selection Framework for Real-Time Object Detection and Interactive Segmentation in Augmented Reality Applications
topic image segmentation
object detection
user-interactive system
augmented reality
url https://www.mdpi.com/2076-3417/14/22/10502
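
The description reports results as J&F scores on DAVIS 2017. In the DAVIS benchmark, J is the region similarity (the Jaccard index, or intersection over union, between predicted and ground-truth masks), F is a boundary accuracy measure, and J&F is their arithmetic mean. The sketch below is illustrative only and is not taken from the article: the function names, the toy masks, and the F value of 0.65 are assumptions, and the boundary measure F is supplied as a number rather than computed.

```python
from typing import Sequence

# A binary mask stored as rows of 0/1 values (a hypothetical toy representation).
Mask = Sequence[Sequence[int]]


def region_similarity(pred: Mask, truth: Mask) -> float:
    """Region similarity J: |pred AND truth| / |pred OR truth| (Jaccard index)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j: float, f: float) -> float:
    """J&F: arithmetic mean of region similarity J and boundary accuracy F."""
    return (j + f) / 2.0


def gap_in_points(best_pct: float, worst_pct: float) -> float:
    """Difference between two percentage scores, in percentage points."""
    return best_pct - worst_pct


# Toy example: 3 pixels overlap, 4 pixels are covered by either mask.
pred = [[0, 1, 1],
        [0, 1, 1]]
truth = [[0, 1, 1],
         [0, 0, 1]]

j = region_similarity(pred, truth)
score = j_and_f(j, 0.65)  # 0.65 is a made-up F value for illustration

# The two J&F scores quoted in the description: 70% and 57.5%.
gap = gap_in_points(70.0, 57.5)
print(f"J = {j:.2f}, J&F = {score:.2f}, gap = {gap:.1f} points")
```

Expressing the difference between 70% and 57.5% in percentage points avoids confusing it with a relative change, which would be a different number.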