Occlusion-Aware Worker Detection in Masonry Work: Performance Evaluation of YOLOv8 and SAMURAI

Bibliographic Details
Main Authors: Seonjun Yoon, Hyunsoo Kim
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/7/3991
Description
Summary: This study evaluates the performance of You Only Look Once version 8 (YOLOv8) and a SAM-based unified and robust zero-shot visual tracker with motion-aware instance-level memory (SAMURAI) for worker detection in masonry construction environments under varying occlusion conditions. Computer vision-based monitoring systems are widely used in construction, but traditional object detection models struggle with occlusion, limiting their effectiveness in real-world applications. The research employed a structured experimental framework to assess both models in brick transportation and brick laying tasks across three occlusion levels: non-occlusion, partial occlusion, and severe occlusion. Results demonstrate that while YOLOv8 processes frames 2.5 to 3.5 times faster (28–32 FPS versus 9–12 FPS), SAMURAI maintains significantly higher detection accuracy, particularly under severe occlusion conditions (92.67% versus 52.67%). YOLOv8’s frame-by-frame processing results in substantial performance degradation as occlusion severity increases, whereas SAMURAI’s memory-based tracking mechanism enables persistent worker identification across frames. This comparative analysis provides valuable insights for selecting appropriate monitoring technologies based on specific construction site requirements. YOLOv8 is suitable for construction environments characterized by minimal occlusions and a high demand for real-time detection, whereas SAMURAI is more applicable to scenarios with frequent and severe occlusions that require the sustained tracking of worker activity. The selection of an appropriate model should be based on an initial assessment of environmental factors such as layout complexity, object density, and expected occlusion frequency. The findings contribute to the advancement of more reliable vision-based monitoring systems for enhancing productivity assessment and safety management in dynamic construction settings.
ISSN: 2076-3417
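
The speed and accuracy trade-off reported in the summary can be worked through in a few lines. The sketch below is illustrative only and is not taken from the article: the FPS ranges and severe-occlusion accuracies are the figures quoted in the summary, while the names `ModelProfile`, `speedup_bounds`, and `choose_model`, and the 0.3 occlusion-frequency threshold, are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ModelProfile:
    name: str
    fps_range: Tuple[float, float]    # (min, max) FPS as quoted in the summary
    severe_occlusion_accuracy: float  # percent, as quoted in the summary


YOLOV8 = ModelProfile("YOLOv8", (28.0, 32.0), 52.67)
SAMURAI = ModelProfile("SAMURAI", (9.0, 12.0), 92.67)


def speedup_bounds(fast: ModelProfile, slow: ModelProfile) -> Tuple[float, float]:
    """Return the smallest and largest FPS ratio implied by the two ranges."""
    lowest = fast.fps_range[0] / slow.fps_range[1]   # slowest fast vs. fastest slow
    highest = fast.fps_range[1] / slow.fps_range[0]  # fastest fast vs. slowest slow
    return lowest, highest


def choose_model(occlusion_frequency: float, threshold: float = 0.3) -> ModelProfile:
    """Pick a model from the expected fraction of frames with occlusion.

    The threshold is an assumption for illustration; the summary only says
    the choice should follow an initial assessment of the site.
    """
    if not 0.0 <= occlusion_frequency <= 1.0:
        raise ValueError("occlusion_frequency must be between 0 and 1")
    return SAMURAI if occlusion_frequency >= threshold else YOLOV8


if __name__ == "__main__":
    low, high = speedup_bounds(YOLOV8, SAMURAI)
    print(f"YOLOv8 speedup over SAMURAI: {low:.2f}x to {high:.2f}x")
    print("Frequent occlusion ->", choose_model(0.6).name)
    print("Rare occlusion     ->", choose_model(0.1).name)
```

Taking the extreme ends of the quoted FPS ranges gives ratios of roughly 2.3 and 3.6, which bracket the 2.5 to 3.5 figure stated in the summary. A real selection rule would also weigh layout complexity and object density, which the summary names but does not quantify.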