Digitization of Medical Device Displays Using Deep Learning Models: A Comparative Study

Bibliographic Details
Main Authors: Pedro Ferreira, Pedro Lobo, Filipa Reis, João L. Vilaça, Pedro Morais
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Applied Sciences
Online Access:https://www.mdpi.com/2076-3417/15/10/5436
Description
Summary: With the growing number of patients living with chronic conditions, there is an increasing need for efficient systems that can automatically capture and convert medical device readings into digital data, particularly in home-based care settings. However, most home-based medical devices are closed systems that do not support straightforward automatic data export and often require complex connections to access or transmit patient information. Since most of these devices display clinical information on a screen, this research explores how a standard smartphone camera, combined with artificial intelligence, can be used to automatically extract the displayed data in a simple and non-intrusive way. In particular, this study provides a comparative analysis of several You Only Look Once (YOLO) and Single Shot MultiBox Detector (SSD) models to evaluate their effectiveness in detecting and recognizing the readings on medical device displays. In addition to these comparisons, we also explore a hybrid approach that combines the YOLOv8l model for object detection with a Convolutional Neural Network (CNN) for classification. Several iterations of the aforementioned models were tested, using image resolutions of 320 × 320 and 640 × 640. The performance was assessed using metrics such as precision, recall, mean average precision at 0.5 Intersection over Union (mAP@50), and frames per second (FPS). The results show that YOLOv8l (640) achieved the highest mAP@50 of 0.979, but at a lower inference speed (13.20 FPS), while YOLOv8n (320) offered the fastest inference (129.79 FPS) with a reduction in mean average precision (0.786). Combining YOLOv8l with a CNN classifier resulted in a slight reduction in overall accuracy (0.96) when compared to the standalone model (0.98). While the results are promising, the study acknowledges certain limitations, including dataset-specific biases, controlled acquisition settings, and challenges in adapting to real-world scenarios.
Nevertheless, the comparative analysis offers valuable insights into the trade-off between inference time and accuracy, helping guide the selection of the most suitable model based on the specific demands of the intended scanning application.
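The mAP@50 metric used throughout the comparison counts a predicted bounding box as a true positive only when its Intersection over Union (IoU) with a ground-truth box reaches 0.5. A minimal sketch of that criterion (the boxes below are hypothetical, not data from the study):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A detected display region scores as correct at mAP@50
# only if IoU with the ground truth is at least 0.5.
pred = (10, 10, 50, 50)   # hypothetical predicted box
truth = (12, 12, 52, 52)  # hypothetical ground-truth box
print(iou(pred, truth) >= 0.5)  # True for this near-match
```

Mean average precision then averages the precision over recall levels (and, here, over classes) under that matching rule, which is why a tighter IoU threshold would lower the reported scores.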
ISSN: 2076-3417