Managing Timing Uncertainties in Worst-Case Design of Machine Learning Applications

Achieving reliable worst-case timing poses a challenge for modern, high-performance, commercial off-the-shelf hardware platforms deployed for industrial applications. Particularly for safety-critical industrial systems, e.g., robot-human collaboration using convolutional neural networks, timing must...

Bibliographic Details
Main Authors: Robin Hapka, Rolf Ernst
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: High-performance hardware; machine-learning; neural networks; real-time; reliability
Online Access: https://ieeexplore.ieee.org/document/11091311/
author Robin Hapka
Rolf Ernst
collection DOAJ
description Achieving reliable worst-case timing poses a challenge for modern, high-performance, commercial off-the-shelf hardware platforms deployed for industrial applications. Particularly for safety-critical industrial systems, e.g., robot-human collaboration using convolutional neural networks, timing must be considered to operate safely. Although state-of-the-art real-time operating systems and isolation techniques provide predictable timing, they restrict design decisions as many modern hardware platforms are not supported, introducing serious performance penalties. Besides traditional timing considerations, such as the number of cache misses, process variations due to chip manufacturing become more prominent, causing chips from the same model series to exhibit different timing behavior. This circumstance complicates achieving reliable timing on a system level even further. In this work, we present examples of physical variations using most recent hardware platforms, including 12th-generation Intel-based embedded hardware and GPU-based platforms using an Nvidia Jetson AGX Xavier. We elaborate on a potential solution from the avionics domain, called Timing Diversity, which allows for masking unexpected occurrences of worst-case timing behavior by exploiting modular redundancy inherent to safety-critical systems. A key result of our work is that Timing Diversity enables the safe usage of high-performance platforms such as the Nvidia Jetson AGX Xavier, consequently yielding a significant performance boost of nearly 6x.
format Article
id doaj-art-129ba53bf7c54fbc8bd828f74bdf6e80
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-129ba53bf7c54fbc8bd828f74bdf6e80
2025-08-20T03:31:30Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
130941
130952
10.1109/ACCESS.2025.3591986
11091311
Managing Timing Uncertainties in Worst-Case Design of Machine Learning Applications
Robin Hapka (https://orcid.org/0000-0002-5201-3212)
Rolf Ernst (https://orcid.org/0000-0003-2414-9566)
Institute of Computer and Network Engineering, Technische Universität Braunschweig (TU Braunschweig), Braunschweig, Germany
https://ieeexplore.ieee.org/document/11091311/
High-performance hardware
machine-learning
neural networks
real-time
reliability
title Managing Timing Uncertainties in Worst-Case Design of Machine Learning Applications
topic High-performance hardware
machine-learning
neural networks
real-time
reliability
url https://ieeexplore.ieee.org/document/11091311/