Comprehensive Analysis of Neural Network Inference on Embedded Systems: Response Time, Calibration, and Model Optimisation

Bibliographic Details
Main Authors: Patrick Huber, Ulrich Göhner, Mario Trapp, Jonathan Zender, Rabea Lichtenberg
Format: Article
Language: English
Published: MDPI AG 2025-08-01
Series: Sensors
Subjects:
Online Access: https://www.mdpi.com/1424-8220/25/15/4769
Description
Summary: The response time of Artificial Neural Network (ANN) inference is critical in embedded systems that process sensor data close to the source. This is particularly important in applications such as predictive maintenance, which rely on timely state change predictions. This study enables estimation of model response times based on the underlying platform, highlighting the importance of benchmarking generic ANN applications on edge devices. We analyse the impact of network parameters, activation functions, and single- versus multi-threading on response times. Additionally, potential hardware-related influences, such as clock rate variances, are discussed. The results underline the complexity of task partitioning and scheduling strategies, stressing the need for precise parameter coordination to optimise performance across platforms. This study shows that cutting-edge frameworks do not necessarily perform the required operations automatically for all configurations, which may negatively impact performance. This paper further investigates the influence of network structure on model calibration, quantified using the Expected Calibration Error (ECE), and the limits of potential optimisation opportunities. It also examines the effects of model conversion to TensorFlow Lite (TFLite), highlighting the necessity of considering both performance and calibration when deploying models on embedded systems.
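The summary quantifies calibration with the Expected Calibration Error (ECE). As a hedged illustration of the metric itself (a standard binned formulation, not the authors' implementation), ECE partitions predictions into confidence bins and takes the weighted mean gap between each bin's accuracy and its average confidence:

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=10):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|.

    A sketch of the standard formulation; bin count and binning scheme
    (equal-width here) are assumptions, not taken from the article.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = (np.asarray(predictions) == np.asarray(labels)).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Weight each bin by the fraction of samples it contains.
            ece += in_bin.mean() * abs(correct[in_bin].mean()
                                       - confidences[in_bin].mean())
    return ece
```

A perfectly calibrated model (confidence equal to empirical accuracy in every bin) yields an ECE of zero; the article uses this metric to compare calibration before and after TFLite conversion.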
ISSN: 1424-8220