Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite
This paper proposes a new benchmark specifically designed for in-sensor digital machine learning computing under ultra-low embedded memory requirements. With the exponential growth of edge devices, efficient local processing is essential to mitigate the economic cost, latency, and privacy concerns...
| Main Authors: | Fabrizio Maria Aymone, Danilo Pietro Pau |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-10-01 |
| Series: | Information |
| Subjects: | edge artificial intelligence; in-sensor machine learning computing; digital signal processing; intelligent signal processing unit; tiny sensors; MLCommons-Tiny working group |
| Online Access: | https://www.mdpi.com/2078-2489/15/11/674 |
| _version_ | 1850267204472274944 |
|---|---|
| author | Fabrizio Maria Aymone; Danilo Pietro Pau |
| author_facet | Fabrizio Maria Aymone; Danilo Pietro Pau |
| author_sort | Fabrizio Maria Aymone |
| collection | DOAJ |
| description | This paper proposes a new benchmark specifically designed for in-sensor digital machine learning computing under ultra-low embedded memory requirements. With the exponential growth of edge devices, efficient local processing is essential to mitigate the economic cost, latency, and privacy concerns associated with centralized cloud processing. Emerging intelligent sensors, which integrate computing assets able to run neural network inference in the same package as the sensing elements, present new challenges due to their limited memory resources and computational capabilities. The benchmark evaluates models trained with Quantization Aware Training (QAT) and compares their performance with Post-Training Quantization (PTQ) across three use cases: Human Activity Recognition (HAR) with the SHL dataset, Physical Activity Monitoring (PAM) with the PAMAP2 dataset, and surface electromyography (sEMG) regression with the NINAPRO DB8 dataset. The results demonstrate the effectiveness of QAT over PTQ in most scenarios, highlighting the potential for deploying advanced AI models on highly resource-constrained sensors. The INT8 versions of the models always outperformed their FP32 counterparts in memory footprint and latency, except for the activation memory of the CNN. The CNN exhibited lower memory usage and latency than its Dense counterpart, allowing it to meet the stringent 8 KiB data RAM and 32 KiB program RAM limits of the intelligent signal processing unit (ISPU). The TCN proved too large to fit within the memory constraints of the ISPU, primarily because of its larger parameter count, chosen for processing more complex signals such as EMG. This benchmark aims to guide the development of efficient AI solutions for In-Sensor Machine Learning Computing, fostering innovation in Edge AI benchmarking efforts such as the one conducted by the MLCommons-Tiny working group. An illustrative code sketch of the PTQ and QAT workflows appears after this record. |
| format | Article |
| id | doaj-art-81966a7de3c94496b76a1b7a89a73b5b |
| institution | OA Journals |
| issn | 2078-2489 |
| language | English |
| publishDate | 2024-10-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Information |
| spelling | doaj-art-81966a7de3c94496b76a1b7a89a73b5b; indexed 2025-08-20T01:53:53Z; eng; MDPI AG; Information; ISSN 2078-2489; 2024-10-01; vol. 15, iss. 11, art. 674; doi:10.3390/info15110674; Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite; Fabrizio Maria Aymone and Danilo Pietro Pau, both at System Research and Applications, STMicroelectronics, Business Center Colleoni, Building Andromeda 3, 7th Floor, Via Cardano 20, 20864 Agrate Brianza, Italy; https://www.mdpi.com/2078-2489/15/11/674; keywords: edge artificial intelligence, in-sensor machine learning computing, digital signal processing, intelligent signal processing unit, tiny sensors, MLCommons-Tiny working group |
| spellingShingle | Fabrizio Maria Aymone; Danilo Pietro Pau; Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite; Information; edge artificial intelligence; in-sensor machine learning computing; digital signal processing; intelligent signal processing unit; tiny sensors; MLCommons-Tiny working group |
| title | Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite |
| title_full | Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite |
| title_fullStr | Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite |
| title_full_unstemmed | Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite |
| title_short | Benchmarking In-Sensor Machine Learning Computing: An Extension to the MLCommons-Tiny Suite |
| title_sort | benchmarking in sensor machine learning computing an extension to the mlcommons tiny suite |
| topic | edge artificial intelligence; in-sensor machine learning computing; digital signal processing; intelligent signal processing unit; tiny sensors; MLCommons-Tiny working group |
| url | https://www.mdpi.com/2078-2489/15/11/674 |
| work_keys_str_mv | AT fabriziomariaaymone benchmarkinginsensormachinelearningcomputinganextensiontothemlcommonstinysuite AT danilopietropau benchmarkinginsensormachinelearningcomputinganextensiontothemlcommonstinysuite |
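The abstract above compares Post-Training Quantization (PTQ) against Quantization Aware Training (QAT) for INT8 deployment under the ISPU's 8 KiB data RAM / 32 KiB program RAM budget. The sketch below illustrates, in broad strokes, how such a comparison can be set up. It assumes a TensorFlow 2.x / Keras 2 toolchain with TensorFlow Lite and the TensorFlow Model Optimization toolkit; the tiny Dense classifier, the random placeholder data, and the feature/class counts are hypothetical illustrations, not the authors' actual models, datasets, or deployment flow.

```python
# Hypothetical sketch: comparing PTQ and QAT for INT8 deployment, in the spirit
# of the benchmark described above. Assumes TF 2.x with Keras 2, which the
# TensorFlow Model Optimization toolkit (tfmot) supports. Model, data, and
# budget constants are illustrative placeholders.
import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

NUM_FEATURES = 36   # placeholder input width (e.g., windowed IMU features)
NUM_CLASSES = 8     # placeholder class count (e.g., HAR activity labels)

# Placeholder training data standing in for SHL/PAMAP2-style windows.
x_train = np.random.rand(512, NUM_FEATURES).astype(np.float32)
y_train = np.random.randint(0, NUM_CLASSES, size=(512,))


def build_model() -> tf.keras.Model:
    """A tiny Dense classifier, sized with in-sensor constraints in mind."""
    return tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu",
                              input_shape=(NUM_FEATURES,)),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


def representative_dataset():
    """Calibration samples required for full-integer conversion."""
    for sample in x_train[:100]:
        yield [sample.reshape(1, -1)]


def to_int8_tflite(model: tf.keras.Model) -> bytes:
    """Convert a Keras model to a fully integer (INT8) TFLite flatbuffer."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()


# --- PTQ: train in FP32, quantize afterwards ---------------------------------
fp32_model = build_model()
fp32_model.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
fp32_model.fit(x_train, y_train, epochs=3, batch_size=32, verbose=0)
ptq_flatbuffer = to_int8_tflite(fp32_model)

# --- QAT: insert fake-quantization nodes, then fine-tune ----------------------
qat_model = tfmot.quantization.keras.quantize_model(build_model())
qat_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
qat_model.fit(x_train, y_train, epochs=3, batch_size=32, verbose=0)
qat_flatbuffer = to_int8_tflite(qat_model)

# Compare flatbuffer sizes against an illustrative 32 KiB program-memory budget.
for name, blob in [("PTQ", ptq_flatbuffer), ("QAT", qat_flatbuffer)]:
    print(f"{name} INT8 model: {len(blob)} bytes "
          f"(fits 32 KiB: {len(blob) <= 32 * 1024})")
```

The difference the benchmark probes is visible in the two paths: PTQ only needs a calibration set after FP32 training, whereas QAT simulates quantization during fine-tuning, which is why it tends to preserve accuracy better on small, aggressively quantized models. The flatbuffer size check is only a rough proxy for the ISPU program-memory limit cited in the abstract; the paper's actual memory and latency figures come from deployment on the target device.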