Optimizing Machine Learning Models with Data-level Approximate Computing: The Role of Diverse Sampling, Precision Scaling, Quantization and Feature Selection Strategies

Efficiency, low power consumption, and real-time processing are critical in embedded machine learning implementations, particularly for models deployed in resource-constrained environments that must process large-scale data. This paper investigates approximate computing techniques as a viable way to reduce computational complexity and optimize machine learning models, focusing on two widely used supervised models: k-nearest neighbors (KNN) and support vector machines (SVM). Although many studies compare machine learning classification techniques, the combined use of optimization strategies remains underexplored; in particular, little work combines feature selection, sampling, quantization, precision scaling, and relaxation methods to optimize the acquisition of training and validation data, especially for medical diagnosis datasets. We propose a framework that applies data-level approximate computing techniques, including diverse sampling strategies, precision scaling, quantization, and feature selection, and evaluate their impact on the computational efficiency and accuracy of KNN and SVM models. Experimental results demonstrate that careful application of approximate computing strategies, even in critical applications such as medical diagnosis, can yield considerable efficiency gains while maintaining acceptable accuracy. Applying these methods in combination (selecting 3 features, quantizing the data values to 8 levels, randomly sampling with a 30% reduction, and scaling the precision to 5 bits) reduced computation by 87.5%, memory usage by 76.9%, and delay by 17%, with no degradation in accuracy, as validated by tenfold cross-validation, training-data validation, and full-dataset validation. This study confirms the potential of approximate computing to optimize machine learning workflows, making it particularly suitable for applications with limited computational resources. The source code is publicly available at https://github.com/AyadMDalloo/DatalvlAxC.
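For illustration, the combined recipe reported in the abstract (3 selected features, 8 quantization levels, a 30% random-sampling reduction, and 5-bit precision scaling) can be sketched in a few lines of Python. The sketch below is not the authors' code (that is in the linked GitHub repository): it assumes a scikit-learn workflow, uses the Wisconsin breast-cancer dataset as a stand-in medical dataset, invents the helper names quantize and precision_scale, and interprets "5 bits" as 5 fractional bits of a fixed-point representation.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def quantize(X, levels=8):
    # Map each feature onto `levels` uniformly spaced values between its min and max.
    lo, hi = X.min(axis=0), X.max(axis=0)
    step = np.where(hi > lo, (hi - lo) / (levels - 1), 1.0)
    return np.round((X - lo) / step) * step + lo

def precision_scale(X, bits=5):
    # Truncate values to a fixed-point grid with `bits` fractional bits.
    scale = 2.0 ** bits
    return np.floor(X * scale) / scale

X, y = load_breast_cancer(return_X_y=True)

# Step 1, feature selection: keep the 3 most discriminative features.
X = SelectKBest(f_classif, k=3).fit_transform(X, y)

# Step 2, quantization: restrict each feature to 8 discrete levels.
X = quantize(X, levels=8)

# Step 3, random sampling: discard 30% of the rows.
rng = np.random.default_rng(seed=0)
keep = rng.choice(len(X), size=int(0.7 * len(X)), replace=False)
X, y = X[keep], y[keep]

# Step 4, precision scaling: represent values with 5 fractional bits.
X = precision_scale(X, bits=5)

# Check accuracy of KNN on the approximated data with tenfold cross-validation.
knn = KNeighborsClassifier(n_neighbors=5)
scores = cross_val_score(knn, X, y, cv=10)
print(f"10-fold CV accuracy on approximated data: {scores.mean():.3f}")

The same reduced arrays could be passed to sklearn.svm.SVC to exercise the SVM side of the comparison; the choice of k=5 neighbors is likewise an illustrative assumption.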

Bibliographic Details
Main Authors: Ayad M. Dalloo (Department of Communication Engineering, University of Technology, Baghdad, Iraq; corresponding author), Amjad J. Humaidi (Control and Systems Engineering Department, University of Technology, Baghdad, Iraq)
Format: Article
Language: English
Published: Elsevier, 2024-12-01
Series: Results in Engineering
ISSN: 2590-1230
DOI: 10.1016/j.rineng.2024.103451
Subjects: Machine Learning; Approximate Computing; Sampling, quantization and precision scaling; Critical Applications
Online Access:http://www.sciencedirect.com/science/article/pii/S2590123024017031