Optimizing Machine Learning Models with Data-level Approximate Computing: The Role of Diverse Sampling, Precision Scaling, Quantization and Feature Selection Strategies
| Main Authors: | Ayad M. Dalloo, Amjad J. Humaidi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-12-01 |
| Series: | Results in Engineering |
| Subjects: | Machine Learning; Approximate Computing; Sampling, quantization and precision scaling; Critical Applications |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2590123024017031 |
| author | Ayad M. Dalloo; Amjad J. Humaidi |
|---|---|
| collection | DOAJ |
| description | Efficiency, low power consumption, and real-time processing are critical in embedded machine learning, particularly for models deployed in large-scale, resource-constrained data-processing environments. This paper investigates approximate computing techniques as a viable way to reduce computational complexity and optimize machine learning models, focusing on two widely used supervised models: k-nearest neighbors (KNN) and support vector machines (SVM). Although many studies compare machine learning classification techniques, the combined use of optimization strategies remains underexplored; in particular, jointly applying feature selection, sampling, quantization, precision scaling, and relaxation methods to optimize the acquisition of training and validation data has received little attention, especially for medical diagnosis datasets. We propose a framework that uses data-level approximate computing techniques, including diverse sampling strategies, precision scaling, quantization, and feature selection, and evaluate their impact on the computational efficiency and accuracy of KNN and SVM models. Experimental results demonstrate that careful application of approximate computing strategies, even in critical applications such as medical diagnosis, can yield considerable efficiency gains while maintaining acceptable accuracy. Combining these methods (selecting 3 features, quantizing the data values to 8 levels, applying random sampling with a 30% reduction, and scaling the precision to 5 bits) reduced computation by 87.5%, memory usage by 76.9%, and delay by 17%, without any degradation in accuracy, as validated by tenfold cross-validation, training-data validation, and full-dataset validation. This study confirms the potential of approximate computing to optimize machine learning workflows, making it particularly suitable for applications with limited computational resources. The source code is publicly available at https://github.com/AyadMDalloo/DatalvlAxC; a minimal illustrative sketch of the pipeline appears after the record below. |
| format | Article |
| id | doaj-art-b67ca900544e4ec3b0db0a12e90e201d |
| institution | OA Journals |
| issn | 2590-1230 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Results in Engineering |
| doi | 10.1016/j.rineng.2024.103451 |
| citation | Results in Engineering, vol. 24, article 103451, 2024-12-01 |
| affiliations | Ayad M. Dalloo: Department of Communication Engineering, University of Technology, Baghdad, Iraq (corresponding author); Amjad J. Humaidi: Control and Systems Engineering Department, University of Technology, Baghdad, Iraq |
| title | Optimizing Machine Learning Models with Data-level Approximate Computing: The Role of Diverse Sampling, Precision Scaling, Quantization and Feature Selection Strategies |
| topic | Machine Learning; Approximate Computing; Sampling, quantization and precision scaling; Critical Applications |
| url | http://www.sciencedirect.com/science/article/pii/S2590123024017031 |
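The best configuration reported in the abstract chains four data-level approximations before training: select 3 features, quantize to 8 levels, randomly drop 30% of the samples, and scale values to 5-bit precision. The sketch below is a minimal illustration of that pipeline, not the authors' implementation (which is at https://github.com/AyadMDalloo/DatalvlAxC). It assumes scikit-learn; `load_breast_cancer` is a stand-in medical dataset, the `quantize` and `precision_scale` helpers are illustrative, and reading "5-bit precision scaling" as 5 fractional bits is our assumption.

```python
# Minimal sketch (not the authors' code) of the data-level approximate
# computing pipeline reported in the abstract: 3 selected features,
# 8 quantization levels, a 30% random-sampling reduction, and 5-bit
# precision scaling, evaluated with tenfold cross-validation on KNN.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def quantize(X, levels=8):
    """Snap each feature to `levels` evenly spaced values over its range."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)        # guard against flat features
    q = np.round((X - lo) / span * (levels - 1))  # integer level 0..levels-1
    return lo + q / (levels - 1) * span


def precision_scale(X, bits=5):
    """Round values to `bits` fractional bits (one reading of '5-bit scaling')."""
    scale = 2.0 ** bits
    return np.round(X * scale) / scale


rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)  # stand-in medical dataset

# 1) Feature selection: keep the 3 most informative features.
X = SelectKBest(f_classif, k=3).fit_transform(X, y)

# 2) Quantization: map values onto 8 discrete levels per feature.
X = quantize(X, levels=8)

# 3) Random sampling: keep 70% of the rows (a 30% reduction).
keep = rng.choice(len(X), size=int(0.7 * len(X)), replace=False)
X, y = X[keep], y[keep]

# 4) Precision scaling: round to 5-bit fractional precision.
X = precision_scale(X, bits=5)

# 5) Tenfold cross-validation on the approximated data.
acc = cross_val_score(KNeighborsClassifier(), X, y, cv=10).mean()
print(f"10-fold CV accuracy on approximated data: {acc:.3f}")
```

Varying `levels`, the sampling ratio, and `bits` in this sketch is how one would explore the accuracy/efficiency trade-off the abstract describes; the same preprocessing steps apply unchanged to an SVM classifier.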