Random Reflectance: A New Hyperspectral Data Preprocessing Method for Improving the Accuracy of Machine Learning Algorithms

Bibliographic Details
Main Authors: Pavel A. Dmitriev, Anastasiya A. Dmitrieva, Boris L. Kozlovsky
Format: Article
Language: English
Published: MDPI AG 2025-03-01
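Series: AgriEngineering
Subjects: hyperspectral phenotyping; proximal hyperspectral imaging; spectral band; synthetic hyperspectral profile; Acer; random forest
Online Access: https://www.mdpi.com/2624-7402/7/3/90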
collection DOAJ
description Hyperspectral plant phenotyping has a wide range of applications in agriculture, forestry, food processing, medicine and plant breeding. It yields a large amount of spectral and spatial information about an object, but it has inherent limitations, including noise and redundant information. The present study assesses a novel approach to hyperspectral data preprocessing, Random Reflectance (RR), for the classification of plant species. Machine learning (ML) algorithms, specifically Random Forest (RF) and Gradient Boosting (GB), are used to compare the performance of RR with that of Min–Max Normalisation (MMN) and Principal Component Analysis (PCA). Testing was conducted on data from proximal hyperspectral imaging (HSI) of leaves from three maple species, sampled from trees at 7–10-day intervals between 2021 and 2024. With RR, the RF algorithm showed a relative increase in F1-score of 8.8% in 2021, 9.7% in 2022, 11.3% in 2023 and 11.8% in 2024. The GB algorithm showed a similar trend: 6.5% in 2021, 13.2% in 2022, 16.5% in 2023 and 17.4% in 2024. Preprocessing with the MMN and PCA methods did not improve accuracy when classifying species using ML algorithms. The effect of RR preprocessing may be related to the synthesised set of spectral profiles reflecting the general parameters of spectral reflectance more strongly than the set of actual profiles. Future research is expected to establish a mechanistic rationale for the RR method in conjunction with the RF and GB algorithms and to evaluate the method with deep learning algorithms.
id doaj-art-27bbdadd743949fb8b7443f8d6aaf072
institution Kabale University
issn 2624-7402
spelling doaj-art-27bbdadd743949fb8b7443f8d6aaf072 2025-08-20T03:40:42Z
AgriEngineering (MDPI AG, ISSN 2624-7402), vol. 7, no. 3, article 90, 2025-03-01
DOI 10.3390/agriengineering7030090
Affiliation (all three authors): Botanical Garden, Academy of Biology and Biotechnologies, Southern Federal University, Rostov-on-Don 344006, Russia
topic hyperspectral phenotyping
proximal hyperspectral imaging
spectral band
synthetic hyperspectral profile
Acer
random forest
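
The abstract in this record compares preprocessing methods by the relative increase in F1-score. The sketch below illustrates, under stated assumptions, two of the ingredients it names: Min-Max Normalisation of a single spectral profile and a relative F1 increase computed from two macro-averaged F1-scores. It does not implement Random Reflectance, which the record does not describe. The function names, the per-profile scaling, the macro averaging, the choice of baseline and all sample values are hypothetical choices made for illustration, not details taken from the article.

```python
from typing import Sequence


def min_max_normalise(profile: Sequence[float]) -> list[float]:
    """Rescale one spectral profile to the range [0, 1].

    Assumption: scaling is applied per profile; the record does not say
    whether the authors scaled per profile or per band.
    """
    lo, hi = min(profile), max(profile)
    if hi == lo:
        # A flat profile has no contrast to rescale; map it to zeros.
        return [0.0 for _ in profile]
    return [(value - lo) / (hi - lo) for value in profile]


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1-scores.

    Assumption: macro averaging; the record only says "F1-score".
    """
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def relative_increase_percent(baseline: float, improved: float) -> float:
    """Relative change of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical reflectance values for one leaf profile.
    print(min_max_normalise([1.0, 2.0, 3.0]))

    # Hypothetical labels for two placeholder species.
    truth = ["species_a", "species_a", "species_b", "species_b"]
    predicted = ["species_a", "species_b", "species_b", "species_b"]
    print(macro_f1(truth, predicted))

    # Hypothetical scores without and with a preprocessing step.
    print(relative_increase_percent(0.80, 0.88))
```

In practice a library routine such as scikit-learn's `f1_score` would normally replace the hand-written metric; the plain-Python version is used here only to keep the sketch dependency-free.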