Efficient and accurate determination of the degree of substitution of cellulose acetate using ATR-FTIR spectroscopy and machine learning
Abstract Multiple linear regression models were trained to predict the degree of substitution (DS) of cellulose acetate based on raw infrared (IR) spectroscopic data. A repeated k-fold cross validation ensured unbiased assessment of model accuracy. Using the DS obtained from 1H NMR data as reference...
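The workflow named in the abstract (n-best feature selection by the F-statistic of the Pearson correlation, multiple linear regression, repeated k-fold cross validation scored by MAE) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the synthetic stand-in "spectra", the fold counts, and the choice of two selected features are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import random
from statistics import mean

def pearson_f(x, y):
    """F-statistic of the Pearson correlation between one feature and the target."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)  # clamp to avoid division by zero
    return r2 / (1.0 - r2) * (n - 2)

def n_best(X, y, n):
    """Indices of the n columns of X with the largest F-statistic."""
    scores = [pearson_f([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:n]

def fit_linear(X, y):
    """Least-squares fit via the normal equations; returns [intercept, coefficients...]."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    # Augmented system (A^T A | A^T y)
    M = [[sum(r[i] * r[j] for r in A) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], row)) for row in X]

def repeated_kfold_mae(X, y, n_features, k=5, repeats=3, seed=0):
    """Mean test MAE over `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        for f in range(k):
            test = idx[f::k]
            held = set(test)
            train = [i for i in idx if i not in held]
            Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
            # Select features on the training fold only, so test data stay unseen
            cols = n_best(Xtr, ytr, n_features)
            beta = fit_linear([[r[j] for j in cols] for r in Xtr], ytr)
            pred = predict(beta, [[X[i][j] for j in cols] for i in test])
            errors.append(mean(abs(p - y[i]) for p, i in zip(pred, test)))
    return mean(errors)

# Synthetic stand-in data (hypothetical): 60 "spectra" with 6 intensity features,
# of which only columns 1 and 4 carry information about the target "DS".
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(6)] for _ in range(60)]
y = [1.0 + r[1] + r[4] + data_rng.gauss(0, 0.02) for r in X]

mae = repeated_kfold_mae(X, y, n_features=2)
print(f"cross-validated MAE: {mae:.3f}")
```

With scikit-learn, the same steps correspond roughly to `SelectKBest(f_regression)`, `LinearRegression`, and `RepeatedKFold`; whether the authors used that library is not stated in this record.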
Main Authors: | Frank Rhein, Timo Sehn, Michael A. R. Meier |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Reports |
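Subjects: | Machine learning; Degree of substitution; Infrared spectroscopy; Cellulose ester; Cellulose acetate |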
Online Access: | https://doi.org/10.1038/s41598-025-86378-0 |
_version_ | 1832585863997947904 |
---|---|
author | Frank Rhein, Timo Sehn, Michael A. R. Meier |
collection | DOAJ |
description | Abstract Multiple linear regression models were trained to predict the degree of substitution (DS) of cellulose acetate based on raw infrared (IR) spectroscopic data. A repeated k-fold cross validation ensured unbiased assessment of model accuracy. Using the DS obtained from 1H NMR data as reference, the machine learning model achieved a mean absolute error (MAE) of 0.069 in DS on test data, demonstrating higher accuracy compared to the manual evaluation based on peak integration. Limiting the model to physically relevant areas unexpectedly showed the C–H peak to be the strongest predictor of DS. By applying an n-best feature selection algorithm based on the F-statistic of the Pearson correlation coefficient, several relevant areas were identified and the optimized model achieved an improved MAE of 0.052. Predicting the DS of other cellulose acetate data sets yielded similar accuracy, demonstrating that the developed models are robust and suitable for efficient and accurate routine evaluations. The model solely trained on cellulose acetate was further able to predict the DS of other cellulose esters with an accuracy of ≈ 0.1–0.2 in DS, and model architectures for a more general analysis of cellulose esters were proposed. |
format | Article |
id | doaj-art-a96c0b4b7d2243c79656a8d248f314e7 |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-a96c0b4b7d2243c79656a8d248f314e7; 2025-01-26T12:27:50Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 15; 1; 1; 11; 10.1038/s41598-025-86378-0; Efficient and accurate determination of the degree of substitution of cellulose acetate using ATR-FTIR spectroscopy and machine learning; Frank Rhein [0]; Timo Sehn [1]; Michael A. R. Meier [2]; [0] Institute of Mechanical Process Engineering and Mechanics (MVM), Karlsruhe Institute of Technology (KIT); [1] Institute of Biological and Chemical Systems – Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology (KIT); [2] Institute of Biological and Chemical Systems – Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology (KIT); Abstract Multiple linear regression models were trained to predict the degree of substitution (DS) of cellulose acetate based on raw infrared (IR) spectroscopic data. A repeated k-fold cross validation ensured unbiased assessment of model accuracy. Using the DS obtained from 1H NMR data as reference, the machine learning model achieved a mean absolute error (MAE) of 0.069 in DS on test data, demonstrating higher accuracy compared to the manual evaluation based on peak integration. Limiting the model to physically relevant areas unexpectedly showed the C–H peak to be the strongest predictor of DS. By applying an n-best feature selection algorithm based on the F-statistic of the Pearson correlation coefficient, several relevant areas were identified and the optimized model achieved an improved MAE of 0.052. Predicting the DS of other cellulose acetate data sets yielded similar accuracy, demonstrating that the developed models are robust and suitable for efficient and accurate routine evaluations. The model solely trained on cellulose acetate was further able to predict the DS of other cellulose esters with an accuracy of ≈ 0.1–0.2 in DS, and model architectures for a more general analysis of cellulose esters were proposed.; https://doi.org/10.1038/s41598-025-86378-0; Machine learning; Degree of substitution; Infrared spectroscopy; Cellulose ester; Cellulose acetate |
title | Efficient and accurate determination of the degree of substitution of cellulose acetate using ATR-FTIR spectroscopy and machine learning |
topic | Machine learning; Degree of substitution; Infrared spectroscopy; Cellulose ester; Cellulose acetate |
url | https://doi.org/10.1038/s41598-025-86378-0 |