Efficient and accurate determination of the degree of substitution of cellulose acetate using ATR-FTIR spectroscopy and machine learning

Abstract Multiple linear regression models were trained to predict the degree of substitution (DS) of cellulose acetate based on raw infrared (IR) spectroscopic data. A repeated k-fold cross validation ensured unbiased assessment of model accuracy. Using the DS obtained from 1H NMR data as reference...

Bibliographic Details
Main Authors: Frank Rhein, Timo Sehn, Michael A. R. Meier
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Subjects:
Online Access: https://doi.org/10.1038/s41598-025-86378-0
collection DOAJ
description Abstract Multiple linear regression models were trained to predict the degree of substitution (DS) of cellulose acetate from raw infrared (IR) spectroscopic data. Repeated k-fold cross-validation ensured an unbiased assessment of model accuracy. With the DS obtained from ¹H NMR data as the reference, the machine learning model achieved a mean absolute error (MAE) of 0.069 in DS on test data, which is more accurate than the manual evaluation based on peak integration. Limiting the model to physically relevant areas unexpectedly showed the C-H peak to be the strongest predictor of DS. Applying an n-best feature selection algorithm based on the F-statistic of the Pearson correlation coefficient identified several relevant areas, and the optimized model achieved an improved MAE of 0.052. Predicting the DS of other cellulose acetate data sets yielded similar accuracy, demonstrating that the developed models are robust and suitable for efficient and accurate routine evaluations. The model trained solely on cellulose acetate was further able to predict the DS of other cellulose esters with an accuracy of approximately 0.1 to 0.2 in DS, and model architectures for a more general analysis of cellulose esters were proposed.
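The abstract names three building blocks: n-best feature selection by the F-statistic of the Pearson correlation coefficient, repeated k-fold cross-validation, and the mean absolute error. The following is a minimal sketch of those ideas in plain Python, not the authors' implementation. All function names, the synthetic "spectra", and the parameter values are assumptions made for illustration. The F-statistic uses the standard single-predictor form F = r² / (1 - r²) · (n - 2).

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant column carries no information
    return sxy / math.sqrt(sxx * syy)


def f_statistic(x, y):
    """F-statistic of a single predictor, derived from Pearson's r."""
    r2 = min(pearson_r(x, y) ** 2, 1.0 - 1e-12)  # guard against division by zero
    return r2 / (1.0 - r2) * (len(x) - 2)


def select_n_best(spectra, ds, n_best):
    """Return the (sorted) column indices with the n_best highest F-statistics."""
    n_features = len(spectra[0])
    scores = [f_statistic([row[j] for row in spectra], ds) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_best])


def repeated_k_fold(n_samples, k, repeats, seed=0):
    """Yield (train, test) index lists; each repeat reshuffles before splitting."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for fold in range(k):
            test = indices[fold::k]
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield train, test


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Synthetic stand-in for spectra (assumption): 40 samples, 6 "wavenumber" columns,
# of which columns 2 and 4 depend linearly on the hypothetical DS values.
rng = random.Random(0)
ds_values = [rng.uniform(0.5, 3.0) for _ in range(40)]
spectra = []
for ds in ds_values:
    row = [rng.gauss(0.0, 1.0) for _ in range(6)]
    row[2] = 0.8 * ds + rng.gauss(0.0, 0.05)
    row[4] = -0.5 * ds + rng.gauss(0.0, 0.05)
    spectra.append(row)

selected = select_n_best(spectra, ds_values, n_best=2)
splits = list(repeated_k_fold(len(ds_values), k=5, repeats=3))
```

In a real evaluation, the feature selection would be fitted on the training indices of each split only, so that the test fold does not leak into the choice of features.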
id doaj-art-a96c0b4b7d2243c79656a8d248f314e7
institution Kabale University
issn 2045-2322
spelling doaj-art-a96c0b4b7d2243c79656a8d248f314e7
indexed 2025-01-26T12:27:50Z
volume 15, issue 1, pages 1-11
affiliation Frank Rhein: Institute of Mechanical Process Engineering and Mechanics (MVM), Karlsruhe Institute of Technology (KIT)
affiliation Timo Sehn: Institute of Biological and Chemical Systems – Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology (KIT)
affiliation Michael A. R. Meier: Institute of Biological and Chemical Systems – Functional Molecular Systems (IBCS-FMS), Karlsruhe Institute of Technology (KIT)
topic Machine learning
Degree of substitution
Infrared spectroscopy
Cellulose ester
Cellulose acetate