Explainable artificial intelligence driven insights into smoking prediction using machine learning and clinical parameters

Bibliographic Details
Main Authors: S. Aishwarya, P. C. Siddalingaswamy, Krishnaraj Chadaga
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-09409-w
author S. Aishwarya
P. C. Siddalingaswamy
Krishnaraj Chadaga
collection DOAJ
description Abstract Smoking is a leading cause of various health conditions, including cancer and respiratory diseases. Smokers often face medical restrictions such as limitations in blood and organ donation, reduced effectiveness of medications, and increased surgical complications. These impacts underscore the need for early detection of smoking status to enable timely intervention. This study explores the use of Artificial Intelligence (AI) and Machine Learning (ML) techniques to predict smoking status based on health parameters, including biosignals and clinical biomarkers. A balanced subset of 2,000 instances was sampled from a publicly available Kaggle dataset comprising clinical and biometric features. Multiple ML models were implemented, including Random Forest Classifier, Logistic Regression, Decision Tree Classifier, K-Nearest Neighbors, CatBoost Classifier, and an Artificial Neural Network. The Random Forest Classifier achieved the best performance, with an accuracy of 0.80, precision of 0.80, recall of 0.80, and F1-score of 0.79. To enhance model interpretability, four Explainable Artificial Intelligence (XAI) techniques were applied: Shapley Additive Explanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), QLattice, and Anchor. SHAP identified hemoglobin as the most influential predictor, while LIME, QLattice, and Anchor highlighted the role of gamma-glutamyl transferase (GTP). Interactions between hemoglobin, GTP, and height were associated with more accurate predictions. The integration of ensemble modeling and multiple XAI approaches offers deeper interpretability than prior studies, providing healthcare providers and policymakers with a robust, transparent decision-support tool for targeted intervention strategies.
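The sampling and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the helper names `balanced_subset` and `macro_metrics`, the synthetic label counts, and the choice of macro averaging are all assumptions made for illustration, since the record does not say how the subset was drawn or how the reported scores were averaged.

```python
import random
from collections import Counter


def balanced_subset(labels, n_total, seed=0):
    """Return sorted indices of a class-balanced random sample of size n_total.

    Hypothetical helper: draws n_total // n_classes indices per class.
    random.sample raises ValueError if a class has too few members.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    per_class = n_total // len(classes)
    chosen = []
    for c in classes:
        members = [i for i, y in enumerate(labels) if y == c]
        chosen.extend(rng.sample(members, per_class))
    return sorted(chosen)


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Synthetic, imbalanced labels (hypothetical counts; 1 = smoker, 0 = non-smoker).
labels = [0] * 3000 + [1] * 1500
subset = balanced_subset(labels, n_total=2000)
print(Counter(labels[i] for i in subset))

# Toy predictions that only exercise the metric code (not results from the study).
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
print(macro_metrics(toy_true, toy_pred))
```

Because F1 is computed per class before averaging, an averaged F1 can come out slightly below the averaged precision and recall, which is one possible reading of the 0.80 / 0.80 / 0.79 figures in the abstract.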
format Article
id doaj-art-1b4ae6e15c974a3b87e8aa19f7febd58
institution Kabale University
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-1b4ae6e15c974a3b87e8aa19f7febd58 | 2025-08-20T04:01:25Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-07-01 | 151123 | 10.1038/s41598-025-09409-w | Explainable artificial intelligence driven insights into smoking prediction using machine learning and clinical parameters | S. Aishwarya, P. C. Siddalingaswamy, Krishnaraj Chadaga (all: Manipal Institute of Technology, Manipal Academy of Higher Education) | https://doi.org/10.1038/s41598-025-09409-w | Smokers detection; Machine learning; Artificial intelligence; XAI; Health parameters
topic Smokers detection
Machine learning
Artificial intelligence
XAI
Health parameters
url https://doi.org/10.1038/s41598-025-09409-w