AI-assisted exposure-response data analysis: Quantifying heterogeneous causal effects of exposures on survival times

Bibliographic Details
Main Authors: Louis Anthony Cox, Jr., R. Jeffrey Lewis, Saumitra V. Rege, Shubham Singh
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Global Epidemiology
Online Access: http://www.sciencedirect.com/science/article/pii/S2590113324000452
Collection: DOAJ
Description: AI-assisted data analysis can help risk analysts better understand exposure-response relationships by making it relatively easy to apply advanced statistical and machine learning methods, check their assumptions, and interpret their results. This paper demonstrates the potential of large language models (LLMs), such as ChatGPT, to facilitate statistical analyses, including survival data analyses, for health risk assessments. Through AI-guided analyses using relatively recent and advanced methods such as Individual Conditional Expectation (ICE) plots using Random Survival Forests and Heterogeneous Treatment Effects (HTEs) estimated using Causal Survival Forests, population-level exposure-response functions can be disaggregated into individual-level exposure-response functions. These reveal the extent of heterogeneity in risks across individuals for different levels of exposure, holding other variables fixed. By applying these methods to an illustrative dataset on blood lead levels (BLL) and mortality risk among never-smoker men from the NHANES III survey, we show how AI can clarify inter-individual variations in exposure-associated risks. The results add insights not easily obtained from traditional parametric or semi-parametric models such as logistic regression and Cox proportional hazards models, illustrating the advantages of non-parametric approaches for quantifying heterogeneous causal effects on survival times. This paper also suggests some practical implications of using AI in regulatory health risk assessments and public policy decisions.
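The idea of disaggregating a population-level exposure-response function into individual-level curves can be sketched in a few lines. The snippet below is a minimal illustration of how ICE curves are computed in general, not the authors' workflow: `toy_survival_prob` is a hypothetical stand-in for a fitted survival model (its coefficients are invented), and the two records and the exposure grid are made-up values rather than NHANES III data.

```python
import math

def ice_curves(predict, rows, feature, grid):
    """Return one curve per individual: predictions as `feature` sweeps
    `grid` while that individual's other covariates stay as observed."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = dict(row)        # copy so the original record is untouched
            modified[feature] = value   # set exposure to the grid value
            curve.append(predict(modified))
        curves.append(curve)
    return curves

def toy_survival_prob(row, horizon=10.0):
    # Hypothetical model (assumption, not from the paper): constant hazard
    # that rises with age and with exposure, with an age-exposure interaction.
    hazard = 0.01 * math.exp(0.03 * (row["age"] - 50)
                             + 0.02 * row["bll"] * (row["age"] / 50))
    return math.exp(-hazard * horizon)  # probability of surviving past `horizon`

# Made-up individuals and exposure grid, for illustration only.
people = [{"age": 40, "bll": 2.0}, {"age": 70, "bll": 5.0}]
grid = [1.0, 5.0, 10.0]

curves = ice_curves(toy_survival_prob, people, "bll", grid)

# Averaging the ICE curves pointwise gives the population-level
# (partial dependence) curve that the individual curves disaggregate.
pd_curve = [sum(c[i] for c in curves) / len(curves) for i in range(len(grid))]
```

In this toy setup the curve for the older individual falls more steeply than the curve for the younger one, which is the kind of inter-individual heterogeneity that an averaged population-level curve hides.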
Record ID: doaj-art-af2663a2d2084d92a0e4f317216373e0
Institution: Kabale University
ISSN: 2590-1133
Citation: Global Epidemiology, volume 9, article 100179 (2025-06-01)
Author affiliations:
Louis Anthony Cox, Jr.: Cox Associates, Entanglement, and University of Colorado, 503 N. Franklin Street, Denver, Colorado, 80218, USA (corresponding author)
R. Jeffrey Lewis: Kelly Services, Epidemiology Contractor (retired ExxonMobil Biomedical Sciences, Inc.), Lavallette, New Jersey, USA
Saumitra V. Rege: Epidemiology, ExxonMobil Biomedical Sciences, Inc., 1545 U.S. Highway 22 East, Annandale, NJ 08801-3059, USA
Shubham Singh: Business Analytics (BANA) Program, Business School, University of Colorado, 1475 Lawrence St., Denver, CO 80217-3364, USA
Subjects: AI-assisted data analysis; ICE plots; Exposure-response modeling; Survival trees; Random Survival Forest; Causal Survival Forest