Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage

This paper addresses the challenge of missing data in scientific research. It specifically examines the case of missing data arising from a “catch-all” missing not at random (MNAR) mechanism, where missing values are disproportionately from one category, such as income or ethnicity in surveys. The stud...

Bibliographic Details
Main Authors: Fares Qeadan, William A. Barbeau
Format: Article
Language: English
Published: Taylor & Francis 2025-12-01
Series: Research in Statistics
Subjects:
Online Access: https://www.tandfonline.com/doi/10.1080/27684520.2025.2451682
author Fares Qeadan
William A. Barbeau
collection DOAJ
description This paper addresses the challenge of missing data in scientific research. It specifically examines the case of missing data arising from a “catch-all” missing not at random (MNAR) mechanism, where missing values are disproportionately from one category, such as income or ethnicity in surveys. The study introduces the use of the regression diagnostic DFBETAS along with Leverage to improve the imputation of categorical data under such conditions. DFBETAS, a measure of influence in regression, is adapted to capture the intrinsic information of missing values, thereby enhancing the imputation process within a Bayesian multiple imputation (MI) framework. We validate the proposed approach through Monte Carlo simulations with data-generating mechanisms based on probability distributions. The results show that incorporating DFBETAS and Leverage significantly improves the accuracy of imputations, optimizes the balance between sensitivity and specificity, reduces bias, and enhances confidence interval coverage of imputed estimates, especially as the strength of the catch-all mechanism increases. The study demonstrates that MI with DFBETAS and Leverage outperforms standard MI methods, offering a robust solution for handling categorical data with catch-all MNAR mechanisms. This advancement in imputation methodology provides a more accurate and efficient means of dealing with missing data in various research fields.
format Article
id doaj-art-fb47d9ade7b643f894cb697ce22d0243
institution Kabale University
issn 2768-4520
language English
publishDate 2025-12-01
publisher Taylor & Francis
record_format Article
series Research in Statistics
spelling doaj-art-fb47d9ade7b643f894cb697ce22d0243; 2025-02-03T15:33:30Z; eng; Taylor & Francis; Research in Statistics; 2768-4520; 2025-12-01; 3; 1; 10.1080/27684520.2025.2451682
Fares Qeadan, Parkinson School of Health Sciences and Public Health, Loyola University Chicago, Maywood, IL, USA
William A. Barbeau, Parkinson School of Health Sciences and Public Health, Loyola University Chicago, Maywood, IL, USA
title Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage
topic Missing data
multiple imputation
DFBETAS
catch-all mechanism
MNAR (Missing Not at Random)
url https://www.tandfonline.com/doi/10.1080/27684520.2025.2451682