Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage
This paper addresses the challenge of missing data in scientific research. It specifically examines the case of missing data arising from a “catch-all” missing not at random (MNAR) mechanism, where missing values are disproportionately from one category, such as income or ethnicity in surveys. The stud...
Main Authors: | Fares Qeadan, William A. Barbeau |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis 2025-12-01 |
Series: | Research in Statistics |
Subjects: | Missing data; multiple imputation; DFBETAS; catch-all mechanism; MNAR (Missing Not at Random) |
Online Access: | https://www.tandfonline.com/doi/10.1080/27684520.2025.2451682 |
_version_ | 1832542867045744640 |
---|---|
author | Fares Qeadan William A. Barbeau |
author_facet | Fares Qeadan William A. Barbeau |
author_sort | Fares Qeadan |
collection | DOAJ |
description | This paper addresses the challenge of missing data in scientific research. It specifically examines the case of missing data arising from a “catch-all” missing not at random (MNAR) mechanism, where missing values are disproportionately from one category, such as income or ethnicity in surveys. The study introduces the use of the regression diagnostic DFBETAS along with Leverage to improve the imputation of categorical data under such conditions. DFBETAS, a measure of influence in regression, is adapted to capture the intrinsic information of missing values, thereby enhancing the imputation process within a Bayesian multiple imputation (MI) framework. We validate the proposed approach through Monte Carlo simulations with data-generating mechanisms based on probability distributions. The results show that incorporating DFBETAS and Leverage significantly improves the accuracy of imputations, optimizes the balance between sensitivity and specificity, reduces bias, and enhances confidence interval coverage of imputed estimates, especially as the strength of the catch-all mechanism increases. The study demonstrates that MI with DFBETAS and Leverage outperforms standard MI methods, offering a robust solution for handling categorical data with catch-all MNAR mechanisms. This advancement in imputation methodology provides a more accurate and efficient means of dealing with missing data in various research fields. |
format | Article |
id | doaj-art-fb47d9ade7b643f894cb697ce22d0243 |
institution | Kabale University |
issn | 2768-4520 |
language | English |
publishDate | 2025-12-01 |
publisher | Taylor & Francis |
record_format | Article |
series | Research in Statistics |
spelling | doaj-art-fb47d9ade7b643f894cb697ce22d0243; 2025-02-03T15:33:30Z; eng; Taylor & Francis; Research in Statistics; 2768-4520; 2025-12-01; 31; 10.1080/27684520.2025.2451682; Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage; Fares Qeadan (0); William A. Barbeau (1); Parkinson School of Health Sciences and Public Health, Loyola University Chicago, Maywood, IL, USA; Parkinson School of Health Sciences and Public Health, Loyola University Chicago, Maywood, IL, USA; This paper addresses the challenge of missing data in scientific research. It specifically examines the case of missing data arising from a “catch-all” missing not at random (MNAR) mechanism, where missing values are disproportionately from one category, such as income or ethnicity in surveys. The study introduces the use of the regression diagnostic DFBETAS along with Leverage to improve the imputation of categorical data under such conditions. DFBETAS, a measure of influence in regression, is adapted to capture the intrinsic information of missing values, thereby enhancing the imputation process within a Bayesian multiple imputation (MI) framework. We validate the proposed approach through Monte Carlo simulations with data-generating mechanisms based on probability distributions. The results show that incorporating DFBETAS and Leverage significantly improves the accuracy of imputations, optimizes the balance between sensitivity and specificity, reduces bias, and enhances confidence interval coverage of imputed estimates, especially as the strength of the catch-all mechanism increases. The study demonstrates that MI with DFBETAS and Leverage outperforms standard MI methods, offering a robust solution for handling categorical data with catch-all MNAR mechanisms. This advancement in imputation methodology provides a more accurate and efficient means of dealing with missing data in various research fields.; https://www.tandfonline.com/doi/10.1080/27684520.2025.2451682; Missing data; multiple imputation; DFBETAS; catch-all mechanism; MNAR (Missing Not at Random) |
spellingShingle | Fares Qeadan William A. Barbeau Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage Research in Statistics Missing data multiple imputation DFBETAS catch-all mechanism MNAR (Missing Not at Random) |
title | Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage |
title_full | Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage |
title_fullStr | Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage |
title_full_unstemmed | Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage |
title_short | Enhancing imputation accuracy for catch-all missing data mechanisms with DFBETAS and leverage |
title_sort | enhancing imputation accuracy for catch all missing data mechanisms with dfbetas and leverage |
topic | Missing data multiple imputation DFBETAS catch-all mechanism MNAR (Missing Not at Random) |
url | https://www.tandfonline.com/doi/10.1080/27684520.2025.2451682 |
work_keys_str_mv | AT faresqeadan enhancingimputationaccuracyforcatchallmissingdatamechanismswithdfbetasandleverage AT williamabarbeau enhancingimputationaccuracyforcatchallmissingdatamechanismswithdfbetasandleverage |
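
Note on the diagnostics named in the description: the record does not reproduce the authors' adapted procedure, so the following is only a minimal sketch of the standard textbook definitions of leverage and DFBETAS for a simple linear regression y = b0 + b1*x. The function name `leverage_and_dfbetas` and the toy data are illustrative assumptions, not taken from the article, and nothing here implements the Bayesian multiple imputation step.

```python
import math


def leverage_and_dfbetas(x, y):
    """Return (leverage, DFBETAS_intercept, DFBETAS_slope) per observation.

    Standard OLS definitions for y = b0 + b1*x; hypothetical helper,
    not the article's adapted imputation method. Needs n > 3.
    """
    n = len(x)
    p = 2  # number of fitted coefficients (intercept and slope)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)

    # Entries of (X'X)^{-1} for the design matrix with columns [1, x].
    c00 = sum(xi * xi for xi in x) / (n * sxx)
    c01 = -xbar / sxx
    c11 = 1.0 / sxx

    out = []
    for xi, e in zip(x, resid):
        # Leverage: diagonal element of the hat matrix.
        h = 1.0 / n + (xi - xbar) ** 2 / sxx
        # Residual standard deviation with observation i deleted.
        s_del = math.sqrt((sse - e * e / (1.0 - h)) / (n - p - 1))
        # Change in each coefficient when observation i is deleted.
        d0 = (c00 + c01 * xi) * e / (1.0 - h)
        d1 = (c01 + c11 * xi) * e / (1.0 - h)
        # DFBETAS: coefficient change scaled by its deleted standard error.
        out.append((h, d0 / (s_del * math.sqrt(c00)), d1 / (s_del * math.sqrt(c11))))
    return out


# Toy data (made up): the last point sits far from the others in x and off the trend.
x = [1, 2, 3, 4, 5, 10]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]
for i, (h, db0, db1) in enumerate(leverage_and_dfbetas(x, y)):
    print(i, round(h, 3), round(db0, 3), round(db1, 3))
```

In this toy example the last observation has both the highest leverage and the largest slope DFBETAS in absolute value, which is the kind of per-observation influence signal the description says is carried into the imputation model.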