Zero-inflated models for the evaluation of colorectal polyps in colon cancer screening studies—a value-based biostatistics practice
Background Colon cancer screening studies are needed for the early detection of colorectal polyps to reduce the risk of colorectal cancer. Unfortunately, the data generated on colon polyps are typically analyzed in their dichotomized form and sometimes with standard count models, which leads to pote...
| Main Authors: | Alok K. Dwivedi, Sherif E. Elhanafi, Mohamed O. Othman, Marc J. Zuckerman |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | PeerJ Inc., 2025-05-01 |
| Series: | PeerJ |
| Subjects: | Colonoscopy studies; Polyps; Count data; Zero-inflated models; Regression analysis |
| Online Access: | https://peerj.com/articles/19504.pdf |
| _version_ | 1849763201199112192 |
|---|---|
| author | Alok K. Dwivedi; Sherif E. Elhanafi; Mohamed O. Othman; Marc J. Zuckerman |
| collection | DOAJ |
| description | Background Colon cancer screening studies are needed for the early detection of colorectal polyps to reduce the risk of colorectal cancer. Unfortunately, the data generated on colon polyps are typically analyzed in their dichotomized form and sometimes with standard count models, which leads to potentially inaccurate findings in research studies. A more appropriate approach for evaluating colon polyps is zero-inflated models, considering undetected existing polyps at colonoscopy screening. Method We demonstrated the application of the zero-inflated and hurdle models including zero-inflated Poisson (ZIP), zero-inflated robust Poisson (ZIRP), zero-inflated negative binomial (ZINB), zero-inflated generalized Poisson (ZIGP), zero hurdle Poisson (ZHP), and zero hurdle negative binomial (ZHNB) models, and compared them with standard approaches including logistic regression (LR), Poisson regression (PR), robust Poisson (RP), and negative binomial (NB) regression for the evaluation of colorectal polyps using datasets from two randomized studies and one observational study. We also facilitated a step-by-step approach for selecting appropriate models for analyzing polyp data. Results All datasets yielded a significant amount of no polyps and therefore inflated or hurdle models performed best over single distribution models. We showed that cap-assisted colonoscopy yielded significantly more colon polyps (risk ratio [RR] = 1.38; 95% confidence interval [CI] [1.05–1.81]) compared with the standard colonoscopy by using the ZIP analysis. However, these findings were missed by standard analytic methods, including LR (odds ratio [OR] = 0.90; 95% CI [0.59–1.37]), PR (RR = 1.14; 95% CI [0.93–1.41]), and NB (RR = 1.16; 95% CI [0.89–1.51]) for evaluating colon polyps. The standard approaches, such as LR, PR, RP, or NB regressions for analyzing polyp data, produced potentially inaccurate findings compared to zero-inflated models in all example datasets. Furthermore, simulation studies also confirmed the superiority of ZIRP over alternative models in a range of datasets differing from the case studies. ZIRP was found to be the optimal method for analyzing polyp data in randomized studies, while the ZINB/ZHNB model showed a better fit in an observational study. Conclusion We suggest colonoscopy studies should jointly use the polyp detection rate and polyp counts as the quality measure. Based on theoretical, empirical, and simulation considerations, we encourage analysts to utilize zero-inflated models for evaluating colorectal polyps in colonoscopy screening studies for proper clinical interpretation of data and accurate reporting of findings. A similar approach can also be used for analyzing other types of polyp counts in colonoscopy studies. |
| format | Article |
| id | doaj-art-88375cc3f2fa45aeaed50eb68dcb79bd |
| institution | DOAJ |
| issn | 2167-8359 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | PeerJ Inc. |
| record_format | Article |
| series | PeerJ |
| spelling | doaj-art-88375cc3f2fa45aeaed50eb68dcb79bd; 2025-08-20T03:05:29Z; eng; PeerJ Inc.; PeerJ; ISSN 2167-8359; 2025-05-01; vol. 13; e19504; DOI 10.7717/peerj.19504; Alok K. Dwivedi (Division of Biostatistics & Epidemiology, Department of Molecular and Translational Medicine, Paul L. Foster School of Medicine, Texas Tech University Health Science Center, El Paso, Texas, United States); Sherif E. Elhanafi (Division of Gastroenterology, Department of Internal Medicine, Paul L. Foster School of Medicine, Texas Tech University Health Science Center, El Paso, Texas, United States); Mohamed O. Othman (Gastroenterology and Hepatology Section, Baylor College of Medicine, Houston, Texas, United States); Marc J. Zuckerman (Division of Gastroenterology, Department of Internal Medicine, Paul L. Foster School of Medicine, Texas Tech University Health Science Center, El Paso, Texas, United States); https://peerj.com/articles/19504.pdf |
| title | Zero-inflated models for the evaluation of colorectal polyps in colon cancer screening studies—a value-based biostatistics practice |
| topic | Colonoscopy studies; Polyps; Count data; Zero-inflated models; Regression analysis |
| url | https://peerj.com/articles/19504.pdf |
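
Illustrative note: the description above contrasts standard count models with zero-inflated models for polyp counts. The following is a minimal sketch of that contrast, not the authors' analysis. It fits an intercept-only Poisson model and an intercept-only zero-inflated Poisson (ZIP) model to a small synthetic set of counts and compares them by AIC. The data, the function names (`fit_poisson`, `fit_zip`), and the use of an EM algorithm for fitting are all assumptions made for illustration; the article's models include covariates and robust variance estimators that are not reproduced here.

```python
import math

def poisson_logpmf(k, lam):
    """Log-probability of count k under Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def zip_logpmf(k, pi, lam):
    """Log-probability of count k under a zero-inflated Poisson.

    pi  = probability of a structural (excess) zero
    lam = mean of the Poisson component
    """
    if k == 0:
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    return math.log(1.0 - pi) + poisson_logpmf(k, lam)

def fit_poisson(counts):
    """Intercept-only Poisson MLE: the sample mean."""
    lam = sum(counts) / len(counts)
    loglik = sum(poisson_logpmf(k, lam) for k in counts)
    return lam, loglik

def fit_zip(counts, n_iter=500):
    """Intercept-only ZIP fit via EM (an illustrative choice of optimizer)."""
    n = len(counts)
    n_zero = sum(1 for k in counts if k == 0)
    total = sum(counts)
    pi, lam = 0.5 * n_zero / n, max(total / n, 1e-8)  # crude starting values
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero
        w = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean
        pi = n_zero * w / n
        lam = total / (n - n_zero * w)
    loglik = sum(zip_logpmf(k, pi, lam) for k in counts)
    return pi, lam, loglik

def aic(loglik, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * loglik

# Synthetic polyp counts for 100 hypothetical patients (NOT study data):
# many zeros, plus a spread of positive counts.
counts = [0] * 60 + [1] * 10 + [2] * 12 + [3] * 10 + [4] * 5 + [5] * 3

lam_p, ll_p = fit_poisson(counts)
pi_z, lam_z, ll_z = fit_zip(counts)
print(f"Poisson: lambda={lam_p:.3f}, AIC={aic(ll_p, 1):.1f}")
print(f"ZIP:     pi={pi_z:.3f}, lambda={lam_z:.3f}, AIC={aic(ll_z, 2):.1f}")
```

In this synthetic example the plain Poisson model expects far fewer zeros than are present, so the ZIP model attains a higher log-likelihood. With real colonoscopy data, a regression version with covariates (for example, the study arm) would be needed to estimate a risk ratio like those quoted in the description.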