The fallacy of single imputation for trait databases: Use multiple imputation instead
Abstract The past few years have seen the publication of many new trait databases. A common problem with large databases is a lack of completeness, or inversely, the high prevalence of missing values. Biologists have developed several methods to impute (fill in) missing values. This allows ordinary...
| Main Authors: | Simone P. Blomberg, Orlin S. Todorov |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2025-04-01 |
| Series: | Methods in Ecology and Evolution |
| Subjects: | best practise; imputation; MICE; missing data; Rubin's rules; trait databases |
| Online Access: | https://doi.org/10.1111/2041-210X.14494 |
| _version_ | 1850260558189690880 |
|---|---|
| author | Simone P. Blomberg; Orlin S. Todorov |
| collection | DOAJ |
| description | Abstract: The past few years have seen the publication of many new trait databases. A common problem with large databases is a lack of completeness or, put another way, a high prevalence of missing values. Biologists have developed several methods to impute (fill in) missing values. Imputation allows ordinary statistical procedures to be used in analyses and avoids restricting the analysis to complete cases, with its concomitant loss of power and accuracy. Often, biologists use simulation to test new methods by deleting values from a dataset and recording how well the imputed values match the known, removed values. Here we argue that this is a poor measure of the strength of an imputation method. We also describe the importance and logic of the statistical procedure of multiple imputation, which does not require the imputations to be precise or accurate estimates of the missing data. |
| id | doaj-art-a3a60aafff4b4acabf5ce19ac08b35c2 |
| institution | OA Journals |
| issn | 2041-210X |
| spelling | Record ID: doaj-art-a3a60aafff4b4acabf5ce19ac08b35c2. Record timestamp: 2025-08-20T01:55:37Z. Language: eng. Publisher: Wiley. Journal: Methods in Ecology and Evolution (ISSN 2041-210X). Published: 2025-04-01. Volume 16, issue 4, pages 658-667. DOI: 10.1111/2041-210X.14494. Title: The fallacy of single imputation for trait databases: Use multiple imputation instead. Authors: Simone P. Blomberg (School of the Environment, University of Queensland, St. Lucia, Queensland, Australia); Orlin S. Todorov (Tasmanian Institute of Agriculture, University of Tasmania, Launceston, Tasmania, Australia). URL: https://doi.org/10.1111/2041-210X.14494. Keywords: best practise; imputation; MICE; missing data; Rubin's rules; trait databases. |
| title | The fallacy of single imputation for trait databases: Use multiple imputation instead |
| topic | best practise; imputation; MICE; missing data; Rubin's rules; trait databases |
| url | https://doi.org/10.1111/2041-210X.14494 |
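
The abstract's claim that multiple imputation does not need precise imputed values is easier to follow once the pooling step is visible. Each of the m completed datasets is analysed separately, and the m results are combined with Rubin's rules (one of the record's keywords). The sketch below shows only that pooling step, using the standard formulas: pooled estimate = mean of the estimates, within-imputation variance = mean of the squared standard errors, between-imputation variance = sample variance of the estimates, and total variance = within + (1 + 1/m) × between. The function name `pool_rubin` and the example numbers are illustrative assumptions, not taken from the article.

```python
from statistics import mean, variance
from typing import NamedTuple, Sequence


class PooledResult(NamedTuple):
    estimate: float  # pooled point estimate
    within: float    # average within-imputation variance
    between: float   # between-imputation variance
    total: float     # total variance of the pooled estimate


def pool_rubin(estimates: Sequence[float], variances: Sequence[float]) -> PooledResult:
    """Pool one parameter across m imputed datasets with Rubin's rules.

    estimates: the parameter estimate from each completed dataset.
    variances: the squared standard error of each of those estimates.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)    # pooled estimate
    u_bar = mean(variances)    # within-imputation variance
    b = variance(estimates)    # between-imputation variance (m - 1 denominator)
    t = u_bar + (1 + 1 / m) * b
    return PooledResult(q_bar, u_bar, b, t)


# Hypothetical slope estimates from three imputed copies of a trait dataset.
result = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(result)
```

The between-imputation term is what carries the uncertainty about the missing values into the final standard error. A single imputed dataset has no such term, so its standard errors treat the imputed values as if they had been observed.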