The fallacy of single imputation for trait databases: Use multiple imputation instead

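Abstract The past few years have seen the publication of many new trait databases. A common problem with large databases is a lack of completeness, or inversely, the high prevalence of missing values. Biologists have developed several methods to impute (fill in) missing values. This allows ordinary statistical procedures to be used in analyses and the use of only complete cases, with a concomitant loss of power and accuracy, can be avoided. Often, biologists use simulation to test new methods by deleting values from a dataset and recording how well the imputed values match the known, removed values. Here we argue that this is a poor measure of the strength of an imputation method. We also describe the importance and logic of the statistical procedure of multiple imputation, which requires that the imputations need not be precise or accurate estimates of the missing data.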
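The title and keywords of this record point to multiple imputation pooled with Rubin's rules. As a rough illustration of that pooling step, and not the authors' own code, the sketch below combines estimates from several imputed datasets. The function name `pool_rubin` and the example numbers are hypothetical; the formulas are the standard ones (pooled estimate is the mean of the per-dataset estimates, and total variance is the within-imputation variance plus (1 + 1/m) times the between-imputation variance).

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, total_variance).
    Hypothetical helper for illustration only.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled = mean(estimates)         # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up estimates from three imputed copies of the same dataset.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```

The between-imputation term cannot be estimated from a single imputed dataset, which is one way to see why a single imputation understates the uncertainty caused by the missing values.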

Bibliographic Details
Main Authors: Simone P. Blomberg, Orlin S. Todorov
Format: Article
Language: English
Published: Wiley 2025-04-01
Series: Methods in Ecology and Evolution
Subjects: best practise, imputation, MICE, missing data, Rubin's rules, trait databases
Online Access: https://doi.org/10.1111/2041-210X.14494
_version_ 1850260558189690880
author Simone P. Blomberg
Orlin S. Todorov
author_facet Simone P. Blomberg
Orlin S. Todorov
author_sort Simone P. Blomberg
collection DOAJ
description Abstract The past few years have seen the publication of many new trait databases. A common problem with large databases is a lack of completeness, or inversely, the high prevalence of missing values. Biologists have developed several methods to impute (fill in) missing values. This allows ordinary statistical procedures to be used in analyses and the use of only complete cases, with a concomitant loss of power and accuracy, can be avoided. Often, biologists use simulation to test new methods by deleting values from a dataset and recording how well the imputed values match the known, removed values. Here we argue that this is a poor measure of the strength of an imputation method. We also describe the importance and logic of the statistical procedure of multiple imputation, which requires that the imputations need not be precise or accurate estimates of the missing data.
format Article
id doaj-art-a3a60aafff4b4acabf5ce19ac08b35c2
institution OA Journals
issn 2041-210X
language English
publishDate 2025-04-01
publisher Wiley
record_format Article
series Methods in Ecology and Evolution
spelling doaj-art-a3a60aafff4b4acabf5ce19ac08b35c2
2025-08-20T01:55:37Z
eng
Wiley
Methods in Ecology and Evolution
2041-210X
2025-04-01
16
4
658
667
10.1111/2041-210X.14494
The fallacy of single imputation for trait databases: Use multiple imputation instead
Simone P. Blomberg (0)
Orlin S. Todorov (1)
(0) School of the Environment, University of Queensland, St. Lucia, Queensland, Australia
(1) Tasmanian Institute of Agriculture, University of Tasmania, Launceston, Tasmania, Australia
Abstract The past few years have seen the publication of many new trait databases. A common problem with large databases is a lack of completeness, or inversely, the high prevalence of missing values. Biologists have developed several methods to impute (fill in) missing values. This allows ordinary statistical procedures to be used in analyses and the use of only complete cases, with a concomitant loss of power and accuracy, can be avoided. Often, biologists use simulation to test new methods by deleting values from a dataset and recording how well the imputed values match the known, removed values. Here we argue that this is a poor measure of the strength of an imputation method. We also describe the importance and logic of the statistical procedure of multiple imputation, which requires that the imputations need not be precise or accurate estimates of the missing data.
https://doi.org/10.1111/2041-210X.14494
best practise
imputation
MICE
missing data
Rubin's rules
trait databases
spellingShingle Simone P. Blomberg
Orlin S. Todorov
The fallacy of single imputation for trait databases: Use multiple imputation instead
Methods in Ecology and Evolution
best practise
imputation
MICE
missing data
Rubin's rules
trait databases
title The fallacy of single imputation for trait databases: Use multiple imputation instead
title_sort fallacy of single imputation for trait databases use multiple imputation instead
topic best practise
imputation
MICE
missing data
Rubin's rules
trait databases
url https://doi.org/10.1111/2041-210X.14494