Auditing Geospatial Datasets for Biases: Using Global Building Datasets for Disaster Risk Management

The presence of biases has been demonstrated in a wide range of machine learning applications; however, such demonstrations are not yet common for geospatial datasets. This study illustrates the importance of auditing geospatial datasets for biases, with a particular focus on disaster risk management a...

Bibliographic Details
Main Authors: Caroline M. Gevaert, Thomas Buunk, Marc J.C. van den Homberg
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Bias; building detection; equity; ethics; humanitarian aid; machine learning
Online Access: https://ieeexplore.ieee.org/document/10584113/
author Caroline M. Gevaert
Thomas Buunk
Marc J.C. van den Homberg
collection DOAJ
description The presence of biases has been demonstrated in a wide range of machine learning applications; however, such demonstrations are not yet common for geospatial datasets. This study illustrates the importance of auditing geospatial datasets for biases, with a particular focus on disaster risk management applications, as a lack of local data may direct humanitarian actors to utilize global building datasets to estimate damage and the distribution of aid efforts. It is important to ensure that there are no biases against the representation of vulnerable populations and that they are not missed in the distribution of aid. This manuscript audits four global building datasets [Google Open Buildings, Microsoft Bing Maps Building Footprints, Overture Maps Foundation (OMF), and OpenStreetMap (OSM)] for biases regarding the relative wealth index (RWI), population density, urban/rural proportions, and building size in Tanzania and the Philippines. The dataset accuracies for these two countries are lower than expected. Google Open Buildings (with a confidence above 0.7) and OSM demonstrated the best combinations of false negative and false discovery rates, though Google Open Buildings was more consistent across tiles. The equality of opportunity was lowest for the urban/rural proportions, whereas OSM and OMF displayed particularly low equality of opportunity for population density and RWI in Tanzania. These results demonstrate that biases exist in these geospatial datasets. The types of biases are not consistent across the datasets and the two study areas, which emphasizes the importance of auditing these datasets for biases in new applications and study areas.
format Article
id doaj-art-d094b8cb5e544dc790015cc24af9d463
institution DOAJ
issn 1939-1404
2151-1535
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-d094b8cb5e544dc790015cc24af9d463
2025-08-20T02:55:56Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
1939-1404
2151-1535
2024-01-01
Volume 17, pages 12579-12590
DOI 10.1109/JSTARS.2024.3422503
10584113
Auditing Geospatial Datasets for Biases: Using Global Building Datasets for Disaster Risk Management
Caroline M. Gevaert, https://orcid.org/0000-0002-3983-2459, Faculty ITC, University of Twente, Enschede, The Netherlands
Thomas Buunk, https://orcid.org/0009-0002-6066-8734, 510 An Initiative of the Netherlands Red Cross, The Hague, The Netherlands
Marc J.C. van den Homberg, https://orcid.org/0000-0003-1436-254X, Faculty ITC, University of Twente, Enschede, The Netherlands
https://ieeexplore.ieee.org/document/10584113/
Bias; building detection; equity; ethics; humanitarian aid; machine learning
title Auditing Geospatial Datasets for Biases: Using Global Building Datasets for Disaster Risk Management
topic Bias
building detection
equity
ethics
humanitarian aid
machine learning
url https://ieeexplore.ieee.org/document/10584113/
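The description in this record refers to false negative and false discovery rates and to equality of opportunity across groups such as urban and rural areas. As a rough illustration only, the Python sketch below computes these quantities from building-level confusion counts. The counts are made-up numbers, the names (`Counts`, `equality_of_opportunity`) are hypothetical, and expressing equality of opportunity as the ratio of the lowest to the highest true positive rate is an assumption; the article may define and compute it differently.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Building-level confusion counts for one group (illustrative values only)."""
    tp: int  # reference buildings that the dataset also contains
    fn: int  # reference buildings missing from the dataset
    fp: int  # dataset buildings with no matching reference building


def false_negative_rate(c: Counts) -> float:
    # Share of real buildings that the dataset misses.
    return c.fn / (c.tp + c.fn)


def false_discovery_rate(c: Counts) -> float:
    # Share of dataset buildings that do not exist in the reference.
    return c.fp / (c.tp + c.fp)


def true_positive_rate(c: Counts) -> float:
    # Share of real buildings that the dataset captures (recall).
    return c.tp / (c.tp + c.fn)


def equality_of_opportunity(groups: dict[str, Counts]) -> float:
    # Assumed formulation: lowest group TPR divided by highest group TPR.
    # A value of 1.0 means every group's buildings are detected equally well.
    rates = [true_positive_rate(c) for c in groups.values()]
    return min(rates) / max(rates)


# Hypothetical counts, not taken from the article.
groups = {
    "urban": Counts(tp=80, fn=20, fp=10),
    "rural": Counts(tp=60, fn=40, fp=15),
}

for name, c in groups.items():
    print(name, round(false_negative_rate(c), 3), round(false_discovery_rate(c), 3))
print("equality of opportunity:", round(equality_of_opportunity(groups), 3))
```

With these invented counts, the rural group has a higher false negative rate than the urban group, so the ratio falls below 1.0, which is the kind of disparity an audit of this type is meant to surface.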