Combining readily available population and land cover maps to generate non-residential built-up labels to train Sentinel-2 image segmentation models

The localization of non-residential buildings over wide geographical areas is used as input within several contexts such as disaster management, regional and national planning, policy making and evaluation, among others. While the built-up environment has been continuously and globally mapped, given...

Bibliographic Details
Main Authors: Diogo Duarte, Cidália C. Fonte
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: International Journal of Applied Earth Observations and Geoinformation
Subjects: Land use; Artificial surfaces; Impervious; Convolutional neural networks; WorldPop; ESA WorldCover
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843224006289
author Diogo Duarte
Cidália C. Fonte
collection DOAJ
description The localization of non-residential buildings over wide geographical areas serves as input in several contexts, such as disaster management, regional and national planning, and policy making and evaluation. While the built-up environment has been continuously and globally mapped, thanks to efforts to produce synoptic land cover information, little attention has been given to the land use component of such built-up areas. This is due to, for example, difficulties in distinguishing built-up land use in non-commercial satellite imagery (e.g., Sentinel-2, with a spatial resolution of up to 10 m), difficulties in collecting training data for supervised classification approaches, and the fact that variations in the features of the built-up environment do not always translate to a specific land use. This is even more critical when considering nadir-viewing satellite or aerial imagery. However, map producers have been addressing this issue. For example, the Copernicus program (European Commission), through its pan-European CORINE Land Cover (CLC) and its Urban Atlas, which is restricted to several European metropolitan areas, has been making land use information on the built-up cover available at 6-year intervals. The Global Human Settlement Layer (Copernicus program) has been providing built-up land use information by distinguishing residential from non-residential built-up since 2023 (GHSL_NRES). Currently these data are also provided with a time interval of 5 years. National map producers often provide this information, but usually with an interval of several years between editions. In this paper we combine readily available population counts and land cover maps to generate non-residential training labels that can be used to train a Sentinel-2 image segmentation model capable of distinguishing non-residential built-up from the remaining built-up. Leveraging two publicly available datasets, population counts (WorldPop) and built-up land cover (ESA WorldCover), allowed us to produce training data from which an image segmentation model was able to learn relevant features to distinguish non-residential areas from other built-up in Sentinel-2 images. The results within a study area of 4 Sentinel-2 tiles show that the approach improves the detection of non-residential built-up areas compared with CLC and GHSL_NRES (F1-scores of 32 % for the proposed approach versus 25 % and 29 %, respectively), which are the products providing pan-European information on built-up land use. These results indicate that combining publicly available geospatial datasets may be used to produce higher quality geospatial information.
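The abstract does not state the exact rule used to turn population counts and land cover into training labels. As an illustration only, the sketch below assumes one plausible rule: a built-up pixel with a population count at or below a threshold is treated as non-residential. The function names, the threshold, the label encoding and the assumption that both rasters are already resampled to the same grid are hypothetical and are not taken from the paper. The only dataset-specific value is the ESA WorldCover class code 50 for built-up. A small F1-score helper is included because the abstract reports results with that metric.

```python
import numpy as np

BUILT_UP = 50          # ESA WorldCover class code for "Built-up"
POP_THRESHOLD = 0.0    # hypothetical cut-off, not a value from the paper


def non_residential_labels(land_cover, population, pop_threshold=POP_THRESHOLD):
    """Return a label raster: 0 = not built-up, 1 = other built-up,
    2 = non-residential built-up (assumed rule: built-up and unpopulated).

    Both inputs are assumed to be co-registered arrays of the same shape;
    resampling and nodata handling are left out of this sketch.
    """
    land_cover = np.asarray(land_cover)
    population = np.asarray(population, dtype=float)
    if land_cover.shape != population.shape:
        raise ValueError("rasters must share the same grid")

    built = land_cover == BUILT_UP
    unpopulated = population <= pop_threshold

    labels = np.zeros(land_cover.shape, dtype=np.uint8)
    labels[built] = 1
    labels[built & unpopulated] = 2
    return labels


def f1_score(pred, truth):
    """F1-score of a binary prediction mask against a binary reference mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # true positives
    fp = np.sum(pred & ~truth)   # false positives
    fn = np.sum(~pred & truth)   # false negatives
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


# Toy 2 x 2 example with made-up values.
land_cover = np.array([[50, 50], [10, 50]])
population = np.array([[0.0, 3.2], [0.0, 0.0]])
labels = non_residential_labels(land_cover, population)
```

In a real workflow the resulting label raster would be tiled together with the Sentinel-2 bands to form training pairs for the segmentation model; that step depends on choices the abstract does not describe.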
format Article
id doaj-art-eb38c4cd585340f996e4eb58ef480123
institution OA Journals
issn 1569-8432
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series International Journal of Applied Earth Observations and Geoinformation
spelling doaj-art-eb38c4cd585340f996e4eb58ef480123
2025-08-20T02:34:59Z
eng
Elsevier
International Journal of Applied Earth Observations and Geoinformation
1569-8432
2024-12-01
135
104272
10.1016/j.jag.2024.104272
Diogo Duarte (0)
Cidália C. Fonte (1)
(0) Institute for Systems Engineering and Computers at Coimbra (INESC Coimbra), Department of Electrical and Computer Engineering, Polo 2, 3030-290 Coimbra, Portugal; University of Coimbra, Department of Mathematics, Apartado 3008, EC Santa Cruz, 3001-501 Coimbra, Portugal; Corresponding author.
(1) Institute for Systems Engineering and Computers at Coimbra (INESC Coimbra), Department of Electrical and Computer Engineering, Polo 2, 3030-290 Coimbra, Portugal; University of Coimbra, Department of Mathematics, Apartado 3008, EC Santa Cruz, 3001-501 Coimbra, Portugal
title Combining readily available population and land cover maps to generate non-residential built-up labels to train Sentinel-2 image segmentation models
topic Land use
Artificial surfaces
Impervious
Convolutional neural networks
WorldPop
ESA WorldCover
url http://www.sciencedirect.com/science/article/pii/S1569843224006289