OpenForest: a data catalog for machine learning in forest monitoring

Bibliographic Details
Main Authors: Arthur Ouaknine, Teja Kattenborn, Etienne Laliberté, David Rolnick
Format: Article
Language: English
Published: Cambridge University Press 2025-01-01
Series: Environmental Data Science
Subjects: datasets, forest monitoring, machine learning, remote sensing
Online Access: https://www.cambridge.org/core/product/identifier/S2634460224000530/type/journal_article
author Arthur Ouaknine
Teja Kattenborn
Etienne Laliberté
David Rolnick
collection DOAJ
description Forests play a crucial role in the Earth’s system processes and provide a suite of social and economic ecosystem services, but are significantly impacted by human activities, leading to a pronounced disruption of the equilibrium within ecosystems. Advancing forest monitoring worldwide offers advantages in mitigating human impacts and enhancing our comprehension of forest composition, alongside the effects of climate change. While statistical modeling has traditionally found applications in forest biology, recent strides in machine learning and computer vision have reached important milestones using remote sensing data, such as tree species identification, tree crown segmentation, and forest biomass assessments. For this, the significance of open-access data remains essential in enhancing such data-driven algorithms and methodologies. Here, we provide a comprehensive and extensive overview of 86 open-access forest datasets across spatial scales, encompassing inventories, ground-based, aerial-based, satellite-based recordings, and country or world maps. These datasets are grouped in OpenForest, a dynamic catalog open to contributions that strives to reference all available open-access forest datasets. Moreover, in the context of these datasets, we aim to inspire research in machine learning applied to forest biology by establishing connections between contemporary topics, perspectives, and challenges inherent in both domains. We hope to encourage collaborations among scientists, fostering the sharing and exploration of diverse datasets through the application of machine learning methods for large-scale forest monitoring. OpenForest is available at the following url: https://github.com/RolnickLab/OpenForest.
format Article
id doaj-art-e697a4ece6ad405ab70abb07f9470f1e
institution DOAJ
issn 2634-4602
language English
publishDate 2025-01-01
publisher Cambridge University Press
record_format Article
series Environmental Data Science
spelling Record ID: doaj-art-e697a4ece6ad405ab70abb07f9470f1e
Record timestamp: 2025-08-20T03:12:47Z
Language: eng
Publisher: Cambridge University Press
Series: Environmental Data Science
ISSN: 2634-4602
Published: 2025-01-01
Volume: 4
DOI: 10.1017/eds.2024.53
Title: OpenForest: a data catalog for machine learning in forest monitoring
Arthur Ouaknine (https://orcid.org/0000-0003-1090-6204): School of Computer Science, McGill University, Montréal, QC, Canada; Mila, Quebec AI Institute, Montréal, QC, Canada
Teja Kattenborn: Remote Sensing Centre for Earth System Research, Leipzig University, Leipzig, Germany; German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany
Etienne Laliberté: Institut de recherche en biologie végétale, Département de sciences biologiques, Université de Montréal, Montréal, QC, Canada
David Rolnick: School of Computer Science, McGill University, Montréal, QC, Canada; Mila, Quebec AI Institute, Montréal, QC, Canada
Online access: https://www.cambridge.org/core/product/identifier/S2634460224000530/type/journal_article
Keywords: datasets, forest monitoring, machine learning, remote sensing
title OpenForest: a data catalog for machine learning in forest monitoring
topic datasets
forest monitoring
machine learning
remote sensing
url https://www.cambridge.org/core/product/identifier/S2634460224000530/type/journal_article