StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality

Abstract Road unevenness significantly impacts the safety and comfort of traffic participants, especially vulnerable groups such as cyclists and wheelchair users. To train models for comprehensive road surface assessments, we introduce StreetSurfaceVis, a novel dataset comprising 9,122 street-level...

Bibliographic Details
Main Authors:	Alexandra Kapp, Edith Hoffmann, Esther Weigmann, Helena Mihaljević
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Scientific Data
Online Access:	https://doi.org/10.1038/s41597-024-04295-9

author	Alexandra Kapp Edith Hoffmann Esther Weigmann Helena Mihaljević
collection	DOAJ
description	Road unevenness significantly impacts the safety and comfort of traffic participants, especially vulnerable groups such as cyclists and wheelchair users. To train models for comprehensive road surface assessments, we introduce StreetSurfaceVis, a novel dataset comprising 9,122 street-level images, mostly from Germany, collected from a crowdsourcing platform and manually annotated by road surface type and quality. By crafting a heterogeneous dataset, we aim to enable robust models that maintain high accuracy across diverse image sources. As the frequency distribution of road surface types and qualities is highly imbalanced, we propose a sampling strategy incorporating various external label prediction resources to ensure sufficient images per class while reducing manual annotation. More precisely, we estimate the impact of (1) enriching the image data with OpenStreetMap tags, (2) iterative training and application of a custom surface type classification model, (3) amplifying underrepresented classes through prompt-based classification with GPT-4o, and (4) similarity search using image embeddings. Combining these strategies effectively reduces manual annotation workload while ensuring sufficient class representation.
format	Article
id	doaj-art-ad3e202874e84342ba98574b2101eb49
institution	Kabale University
issn	2052-4463
language	English
publishDate	2025-01-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Data
spelling	doaj-art-ad3e202874e84342ba98574b2101eb49; 2025-01-19T12:09:35Z; eng; Nature Portfolio; Scientific Data; 2052-4463; 2025-01-01; 12; 1; 1; 10; 10.1038/s41597-024-04295-9; StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality; Alexandra Kapp, Edith Hoffmann, Esther Weigmann, Helena Mihaljević (all Hochschule für Technik und Wirtschaft Berlin (HTW Berlin)); https://doi.org/10.1038/s41597-024-04295-9
title	StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality
url	https://doi.org/10.1038/s41597-024-04295-9

Similar Items