Annotation-free Generation of Training Data Using Mixed Domains for Segmentation of 3D LiDAR Point Clouds

Semantic segmentation is important for robots navigating with 3D LiDARs, but the generation of training datasets requires tedious manual effort. In this paper, we introduce a set of strategies to efficiently generate large datasets by combining real and synthetic data samples. More specifically, the method populates recorded empty scenes with navigation-relevant obstacles generated synthetically, thus combining two domains: real life and synthetic. Our approach requires no manual annotation, no detailed knowledge about actual data feature distribution, and no real-life data of objects of interest. We validate the proposed method in the underground parking scenario and compare it with available open-source datasets. The experiments show superiority to the off-the-shelf datasets containing similar data characteristics but also highlight the difficulty of achieving the level of manually annotated datasets. We also show that combining generated and annotated data improves the performance visibly, especially for cases with rare occurrences of objects of interest. Our solution is suitable for direct application in robotic systems.
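The core idea described in the abstract (populating recorded empty scenes with synthetically generated obstacles, so that per-point labels come for free) can be sketched as follows. This is a minimal illustrative sketch only: the function name, the label values, and the simple translate-and-concatenate placement strategy are assumptions, not the authors' actual pipeline.

```python
import numpy as np

def mix_scene(empty_scene, synthetic_obj, position, obj_label=1, bg_label=0):
    """Place a synthetic object point cloud into a real empty-scene scan.

    Returns the merged (N, 3) point cloud together with automatically
    generated per-point labels: every real point is background, every
    inserted synthetic point is an obstacle. No manual annotation needed.
    """
    # Translate the synthetic object to a free spot in the recorded scene.
    obj = np.asarray(synthetic_obj) + np.asarray(position)
    points = np.vstack([empty_scene, obj])
    labels = np.concatenate([
        np.full(len(empty_scene), bg_label),  # real background points
        np.full(len(obj), obj_label),         # synthetic obstacle points
    ])
    return points, labels

# Toy usage: 100 recorded background points, a 10-point synthetic "obstacle".
scene = np.random.rand(100, 3) * 20.0
obj = np.random.rand(10, 3)
pts, lbl = mix_scene(scene, obj, position=[5.0, 5.0, 0.0])
```

In the paper's setting the synthetic objects would additionally need LiDAR-consistent sampling (ray patterns, occlusion) rather than plain concatenation; the sketch only shows how mixing the two domains yields labeled training samples without manual effort.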

Bibliographic Details
Main Authors: Cop Konrad, Sułek Bartosz, Trzciński Tomasz
Format: Article
Language: English
Published: Sciendo 2025-09-01
Series: Foundations of Computing and Decision Sciences
Subjects: deep learning for visual perception; semantic segmentation; 3D LiDAR; robotic perception; point clouds
Online Access: https://doi.org/10.2478/fcds-2025-0013
collection DOAJ
id doaj-art-a586bcb6bcd34bd6b8db38f4e89b598a
institution Kabale University
issn 2300-3405
volume 50
issue 3
pages 347-371
affiliation Cop Konrad: Warsaw University of Technology, Faculty of Electronics and Information Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland
affiliation Sułek Bartosz: United Robots Sp. z o.o., Świeradowska 47, 02-622 Warsaw, Poland
affiliation Trzciński Tomasz: Warsaw University of Technology, Faculty of Electronics and Information Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland
description Semantic segmentation is important for robots navigating with 3D LiDARs, but the generation of training datasets requires tedious manual effort. In this paper, we introduce a set of strategies to efficiently generate large datasets by combining real and synthetic data samples. More specifically, the method populates recorded empty scenes with navigation-relevant obstacles generated synthetically, thus combining two domains: real life and synthetic. Our approach requires no manual annotation, no detailed knowledge about actual data feature distribution, and no real-life data of objects of interest. We validate the proposed method in the underground parking scenario and compare it with available open-source datasets. The experiments show superiority to the off-the-shelf datasets containing similar data characteristics but also highlight the difficulty of achieving the level of manually annotated datasets. We also show that combining generated and annotated data improves the performance visibly, especially for cases with rare occurrences of objects of interest. Our solution is suitable for direct application in robotic systems.
topic deep learning for visual perception
semantic segmentation
3D LiDAR
robotic perception
point clouds