Improving Image Quality and Controllability in Speed + Angular Velocity to Image Generation Through Synthetic Data for Driving Simulator Generation


Bibliographic Details
Main Authors: Yuto Imai, Tomoya Senda, Yusuke Kajiwara
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects: Synthetic dataset; speed + angular velocity to image generation; cross-modal generation; scene skip problem
Online Access: https://ieeexplore.ieee.org/document/10824770/
author Yuto Imai
Tomoya Senda
Yusuke Kajiwara
collection DOAJ
description With advancements in cross-modal techniques, methods for generating images and videos from text or speech have become increasingly practical. However, research on video generation from modalities other than text or speech remains limited. One major reason for this shortage is the lack of large-scale datasets, which leads to skewed data distributions and causes issues such as unresponsiveness to untrained patterns, image collapse, and scene skipping. To address these challenges, this study focuses on a driving simulation generation model, which produces driving scenarios from speed and steering angle information. We propose a synthetic dataset for speed + angular velocity to image (SAV2IMG) generation. Specifically, we create a half-synthetic dataset, in which only the query (control inputs) is synthetic and the response (images) is derived from real data, as well as a fully synthetic dataset, in which both the query and response are synthetic. By doing so, we construct a training environment that enables the model to handle diverse driving patterns and previously unseen control conditions. We conducted experiments comparing three training conditions for SAV2IMG: using real data only, using real plus half-synthetic data, and using real plus both half- and fully synthetic data. The results demonstrate improved image quality as measured by FID, enhanced controllability, and flexible adaptability to unknown control conditions. Moreover, employing fully synthetic data generated from 3D city models allowed for stable responses to unfamiliar scenarios. At the same time, we found that simple physical models failed to fully reproduce complex control patterns. These findings are not only valuable for improving SAV2IMG but also hold broader implications for vehicle-related generative tasks and cross-modal generation models in general. They provide a meaningful foundation for future model development and data augmentation strategies.
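The abstract above reports "improved image quality as measured by FID" (Fréchet Inception Distance). As a quick illustration of how that metric is computed — this sketch is not from the paper; the feature dimensions, sample counts, and the numpy-only eigenvalue trick for the matrix square root are illustrative assumptions — FID compares the mean and covariance of two sets of image features:

```python
import numpy as np

def fid(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet Inception Distance between two feature sets (rows = samples).

    FID = ||mu_a - mu_b||^2 + Tr(C_a) + Tr(C_b) - 2 * Tr((C_a C_b)^{1/2})
    The trace of the matrix square root equals the sum of square roots of the
    eigenvalues of C_a @ C_b, which keeps this sketch numpy-only.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    c_a = np.cov(feats_a, rowvar=False)
    c_b = np.cov(feats_b, rowvar=False)
    eig = np.linalg.eigvals(c_a @ c_b)  # eigenvalues are >= 0 up to numerical noise
    tr_sqrt = np.sqrt(np.maximum(eig.real, 0.0)).sum()
    return float(((mu_a - mu_b) ** 2).sum()
                 + np.trace(c_a) + np.trace(c_b) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))     # stand-in for real-frame features
shifted = rng.normal(0.5, 1.0, size=(500, 8))  # features from a worse generator
print(fid(real, real))     # near zero: identical distributions
print(fid(real, shifted))  # clearly larger: lower image quality
```

In practice the feature vectors come from a pretrained Inception network rather than random draws; a lower FID means the generated driving frames are statistically closer to real footage, which is the sense in which the three training conditions are compared.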
format Article
id doaj-art-24a39bb8b7064867b7467893002d659f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-24a39bb8b7064867b7467893002d659f (indexed 2025-01-24T00:01:56Z)
Language: English; Publisher: IEEE; Series: IEEE Access; ISSN: 2169-3536
Published: 2025-01-01; Volume 13, pp. 12168-12177
DOI: 10.1109/ACCESS.2025.3525714; IEEE document: 10824770
Authors: Yuto Imai; Tomoya Senda; Yusuke Kajiwara (https://orcid.org/0000-0001-5895-3312)
Affiliation (all three authors): Division of Production System Science, Komatsu University, Komatsu, Ishikawa, Japan
URL: https://ieeexplore.ieee.org/document/10824770/
title Improving Image Quality and Controllability in Speed + Angular Velocity to Image Generation Through Synthetic Data for Driving Simulator Generation
topic Synthetic dataset
speed + angular velocity to image generation
cross-modal generation
scene skip problem
url https://ieeexplore.ieee.org/document/10824770/