Text-Conditioned Diffusion-Based Synthetic Data Generation for Turbine Engine Sensor Analysis and RUL Estimation
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Machines |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2075-1702/13/5/374 |
| Summary: | This paper introduces a novel framework for generating synthetic time-series data from turbine engine sensor readings using a text-conditioned diffusion model. The approach begins with dataset preprocessing, including correlation analysis, feature selection, and normalization. Principal Component Analysis (PCA) transforms the normalized signals into three components, mapped to the RGB channels of an image. These components, combined with engine identifiers and cycle information, form compact 19 × 19 × 3 pixel images, later scaled to 512 × 512 × 3 pixels. A variational autoencoder (VAE)-based diffusion model, fine-tuned on these images, leverages text prompts describing engine characteristics to generate high-quality synthetic samples. A reverse transformation pipeline reconstructs synthetic images back into time-series signals, preserving the original engine-specific attributes while removing padding artifacts. The quality of the synthetic data is assessed by training Remaining Useful Life (RUL) estimation models and comparing performance across original, synthetic, and combined datasets. Results demonstrate that synthetic data can be beneficial for model training, particularly in the early epochs when working with limited datasets. Compared to existing approaches, which rely on generative adversarial networks (GANs) or deterministic transformations, the proposed framework offers enhanced data fidelity and adaptability. This study highlights the potential of text-conditioned diffusion models for augmenting time-series datasets in industrial Prognostics and Health Management (PHM) applications. |
|---|---|
| ISSN: | 2075-1702 |
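
The encoding step described in the summary (PCA to three components, mapped to RGB channels of a 19 × 19 × 3 image, then scaled to 512 × 512 × 3) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pca_to_rgb_image` and `upscale_nearest`, the row-major pixel layout, the zero padding, the per-channel min-max scaling, and the nearest-neighbour upscaling are all assumptions made for illustration, and the record does not say how engine identifiers and cycle information are embedded, so that part is omitted.

```python
import numpy as np


def pca_to_rgb_image(signals, side=19):
    """Encode one engine's normalized sensor matrix as a small RGB image.

    signals: array of shape (n_cycles, n_features), with at least 3 rows
    and 3 columns. Returns (image, n_used), where image has shape
    (side, side, 3) and n_used is the number of cycles actually stored.
    """
    centered = signals - signals.mean(axis=0)
    # PCA via SVD: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = centered @ vt[:3].T  # shape (n_cycles, 3)

    # Assumed scaling: min-max each component to [0, 1] before quantizing.
    low = components.min(axis=0)
    high = components.max(axis=0)
    span = np.where(high > low, high - low, 1.0)
    scaled = (components - low) / span

    # Assumed layout: one pixel per cycle, row-major, zero-padded at the end.
    pixels = np.zeros((side * side, 3), dtype=np.uint8)
    n_used = min(len(scaled), side * side)
    pixels[:n_used] = np.round(scaled[:n_used] * 255).astype(np.uint8)
    return pixels.reshape(side, side, 3), n_used


def upscale_nearest(image, size=512):
    """Nearest-neighbour upscaling of a square image to size x size."""
    idx = np.arange(size) * image.shape[0] // size
    return image[idx][:, idx]


# Usage with random stand-in data (not real engine readings).
rng = np.random.default_rng(0)
signals = rng.normal(size=(200, 14))
small, n_used = pca_to_rgb_image(signals)
large = upscale_nearest(small)
```

Returning `n_used` alongside the image is one way a reverse transformation could know where the padding begins and discard it when reconstructing the time series.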