A Reproducible Pipeline for Leveraging Operational Data Through Machine Learning in Digitally Emerging Urban Bus Fleets

Bibliographic Details
Main Authors: Bernardo Tormos, Vicente Bermudez, Ramón Sánchez-Márquez, Jorge Alvis
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/15/8395
Description
Summary: The adoption of predictive maintenance in public transportation has gained increasing attention in the context of Industry 4.0. However, many urban bus fleets remain in early digital transformation stages, with limited historical data and fragmented infrastructures that hinder the implementation of data-driven strategies. This study proposes a reproducible Machine Learning pipeline tailored to such data-scarce conditions, integrating domain-informed feature engineering, lightweight and interpretable models (Linear Regression, Ridge Regression, Decision Trees, KNN), SMOGN for imbalance handling, and Leave-One-Out Cross-Validation for robust evaluation. A scheduled batch retraining strategy is incorporated to adapt the model as new data becomes available. The pipeline is validated using real-world data from hybrid diesel buses, focusing on the prediction of time spent in critical soot accumulation zones of the Diesel Particulate Filter (DPF). In Zone 4, the model continued to outperform the baseline during the production test, indicating its validity for an additional operational period. In contrast, model performance in Zone 3 deteriorated over time, triggering retraining. These results confirm the pipeline’s ability to detect performance drift and support predictive maintenance decisions under evolving operational constraints. The proposed framework offers a scalable solution for digitally emerging fleets.
ISSN: 2076-3417