A DQN-Based Algorithm for Operational Optimization of Freight Trains in Long Steep Downhill Sections

Bibliographic Details
Main Authors: HE Zhiyu, LI Yinan, LI Hui, JI Zhijun
Format: Article
Language: zho
Published: Editorial Office of Control and Information Technology 2024-08-01
Series: Kongzhi Yu Xinxi Jishu
Online Access: http://ctet.csrzic.com/thesisDetails#10.13889/j.issn.2096-5427.2024.04.003
Description
Summary: Freight trains running on long steep downhill sections regulate their speed through cycle braking, and poorly timed brake application and release can pose significant safety risks. Taking an SS6B electric locomotive hauling C80 freight cars as the research object, the study established a train dynamics model based on a mass belt and proposed a deep Q-network (DQN) based intelligent curve generation algorithm for operational optimization in these sections. The algorithm takes train operational efficiency, safety, and brake shoe wear as optimization objectives and treats speed limits and the charging time of the brake cylinders as constraints, searching for the optimal transition points between cycle braking conditions through interaction with the environment. Training samples were collected in batches using experience replay and a double-network mechanism, the state inputs to the neural network were preprocessed, and the feasible region of the action space was explored with a variable ε-greedy strategy. A loss function based on the value function was then constructed, and the network parameters were updated iteratively by batch gradient descent. Simulations in a MATLAB environment showed that, when training on long steep downhill runs with randomly generated entry speeds, the cumulative reward gradually converged over the training runs, which verified the convergence and generalization of the proposed algorithm. The optimized operating curves generated for various entry speeds after training controlled the train to apply air braking before reaching the speed limit and to release the brake once air charging was complete, which verified the efficacy of the algorithm in ensuring safe and efficient train operation. In addition, a comparison of average cumulative reward curves across different learning rates, and across different distribution ranges of the preprocessed network inputs, further verified that the algorithm improves convergence speed and stability. The results provide a reference for further optimizing the generation of operating curves for freight trains on long steep downhill sections, thereby ensuring both operational efficiency and safety.
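The exploration and update steps named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The linear ε schedule, the two-action encoding (0 = keep the current condition, 1 = switch condition), the feasibility list, the discount factor, and every function name below are assumptions made for illustration. The "double-network mechanism" is read here as an online network plus a separate target network, which the record does not confirm.

```python
import random

def epsilon_at(episode, eps_start=1.0, eps_end=0.05, decay_episodes=500):
    """Hypothetical 'variable' epsilon: linear anneal from eps_start to eps_end."""
    frac = min(episode / decay_episodes, 1.0)
    return eps_start + frac * (eps_end - eps_start)

def select_action(q_values, feasible, epsilon, rng=random):
    """Epsilon-greedy choice restricted to the feasible action indices.

    `feasible` stands in for constraints such as 'the brake may not be
    released before air charging has finished' (assumed encoding).
    """
    if not feasible:
        raise ValueError("no feasible action")
    if rng.random() < epsilon:
        return rng.choice(feasible)                   # explore among feasible actions
    return max(feasible, key=lambda a: q_values[a])   # exploit the best feasible action

def td_target(reward, next_q_target, next_feasible, gamma=0.99, done=False):
    """Bootstrapped target built from the target network's Q-values."""
    if done or not next_feasible:
        return reward
    return reward + gamma * max(next_q_target[a] for a in next_feasible)

def squared_td_loss(q_pred, target):
    """Per-sample value-function loss; a batch loss would average these."""
    return 0.5 * (q_pred - target) ** 2
```

In a full training loop, transitions would be drawn in batches from a replay buffer, `squared_td_loss` would be averaged over the batch, and the online network's parameters would be updated by gradient descent while the target network is synchronized only periodically.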
ISSN: 2096-5427