Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries

This paper addresses the team orienteering problem (TOP) with vehicles equipped with electric batteries under dynamic travel conditions influenced by weather and traffic, which impact travel times between nodes and hence might have a critical effect on the battery capacity to cover the planned route...

Bibliographic Details
Main Authors: Majsa Ammouriova, Antoni Guerrero, Veronika Tsertsvadze, Christin Schumacher, Angel A. Juan
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Batteries
Subjects:
Online Access: https://www.mdpi.com/2313-0105/10/12/411
collection DOAJ
description This paper addresses the team orienteering problem (TOP) with vehicles equipped with electric batteries under dynamic travel conditions influenced by weather and traffic, which impact travel times between nodes and hence might have a critical effect on the battery capacity to cover the planned route. The study incorporates a novel approach for solving the dynamic TOP, comparing two solution methodologies: a merging heuristic and a reinforcement learning (RL) algorithm. The heuristic combines routes using calculated savings and a biased-randomized strategy, while the RL model leverages a transformer-based encoder–decoder architecture to sequentially construct solutions. We perform computational experiments on 50 problem instances, each subjected to 200 dynamic conditions, for a total of 10,000 problems solved. The results demonstrate that while the deterministic heuristic provides an upper bound for rewards, the RL model consistently yields robust solutions with lower variability under dynamic conditions. However, the dynamic heuristic, with a 20 s time limit for solving each instance, outperformed the RL model by 3.35% on average. The study highlights the trade-offs between solution quality, computational resources, and time when dealing with dynamic environments in the TOP.
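The description mentions a heuristic that merges routes using calculated savings and a biased-randomized strategy. The following is a minimal sketch of that general idea, not the authors' implementation. The function names, the time-only savings formula, the quasi-geometric selection with parameter `beta`, and the single time `budget` standing in for battery capacity are all assumptions made for illustration.

```python
import math
import random


def route_time(route, t):
    """Total travel time along a route, given a travel-time matrix t."""
    return sum(t[a][b] for a, b in zip(route, route[1:]))


def biased_index(n, beta, rng):
    """Index in [0, n), skewed toward 0 (quasi-geometric draw; assumed)."""
    u = 1.0 - rng.random()  # u lies in (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - beta)) % n


def merge_heuristic(t, rewards, start, end, budget, fleet, beta=0.3, seed=0):
    """Hypothetical savings-based merging with biased randomization."""
    rng = random.Random(seed)
    # Start with one direct route per customer that fits the time budget.
    routes = [[start, c, end] for c in rewards
              if route_time([start, c, end], t) <= budget]
    owner = {r[1]: r for r in routes}  # customer -> route containing it
    # Saving of travelling i -> j instead of i -> end plus start -> j.
    savings = sorted(
        ((t[i][end] + t[start][j] - t[i][j], i, j)
         for i in owner for j in owner if i != j),
        reverse=True)
    while savings:
        # Pick a candidate near the top of the list, not always the top.
        _, i, j = savings.pop(biased_index(len(savings), beta, rng))
        ri, rj = owner[i], owner[j]
        # Merge only if i closes its route and j opens a different one.
        if ri is rj or ri[-2] != i or rj[1] != j:
            continue
        merged = ri[:-1] + rj[1:]
        if route_time(merged, t) <= budget:
            routes = [r for r in routes if r is not ri and r is not rj]
            routes.append(merged)
            for c in merged[1:-1]:
                owner[c] = merged
    # Keep the `fleet` routes with the highest collected reward.
    routes.sort(key=lambda r: sum(rewards[c] for c in r[1:-1]), reverse=True)
    return routes[:fleet]
```

Under dynamic conditions, one could rerun such a routine with an updated matrix `t` for each scenario; how the paper actually handles weather and traffic is not specified in this record.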
id doaj-art-ffe54b9cbca3404fb6b4ad7751e5ed93
institution DOAJ
issn 2313-0105
doi 10.3390/batteries10120411
volume 10
issue 12
article_number 411
affiliation Majsa Ammouriova: Industrial Engineering Department, German Jordanian University, Amman 11180, Jordan
Antoni Guerrero: Research Center on Production Management and Engineering CIGIP, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain
Veronika Tsertsvadze: Research Center on Production Management and Engineering CIGIP, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain
Christin Schumacher: The Department of Business and Economics, TU Dortmund University, 44221 Dortmund, Germany
Angel A. Juan: Research Center on Production Management and Engineering CIGIP, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain
topic team orienteering problem
battery management
electric vehicle
reinforcement learning