Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries
This paper addresses the team orienteering problem (TOP) with vehicles equipped with electric batteries under dynamic travel conditions influenced by weather and traffic, which impact travel times between nodes and hence might have a critical effect on the battery capacity to cover the planned route...
| Main Authors: | Majsa Ammouriova, Antoni Guerrero, Veronika Tsertsvadze, Christin Schumacher, Angel A. Juan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Batteries |
| Subjects: | team orienteering problem, battery management, electric vehicle, reinforcement learning |
| Online Access: | https://www.mdpi.com/2313-0105/10/12/411 |
| _version_ | 1850036348114698240 |
|---|---|
| author | Majsa Ammouriova, Antoni Guerrero, Veronika Tsertsvadze, Christin Schumacher, Angel A. Juan |
| author_facet | Majsa Ammouriova, Antoni Guerrero, Veronika Tsertsvadze, Christin Schumacher, Angel A. Juan |
| author_sort | Majsa Ammouriova |
| collection | DOAJ |
| description | This paper addresses the team orienteering problem (TOP) with vehicles equipped with electric batteries under dynamic travel conditions influenced by weather and traffic, which impact travel times between nodes and hence might have a critical effect on the battery capacity to cover the planned route. The study incorporates a novel approach for solving the dynamic TOP, comparing two solution methodologies: a merging heuristic and a reinforcement learning (RL) algorithm. The heuristic combines routes using calculated savings and a biased-randomized strategy, while the RL model leverages a transformer-based encoder–decoder architecture to sequentially construct solutions. We perform computational experiments on 50 problem instances, each subjected to 200 dynamic conditions, for a total of 10,000 problems solved. The results demonstrate that while the deterministic heuristic provides an upper bound for rewards, the RL model consistently yields robust solutions with lower variability under dynamic conditions. However, the dynamic heuristic, with a 20 s time limit for solving each instance, outperformed the RL model by 3.35% on average. The study highlights the trade-offs between solution quality, computational resources, and time when dealing with dynamic environments in the TOP. |
| format | Article |
| id | doaj-art-ffe54b9cbca3404fb6b4ad7751e5ed93 |
| institution | DOAJ |
| issn | 2313-0105 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Batteries |
| spelling | doaj-art-ffe54b9cbca3404fb6b4ad7751e5ed93; 2025-08-20T02:57:12Z; eng; MDPI AG; Batteries; ISSN 2313-0105; 2024-11-01; vol. 10, no. 12, art. 411; DOI 10.3390/batteries10120411; Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries; Majsa Ammouriova (Industrial Engineering Department, German Jordanian University, Amman 11180, Jordan); Antoni Guerrero (Research Center on Production Management and Engineering CIGIP, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain); Veronika Tsertsvadze (Research Center on Production Management and Engineering CIGIP, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain); Christin Schumacher (The Department of Business and Economics, TU Dortmund University, 44221 Dortmund, Germany); Angel A. Juan (Research Center on Production Management and Engineering CIGIP, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain); https://www.mdpi.com/2313-0105/10/12/411; team orienteering problem; battery management; electric vehicle; reinforcement learning |
| spellingShingle | Majsa Ammouriova; Antoni Guerrero; Veronika Tsertsvadze; Christin Schumacher; Angel A. Juan; Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries; Batteries; team orienteering problem; battery management; electric vehicle; reinforcement learning |
| title | Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries |
| title_full | Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries |
| title_fullStr | Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries |
| title_full_unstemmed | Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries |
| title_short | Using Reinforcement Learning in a Dynamic Team Orienteering Problem with Electric Batteries |
| title_sort | using reinforcement learning in a dynamic team orienteering problem with electric batteries |
| topic | team orienteering problem, battery management, electric vehicle, reinforcement learning |
| url | https://www.mdpi.com/2313-0105/10/12/411 |
| work_keys_str_mv | AT majsaammouriova usingreinforcementlearninginadynamicteamorienteeringproblemwithelectricbatteries AT antoniguerrero usingreinforcementlearninginadynamicteamorienteeringproblemwithelectricbatteries AT veronikatsertsvadze usingreinforcementlearninginadynamicteamorienteeringproblemwithelectricbatteries AT christinschumacher usingreinforcementlearninginadynamicteamorienteeringproblemwithelectricbatteries AT angelajuan usingreinforcementlearninginadynamicteamorienteeringproblemwithelectricbatteries |