Self-Prioritizing Multi-Agent Reinforcement Learning for Conflict Resolution in Air Traffic Control with Limited Instructions

Air traffic control (ATC) relies on a series of complex tasks, the most crucial of which is assuring safe separation between aircraft. Because air traffic is increasing, decision support systems and safe, robust automation of ATC tasks are of high value. Automated conflict resolution has...

Bibliographic Details
Main Authors: Jens Nilsson, Jonas Unger, Gabriel Eilertsen
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Aerospace
Subjects: air traffic control; conflict resolution; reinforcement learning
Online Access: https://www.mdpi.com/2226-4310/12/2/88
author Jens Nilsson
Jonas Unger
Gabriel Eilertsen
author_sort Jens Nilsson
collection DOAJ
description Air traffic control (ATC) relies on a series of complex tasks, the most crucial of which is assuring safe separation between aircraft. Because air traffic is increasing, decision support systems and safe, robust automation of ATC tasks are of high value. Automated conflict resolution has been an active area of research for decades, and in recent years reinforcement learning has been suggested as a powerful alternative to traditional algorithms. Reinforcement learning methods with discrete action spaces often require large action spaces to cover all combinations of actions, which can make them difficult to train. Models with continuous action spaces, on the other hand, require much lower dimensionality but often learn to solve conflicts by using a large number of exceedingly small actions. This makes them more suitable for decentralized ATC, such as in unmanned or free-flight airspace. In this paper, we present a novel multi-agent reinforcement learning method with a continuous action space that significantly reduces the number of actions by means of a learning-based priority mechanism. We demonstrate how this can keep the number of actions to a minimum while successfully resolving conflicts with little overhead in the distance required for the aircraft to reach their exit points. As such, the proposed solution is well suited for centralized ATC, where the number of directives that can be transmitted to aircraft is limited.
format Article
id doaj-art-523903f4f6ed441fb6fcb539e2ab815f
institution DOAJ
issn 2226-4310
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Aerospace
spelling doaj-art-523903f4f6ed441fb6fcb539e2ab815f
2025-08-20T02:44:47Z
eng
MDPI AG
Aerospace
2226-4310
2025-01-01
vol. 12, no. 2, art. 88
10.3390/aerospace12020088
Jens Nilsson, Department of Science and Technology, Linköping University, 581 83 Linköping, Sweden
Jonas Unger, Department of Science and Technology, Linköping University, 581 83 Linköping, Sweden
Gabriel Eilertsen, Department of Science and Technology, Linköping University, 581 83 Linköping, Sweden
https://www.mdpi.com/2226-4310/12/2/88
title Self-Prioritizing Multi-Agent Reinforcement Learning for Conflict Resolution in Air Traffic Control with Limited Instructions
topic air traffic control
conflict resolution
reinforcement learning
url https://www.mdpi.com/2226-4310/12/2/88