Using Large Language Models for Aerospace Code Generation: Methods, Benchmarks, and Potential Values

Bibliographic Details
Main Authors: Rui He, Liang Zhang, Mengyao Lyu, Liangqing Lyu, Changbin Xue
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Aerospace
Subjects: aerospace; code generation; LLM; benchmark; retrieval augmented generation
Online Access:https://www.mdpi.com/2226-4310/12/6/498
Description: In recent years, Large Language Models (LLMs) have advanced rapidly and are transforming many domains. In software development, LLM-powered code generation has become a prominent research focus, yet its application in the aerospace sector is still at an early, exploratory stage. This paper examines LLM-based code generation methods and their potential applications in aerospace. It introduces RepoSpace, the first repository-level code generation benchmark for spaceborne-equipment software. Built from 825 samples drawn from five real projects, the benchmark enables a more precise evaluation of LLMs’ capabilities in aerospace scenarios. Extensive evaluation of seven state-of-the-art LLMs on RepoSpace shows that domain differences significantly affect code generation performance: existing LLMs perform poorly on specialized repository-level code generation tasks for aerospace, markedly below their performance on general-domain tasks. The study further demonstrates that Retrieval-Augmented Generation (RAG) can effectively enhance LLMs’ code generation capabilities, that appropriate prompt templates guide the models to better results, and that high-quality documentation strings (docstrings) are crucial for improving performance on repository-level code generation tasks. This work provides a reference for applying LLMs to code generation in the aerospace field, supporting technological innovation and progress in this critical domain.
ISSN: 2226-4310
DOI: 10.3390/aerospace12060498 (Aerospace, vol. 12, no. 6, article 498, published 2025-05-01)
Author affiliations: Rui He, Liang Zhang, Mengyao Lyu, Liangqing Lyu, and Changbin Xue are all with the Key Laboratory of Electronics and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China.
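
The abstract attributes the main gains to three levers: retrieval-augmented generation, well-chosen prompt templates, and high-quality docstrings. The sketch below is a minimal illustration of how those levers can be combined when prompting an LLM for repository-level code generation; it is not the paper’s RepoSpace pipeline. The repository path, the lexical (Jaccard) retrieval, the prompt wording, and the `decode_telemetry_frame` target are hypothetical placeholders, and a production setup would typically replace the lexical scorer with embedding-based retrieval and send the assembled prompt to an actual model.

```python
"""Illustrative sketch (assumptions only, not the paper's method): build a
retrieval-augmented, docstring-bearing prompt for repository-level code generation."""
from pathlib import Path
import re


def tokenize(text: str) -> set[str]:
    """Lowercased identifier/word tokens, used for a simple lexical similarity."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    return len(a & b) / len(a | b) if (a | b) else 0.0


def retrieve_context(repo_dir: str, query: str, k: int = 3) -> list[str]:
    """Rank repository source files against the query; return top-k truncated snippets."""
    scored = []
    for path in Path(repo_dir).rglob("*.py"):  # adjust the glob for C/C++ flight software
        code = path.read_text(errors="ignore")
        scored.append((jaccard(tokenize(query), tokenize(code)), str(path), code))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [f"# File: {name}\n{code[:1500]}" for _, name, code in scored[:k]]


def build_prompt(signature: str, docstring: str, context: list[str]) -> str:
    """Assemble a prompt template: retrieved context + target signature + docstring."""
    joined = "\n\n".join(context) if context else "# (no context retrieved)"
    return (
        "You are completing a function inside an existing spaceborne-software repository.\n"
        "Relevant repository context:\n"
        f"{joined}\n\n"
        "Complete the function below so it stays consistent with the context.\n"
        f"{signature}\n"
        f'    """{docstring}"""\n'
    )


if __name__ == "__main__":
    # Hypothetical target task; the path and names are placeholders.
    signature = "def decode_telemetry_frame(frame: bytes) -> dict:"
    docstring = "Parse a CCSDS-style telemetry frame and return its header fields and payload."
    snippets = retrieve_context("path/to/spaceborne_repo", signature + " " + docstring)
    print(build_prompt(signature, docstring, snippets))  # send this prompt to the LLM of choice
```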