Using Large Language Models for Aerospace Code Generation: Methods, Benchmarks, and Potential Values
| Main Authors: | Rui He, Liang Zhang, Mengyao Lyu, Liangqing Lyu, Changbin Xue |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Aerospace |
| Subjects: | aerospace; code generation; LLM; benchmark; retrieval augmented |
| Online Access: | https://www.mdpi.com/2226-4310/12/6/498 |
| author | Rui He; Liang Zhang; Mengyao Lyu; Liangqing Lyu; Changbin Xue |
|---|---|
| collection | DOAJ |
| description | Large Language Models (LLMs) have advanced rapidly in recent years and are reshaping many domains. In software development, LLM-based code generation has become a prominent research focus, yet its application in the aerospace sector remains at an early, exploratory stage. This paper examines LLM-based code generation methods and their potential applications in aerospace. It introduces RepoSpace, the first repository-level code-generation benchmark for spaceborne equipment. Comprising 825 samples from five real projects, the benchmark enables a more precise evaluation of LLM capabilities in aerospace scenarios. Evaluations of seven state-of-the-art LLMs on RepoSpace show that domain differences significantly affect code-generation performance: existing LLMs perform poorly on specialized repository-level aerospace tasks, markedly worse than on general-domain tasks. The study further demonstrates that Retrieval Augmented Generation (RAG) can effectively enhance LLM code generation, that appropriate prompt templates guide the models to better results, and that high-quality documentation strings are crucial to repository-level code-generation performance (see the illustrative sketch after this record). The work provides a reference for leveraging LLMs for code generation in the aerospace field, fostering technological innovation and progress in this critical domain. |
| format | Article |
| id | doaj-art-7b2df165985d461fad3d868ab8f3c50d |
| institution | Kabale University |
| issn | 2226-4310 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Aerospace |
| doi | 10.3390/aerospace12060498 (Aerospace, vol. 12, no. 6, art. 498, published 2025-05-01 by MDPI AG) |
| affiliation | Key Laboratory of Electronics and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China (all five authors) |
| title | Using Large Language Models for Aerospace Code Generation: Methods, Benchmarks, and Potential Values |
| topic | aerospace; code generation; LLM; benchmark; retrieval augmented |
| url | https://www.mdpi.com/2226-4310/12/6/498 |
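The abstract above identifies retrieval augmented generation (RAG), prompt templates, and high-quality docstrings as the main levers for improving repository-level code generation. The sketch below is a minimal, hypothetical illustration of how those three pieces can fit together in a single prompt-construction step; it is not the paper's actual pipeline or the RepoSpace tooling, and all names (`CodeSnippet`, `build_prompt`, the toy token-overlap retriever) are assumptions made for illustration.

```python
# Illustrative sketch only: retrieval-augmented prompt construction for
# repository-level code generation. Names and structure are hypothetical and
# do not come from the RepoSpace benchmark or the paper's implementation.
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CodeSnippet:
    path: str    # file the snippet was taken from
    source: str  # snippet text (signature, docstring, body)


def token_overlap(a: str, b: str) -> float:
    """Crude lexical similarity: shared-token ratio, standing in for a real retriever."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(1, len(ta | tb))


def retrieve(query: str, index: list[CodeSnippet], k: int = 3) -> list[CodeSnippet]:
    """RAG retrieval step: return the k snippets most similar to the query."""
    return sorted(index, key=lambda s: token_overlap(query, s.source), reverse=True)[:k]


PROMPT_TEMPLATE = """You are completing code in a spaceborne-software repository.

Relevant context retrieved from the repository:
{context}

Complete the following function. Its docstring describes the required behaviour:
{signature}
    \"\"\"{docstring}\"\"\"
"""


def build_prompt(signature: str, docstring: str, index: list[CodeSnippet]) -> str:
    """Assemble a prompt from retrieved snippets, the target signature, and its docstring."""
    hits = retrieve(signature + " " + docstring, index)
    context = "\n\n".join(f"# {s.path}\n{s.source}" for s in hits)
    return PROMPT_TEMPLATE.format(context=context, signature=signature, docstring=docstring)


if __name__ == "__main__":
    # Toy repository index; a real system would index the whole target repository.
    repo_index = [
        CodeSnippet("telemetry.py", "def decode_frame(raw: bytes) -> dict: ..."),
        CodeSnippet("attitude.py", "def quaternion_to_euler(q): ..."),
    ]
    prompt = build_prompt(
        "def decode_housekeeping(raw: bytes) -> dict:",
        "Decode a housekeeping telemetry frame into named fields.",
        repo_index,
    )
    print(prompt)  # this prompt would then be sent to whichever LLM is under evaluation
```

In practice, the lexical scorer would be replaced by embedding-based retrieval over the target repository, and the docstring placed in the template would be the high-quality documentation string the abstract singles out as important for repository-level tasks.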