Using Large Language Models for Aerospace Code Generation: Methods, Benchmarks, and Potential Values
| Main Authors: | Rui He, Liang Zhang, Mengyao Lyu, Liangqing Lyu, Changbin Xue |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Aerospace |
| Subjects: | aerospace; code generation; LLM; benchmark; retrieval augmented |
| Online Access: | https://www.mdpi.com/2226-4310/12/6/498 |
| author | Rui He; Liang Zhang; Mengyao Lyu; Liangqing Lyu; Changbin Xue |
|---|---|
| collection | DOAJ |
| description | Large Language Models (LLMs) have advanced rapidly in recent years and are reshaping many domains. In software development, LLM-based code generation has become a prominent research focus, yet its application in the aerospace sector remains at an early, exploratory stage. This paper examines LLM-based code generation methods and their potential applications in aerospace. It introduces RepoSpace, the first repository-level code-generation benchmark for spaceborne equipment. Comprising 825 samples from five real projects, the benchmark enables a more precise evaluation of LLM capabilities in aerospace scenarios. Evaluations of seven state-of-the-art LLMs on RepoSpace show that domain differences significantly affect code-generation performance: existing LLMs perform poorly on specialized repository-level aerospace tasks, markedly worse than on general-domain tasks. The study further demonstrates that Retrieval Augmented Generation (RAG) can effectively enhance LLM code generation, that appropriate prompt templates guide the models to better results, and that high-quality documentation strings are crucial to repository-level code-generation performance (see the illustrative sketch after this record). The work provides a reference for leveraging LLMs for code generation in the aerospace field, fostering technological innovation and progress in this critical domain. |
| format | Article |
| id | doaj-art-7b2df165985d461fad3d868ab8f3c50d |
| institution | Kabale University |
| issn | 2226-4310 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Aerospace |
| doi | 10.3390/aerospace12060498 (Aerospace, vol. 12, no. 6, art. 498, published 2025-05-01 by MDPI AG) |
| affiliation | Key Laboratory of Electronics and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China (all five authors) |
| title | Using Large Language Models for Aerospace Code Generation: Methods, Benchmarks, and Potential Values |
| topic | aerospace; code generation; LLM; benchmark; retrieval augmented |
| url | https://www.mdpi.com/2226-4310/12/6/498 |
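The abstract above identifies retrieval augmented generation (RAG), prompt templates, and high-quality docstrings as the main levers for improving repository-level code generation. The sketch below is a minimal, hypothetical illustration of how those three pieces can fit together in a single prompt-construction step; it is not the paper's actual pipeline or the RepoSpace tooling, and all names (`CodeSnippet`, `build_prompt`, the toy token-overlap retriever) are assumptions made for illustration.

```python
# Illustrative sketch only: retrieval-augmented prompt construction for
# repository-level code generation. Names and structure are hypothetical and
# do not come from the RepoSpace benchmark or the paper's implementation.
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CodeSnippet:
    path: str    # file the snippet was taken from
    source: str  # snippet text (signature, docstring, body)


def token_overlap(a: str, b: str) -> float:
    """Crude lexical similarity: shared-token ratio, standing in for a real retriever."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(1, len(ta | tb))


def retrieve(query: str, index: list[CodeSnippet], k: int = 3) -> list[CodeSnippet]:
    """RAG retrieval step: return the k snippets most similar to the query."""
    return sorted(index, key=lambda s: token_overlap(query, s.source), reverse=True)[:k]


PROMPT_TEMPLATE = """You are completing code in a spaceborne-software repository.

Relevant context retrieved from the repository:
{context}

Complete the following function. Its docstring describes the required behaviour:
{signature}
    \"\"\"{docstring}\"\"\"
"""


def build_prompt(signature: str, docstring: str, index: list[CodeSnippet]) -> str:
    """Assemble a prompt from retrieved snippets, the target signature, and its docstring."""
    hits = retrieve(signature + " " + docstring, index)
    context = "\n\n".join(f"# {s.path}\n{s.source}" for s in hits)
    return PROMPT_TEMPLATE.format(context=context, signature=signature, docstring=docstring)


if __name__ == "__main__":
    # Toy repository index; a real system would index the whole target repository.
    repo_index = [
        CodeSnippet("telemetry.py", "def decode_frame(raw: bytes) -> dict: ..."),
        CodeSnippet("attitude.py", "def quaternion_to_euler(q): ..."),
    ]
    prompt = build_prompt(
        "def decode_housekeeping(raw: bytes) -> dict:",
        "Decode a housekeeping telemetry frame into named fields.",
        repo_index,
    )
    print(prompt)  # this prompt would then be sent to whichever LLM is under evaluation
```

In practice, the lexical scorer would be replaced by embedding-based retrieval over the target repository, and the docstring placed in the template would be the high-quality documentation string the abstract singles out as important for repository-level tasks.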