BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval
Building Information Modeling (BIM) has excellent potential to enhance building operation and maintenance. However, as a standardized data format in the architecture, engineering, and construction (AEC) industry, the retrieval of BIM information generally requires specialized software. Cumbersome so...
| Main Authors: | Bingru Liu, Hainan Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Subjects: | BIM information retrieval; large language model; natural language based BIM operation |
| Online Access: | https://www.mdpi.com/2076-3417/15/14/7647 |
| _version_ | 1850077026646491136 |
|---|---|
| author | Bingru Liu; Hainan Chen |
| author_facet | Bingru Liu; Hainan Chen |
| author_sort | Bingru Liu |
| collection | DOAJ |
| description | Building Information Modeling (BIM) has excellent potential to enhance building operation and maintenance. However, because BIM is a standardized data format in the architecture, engineering, and construction (AEC) industry, retrieving BIM information generally requires specialized software, and cumbersome software operations hinder its effective use in the day-to-day operation and management of buildings. This paper presents BIMCoder, a model designed to translate natural language queries into structured query statements compatible with professional BIM software (e.g., BIMserver v1.5). It serves as an intermediary between users and various BIM platforms, giving users without specialized BIM knowledge access to BIM information. A dedicated BIM information query dataset was constructed, comprising 1680 pairs of natural language queries and structured BIM query strings, categorized into 12 groups. Three pre-trained large language models (LLMs), ERNIE 3.0, Llama-13B, and SQLCoder, were evaluated on this dataset. A model fine-tuned from SQLCoder was then trained, and a fusion model (BIMCoder) integrating ERNIE and SQLCoder was subsequently designed. Test results show that BIMCoder achieves an accurate matching rate of 87.16% and an Execution Accuracy rate of 88.75% for natural language-based BIM information retrieval. This study confirms the feasibility of natural language-based BIM information retrieval and offers a novel solution for reducing the complexity of BIM system interaction. |
| format | Article |
| id | doaj-art-a711b8f651484fdc8dcd4456c8471851 |
| institution | DOAJ |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-a711b8f651484fdc8dcd4456c8471851; 2025-08-20T02:45:53Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-07-01; 15; 14; 7647; 10.3390/app15147647; BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval; Bingru Liu [0]; Hainan Chen [1]; [0] College of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; [1] College of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; Building Information Modeling (BIM) has excellent potential to enhance building operation and maintenance. However, as a standardized data format in the architecture, engineering, and construction (AEC) industry, the retrieval of BIM information generally requires specialized software. Cumbersome software operations prevent its effective application in the actual operation and management of buildings. This paper presents BIMCoder, a model designed to translate natural language queries into structured query statements compatible with professional BIM software (e.g., BIMserver v1.5). It serves as an intermediary component between users and various BIM platforms, facilitating access for users without specialized BIM knowledge. A dedicated BIM information query dataset was constructed, comprising 1680 natural language query and structured BIM query string pairs, categorized into 12 groups. Three classical pre-trained large language models (LLMs) (ERNIE 3.0, Llama-13B, and SQLCoder) were evaluated on this dataset. A fine-tuned model based on SQLCoder was then trained. Subsequently, a fusion model (BIMCoder) integrating ERNIE and SQLCoder was designed. Test results demonstrate that the proposed BIMCoder model achieves an outstanding accurate matching rate of 87.16% and an Execution Accuracy rate of 88.75% for natural language-based BIM information retrieval. This study confirms the feasibility of natural language-based BIM information retrieval and offers a novel solution to reduce the complexity of BIM system interaction.; https://www.mdpi.com/2076-3417/15/14/7647; BIM information retrieval; large language model; natural language based BIM operation |
| spellingShingle | Bingru Liu; Hainan Chen; BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval; Applied Sciences; BIM information retrieval; large language model; natural language based BIM operation |
| title | BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval |
| title_full | BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval |
| title_fullStr | BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval |
| title_full_unstemmed | BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval |
| title_short | BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval |
| title_sort | bimcoder a comprehensive large language model fusion framework for natural language based bim information retrieval |
| topic | BIM information retrieval; large language model; natural language based BIM operation |
| url | https://www.mdpi.com/2076-3417/15/14/7647 |
| work_keys_str_mv | AT bingruliu bimcoderacomprehensivelargelanguagemodelfusionframeworkfornaturallanguagebasedbiminformationretrieval AT hainanchen bimcoderacomprehensivelargelanguagemodelfusionframeworkfornaturallanguagebasedbiminformationretrieval |
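
The description above reports two figures for BIMCoder: an accurate matching rate (87.16%) and an Execution Accuracy rate (88.75%). The record does not define either metric, so the following is only a minimal sketch of how such metrics are commonly computed over pairs of predicted and reference query strings. The function names, the whitespace normalization, the toy JSON-style query strings, and the `toy_execute` stand-in for a BIM backend are all assumptions made for illustration, not the authors' implementation or the BIMserver query format.

```python
import json
from typing import Callable, Sequence


def normalize(query: str) -> str:
    """Collapse runs of whitespace so pure formatting differences are ignored."""
    return " ".join(query.split())


def exact_match_rate(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of predictions whose normalized text equals the reference text."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predicted, reference))
    return hits / len(reference)


def execution_accuracy(
    predicted: Sequence[str],
    reference: Sequence[str],
    execute: Callable[[str], object],
) -> float:
    """Fraction of predictions whose execution result equals the reference's.

    `execute` is a hypothetical callable that runs a query string against
    some backend; a query that raises is counted as a miss.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = 0
    for p, r in zip(predicted, reference):
        try:
            same = execute(p) == execute(r)
        except Exception:
            same = False
        hits += int(same)
    return hits / len(reference)


def toy_execute(query: str) -> tuple:
    """Toy stand-in for a BIM backend: parse JSON, return order-independent items."""
    return tuple(sorted(json.loads(query).items()))


# Hypothetical query strings, invented for this sketch.
predicted = [
    '{"type": "IfcWall"}',
    '{"floor": 2, "type": "IfcDoor"}',  # same meaning, different key order
    '{"type": "IfcWindow"',             # malformed: cannot be executed
]
reference = [
    '{"type": "IfcWall"}',
    '{"type": "IfcDoor", "floor": 2}',
    '{"type": "IfcWindow"}',
]

print(exact_match_rate(predicted, reference))
print(execution_accuracy(predicted, reference, toy_execute))
```

In this toy data the second prediction differs from its reference only in key order, so it fails the textual comparison but produces the same execution result. That is one plausible reason an execution-based score can exceed a string-matching score, as the two reported figures do.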