BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval

Building Information Modeling (BIM) has excellent potential to enhance building operation and maintenance. However, as a standardized data format in the architecture, engineering, and construction (AEC) industry, the retrieval of BIM information generally requires specialized software. Cumbersome so...

Bibliographic Details
Main Authors: Bingru Liu, Hainan Chen
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Applied Sciences
Subjects: BIM information retrieval; large language model; natural language based BIM operation
Online Access: https://www.mdpi.com/2076-3417/15/14/7647
_version_ 1850077026646491136
author Bingru Liu
Hainan Chen
collection DOAJ
description Building Information Modeling (BIM) has excellent potential to enhance building operation and maintenance. However, as a standardized data format in the architecture, engineering, and construction (AEC) industry, the retrieval of BIM information generally requires specialized software. Cumbersome software operations prevent its effective application in the actual operation and management of buildings. This paper presents BIMCoder, a model designed to translate natural language queries into structured query statements compatible with professional BIM software (e.g., BIMserver v1.5). It serves as an intermediary component between users and various BIM platforms, facilitating access for users without specialized BIM knowledge. A dedicated BIM information query dataset was constructed, comprising 1680 natural language query and structured BIM query string pairs, categorized into 12 groups. Three classical pre-trained large language models (LLMs) (ERNIE 3.0, Llama-13B, and SQLCoder) were evaluated on this dataset. A fine-tuned model based on SQLCoder was then trained. Subsequently, a fusion model (BIMCoder) integrating ERNIE and SQLCoder was designed. Test results demonstrate that the proposed BIMCoder model achieves an outstanding accurate matching rate of 87.16% and an Execution Accuracy rate of 88.75% for natural language-based BIM information retrieval. This study confirms the feasibility of natural language-based BIM information retrieval and offers a novel solution to reduce the complexity of BIM system interaction.
format Article
id doaj-art-a711b8f651484fdc8dcd4456c8471851
institution DOAJ
issn 2076-3417
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-a711b8f651484fdc8dcd4456c8471851; 2025-08-20T02:45:53Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-07-01; 15; 14; 7647; 10.3390/app15147647; BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval; Bingru Liu (0); Hainan Chen (1); (0) College of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; (1) College of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; https://www.mdpi.com/2076-3417/15/14/7647; BIM information retrieval; large language model; natural language based BIM operation
title BIMCoder: A Comprehensive Large Language Model Fusion Framework for Natural Language-Based BIM Information Retrieval
topic BIM information retrieval
large language model
natural language based BIM operation
url https://www.mdpi.com/2076-3417/15/14/7647