PIDQA—Question Answering on Piping and Instrumentation Diagrams
| Main Authors: | Mohit Gupta, Chialing Wei, Thomas Czerniawski, Ricardo Eiris |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Machine Learning and Knowledge Extraction |
| Subjects: | P&ID; information retrieval; knowledge graphs; question answering; RAG; Neo4j |
| Online Access: | https://www.mdpi.com/2504-4990/7/2/39 |
| _version_ | 1849705966097924096 |
|---|---|
| author | Mohit Gupta, Chialing Wei, Thomas Czerniawski, Ricardo Eiris |
| author_sort | Mohit Gupta |
| collection | DOAJ |
| description | This paper introduces a novel framework enabling natural language question answering on Piping and Instrumentation Diagrams (P&IDs), addressing a critical gap between engineering design documentation and intuitive information retrieval. Our approach transforms static P&IDs into queryable knowledge bases through a three-stage pipeline. First, we recognize entities in a P&ID image and organize their relationships to form a base entity graph. Second, this entity graph is converted into a Labeled Property Graph (LPG), enriched with semantic attributes for nodes and edges. Third, a Large Language Model (LLM)-based information retrieval system translates a user query into a graph query language (Cypher) and retrieves the answer by executing it on LPG. For our experiments, we augmented a publicly available P&ID image dataset with our novel PIDQA dataset, which comprises 64,000 question–answer pairs spanning four categories: (I) simple counting, (II) spatial counting, (III) spatial connections, and (IV) value-based questions. Our experiments (using gpt-3.5-turbo) demonstrate that grounding the LLM with dynamic few-shot sampling robustly elevates accuracy by 10.6–43.5% over schema contextualization alone, even under high lexical diversity conditions (e.g., paraphrasing, ambiguity). By reducing barriers in retrieving P&ID data, this work advances human–AI collaboration for industrial workflows in design validation and safety audits. |
| format | Article |
| id | doaj-art-3384a1280c04463bbaa3e5fe7b757bff |
| institution | DOAJ |
| issn | 2504-4990 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Machine Learning and Knowledge Extraction |
| spelling | doaj-art-3384a1280c04463bbaa3e5fe7b757bff; 2025-08-20T03:16:19Z; eng; MDPI AG; Machine Learning and Knowledge Extraction; ISSN 2504-4990; 2025-04-01; vol. 7, no. 2, art. 39; DOI 10.3390/make7020039; Mohit Gupta, Chialing Wei, Thomas Czerniawski, Ricardo Eiris (all: School of Sustainable Engineering and the Built Environment, Arizona State University, Tempe, AZ 85287-1404, USA); https://www.mdpi.com/2504-4990/7/2/39; P&ID, information retrieval, knowledge graphs, question answering, RAG, Neo4j |
| title | PIDQA—Question Answering on Piping and Instrumentation Diagrams |
| title_sort | pidqa question answering on piping and instrumentation diagrams |
| topic | P&ID; information retrieval; knowledge graphs; question answering; RAG; Neo4j |
| url | https://www.mdpi.com/2504-4990/7/2/39 |
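
The description above mentions grounding the LLM with dynamic few-shot sampling before it translates a question into Cypher. The record does not say how examples are selected, so the following is only a minimal sketch under stated assumptions: similarity is plain token-overlap (Jaccard), and the schema string, the `Symbol` label, the `CONNECTED_TO` relationship, the property names, and the example question/Cypher pairs are all hypothetical illustrations rather than the authors' schema or data. It builds a prompt only; it does not call an LLM or a Neo4j database.

```python
import re

# Hypothetical schema summary for a labeled property graph built from a P&ID.
SCHEMA = "(:Symbol {id, symbol_class, region})-[:CONNECTED_TO]-(:Symbol)"

# Hypothetical pool of solved question -> Cypher examples.
EXAMPLE_POOL = [
    {"question": "How many valves are in the diagram?",
     "cypher": "MATCH (s:Symbol {symbol_class: 'valve'}) RETURN count(s)"},
    {"question": "Which symbols are connected to symbol 12?",
     "cypher": "MATCH (:Symbol {id: 12})-[:CONNECTED_TO]-(n:Symbol) RETURN n.id"},
    {"question": "How many pumps are in the top left of the diagram?",
     "cypher": "MATCH (s:Symbol {symbol_class: 'pump', region: 'top-left'}) "
               "RETURN count(s)"},
]


def tokens(text):
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; an assumed stand-in for any retriever."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_few_shots(question, pool, k=2):
    """Pick the k pool examples most similar to the incoming question."""
    ranked = sorted(pool, key=lambda ex: jaccard(question, ex["question"]),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, schema, shots):
    """Assemble schema context plus the selected examples into one prompt."""
    lines = ["Translate the question into a Cypher query.",
             f"Graph schema: {schema}", ""]
    for ex in shots:
        lines += [f"Question: {ex['question']}", f"Cypher: {ex['cypher']}", ""]
    lines += [f"Question: {question}", "Cypher:"]
    return "\n".join(lines)


if __name__ == "__main__":
    user_question = "How many valves are there in total?"
    shots = select_few_shots(user_question, EXAMPLE_POOL, k=2)
    print(build_prompt(user_question, SCHEMA, shots))
```

Because the examples are chosen per question, a counting question pulls in counting examples while a connectivity question would pull in the `CONNECTED_TO` example; an embedding-based retriever could replace `jaccard` without changing the rest of the sketch.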