PIDQA—Question Answering on Piping and Instrumentation Diagrams

Bibliographic Details
Main Authors: Mohit Gupta, Chialing Wei, Thomas Czerniawski, Ricardo Eiris
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Machine Learning and Knowledge Extraction
Online Access: https://www.mdpi.com/2504-4990/7/2/39
Description
Summary: This paper introduces a novel framework enabling natural language question answering on Piping and Instrumentation Diagrams (P&IDs), addressing a critical gap between engineering design documentation and intuitive information retrieval. Our approach transforms static P&IDs into queryable knowledge bases through a three-stage pipeline. First, we recognize entities in a P&ID image and organize their relationships to form a base entity graph. Second, this entity graph is converted into a Labeled Property Graph (LPG), enriched with semantic attributes for nodes and edges. Third, a Large Language Model (LLM)-based information retrieval system translates a user query into a graph query language (Cypher) and retrieves the answer by executing the query on the LPG. For our experiments, we augmented a publicly available P&ID image dataset with our novel PIDQA dataset, which comprises 64,000 question–answer pairs spanning four categories: (I) simple counting, (II) spatial counting, (III) spatial connections, and (IV) value-based questions. Our experiments (using gpt-3.5-turbo) demonstrate that grounding the LLM with dynamic few-shot sampling robustly elevates accuracy by 10.6–43.5% over schema contextualization alone, even under high lexical diversity conditions (e.g., paraphrasing, ambiguity). By reducing barriers in retrieving P&ID data, this work advances human–AI collaboration for industrial workflows in design validation and safety audits.
ISSN: 2504-4990
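
The summary contrasts schema contextualization alone with dynamic few-shot sampling but does not say how examples are selected. The sketch below is one possible reading, not the authors' implementation: it ranks a small pool of question and Cypher pairs by Jaccard token overlap with the user's question and places the top matches in the prompt after the schema. The schema string, the labels `Symbol` and `CONNECTED_TO`, the example pool, and all function names are assumptions made for illustration; no LLM is called.

```python
import re

# Hypothetical graph schema; the paper's actual LPG schema is not given here.
SCHEMA = "(:Symbol {id, class})-[:CONNECTED_TO]-(:Symbol)"

# Hypothetical pool of (question, Cypher) demonstrations.
EXAMPLE_POOL = [
    ("How many valves are in the diagram?",
     "MATCH (s:Symbol {class: 'valve'}) RETURN count(s)"),
    ("How many pumps are in the diagram?",
     "MATCH (s:Symbol {class: 'pump'}) RETURN count(s)"),
    ("Which symbols are connected to pump 3?",
     "MATCH (:Symbol {id: 3})-[:CONNECTED_TO]-(n:Symbol) RETURN n.id"),
    ("What is the text label of line 12?",
     "MATCH (l:Line {id: 12}) RETURN l.text"),
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_few_shot(question, pool, k=2):
    """Return the k pool entries whose questions best overlap the query."""
    q_tokens = tokenize(question)
    ranked = sorted(
        pool,
        key=lambda ex: jaccard(q_tokens, tokenize(ex[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, schema, pool, k=2):
    """Assemble a text prompt: instruction, schema, selected examples, query."""
    lines = ["Translate the question into a Cypher query.",
             "Schema: " + schema, ""]
    for ex_question, ex_cypher in select_few_shot(question, pool, k):
        lines += ["Question: " + ex_question, "Cypher: " + ex_cypher, ""]
    lines += ["Question: " + question, "Cypher:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("How many pumps are there?", SCHEMA, EXAMPLE_POOL))
```

Because the examples are chosen per question, a counting question pulls in counting demonstrations while a connectivity question would pull in path-style ones. A production system would more likely rank by embedding similarity than by token overlap; Jaccard is used here only to keep the sketch dependency-free.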