Natural language access point to digital metal–organic polyhedra chemistry in The World Avatar
Metal–organic polyhedra (MOPs) are discrete, porous metal–organic assemblies known for their wide-ranging applications in separation, drug delivery, and catalysis. As part of The World Avatar (TWA) project—a universal and interoperable knowledge model—we have previously systematized known MOPs and e...
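| Main Authors: | Simon D. Rihm, Dan N. Tran, Aleksandar Kondinski, Laura Pascazio, Fabio Saluz, Xinhong Deng, Sebastian Mosbach, Jethro Akroyd, Markus Kraft |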
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Cambridge University Press, 2025-01-01 |
| Series: | Data-Centric Engineering |
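| Subjects: | dynamic knowledge graphs; metal–organic polyhedra; question-answering systems; retrieval-augmented generation |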
| Online Access: | https://www.cambridge.org/core/product/identifier/S2632673625000127/type/journal_article |
| _version_ | 1850069208336957440 |
|---|---|
| author | Simon D. Rihm Dan N. Tran Aleksandar Kondinski Laura Pascazio Fabio Saluz Xinhong Deng Sebastian Mosbach Jethro Akroyd Markus Kraft |
| author_sort | Simon D. Rihm |
| collection | DOAJ |
| description | Metal–organic polyhedra (MOPs) are discrete, porous metal–organic assemblies known for their wide-ranging applications in separation, drug delivery, and catalysis. As part of The World Avatar (TWA) project—a universal and interoperable knowledge model—we have previously systematized known MOPs and expanded the explorable MOP space with novel targets. Although these data are available via a complex query language, a more user-friendly interface is desirable to enhance accessibility. To address a similar challenge in other chemistry domains, the natural language question-answering system “Marie” has been developed; however, its scalability is limited due to its reliance on supervised fine-tuning, which hinders its adaptability to new knowledge domains. In this article, we introduce an enhanced database of MOPs and a first-of-its-kind question-answering system tailored for MOP chemistry. By augmenting TWA’s MOP database with geometry data, we enable the visualization of not just empirically verified MOP structures but also machine-predicted ones. In addition, we renovated Marie’s semantic parser to adopt in-context few-shot learning, allowing seamless interaction with TWA’s extensive MOP repository. These advancements significantly improve the accessibility and versatility of TWA, marking an important step toward accelerating and automating the development of reticular materials with the aid of digital assistants. |
| format | Article |
| id | doaj-art-c3fe60ec4e65461994783e9eba5ea0b3 |
| institution | DOAJ |
| issn | 2632-6736 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Cambridge University Press |
| record_format | Article |
| series | Data-Centric Engineering |
| spelling | doaj-art-c3fe60ec4e65461994783e9eba5ea0b3; 2025-08-20T02:47:49Z; eng; Cambridge University Press; Data-Centric Engineering; ISSN 2632-6736; 2025-01-01; vol. 6; DOI 10.1017/dce.2025.12; Natural language access point to digital metal–organic polyhedra chemistry in The World Avatar; Authors: Simon D. Rihm (https://orcid.org/0000-0001-8342-7269; Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK); Dan N. Tran (https://orcid.org/0000-0002-8980-7200; CARES, Cambridge Centre for Advanced Research and Education in Singapore, Singapore, Singapore); Aleksandar Kondinski (https://orcid.org/0000-0002-0559-0172; Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK); Laura Pascazio (https://orcid.org/0000-0003-4084-995X; CARES, Cambridge Centre for Advanced Research and Education in Singapore, Singapore, Singapore); Fabio Saluz (Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK; Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland); Xinhong Deng (CARES, Cambridge Centre for Advanced Research and Education in Singapore, Singapore, Singapore); Sebastian Mosbach (https://orcid.org/0000-0001-7018-9433; Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK; CARES, Cambridge Centre for Advanced Research and Education in Singapore, Singapore, Singapore; CMCL, Cambridge, UK); Jethro Akroyd (https://orcid.org/0000-0002-2143-8656; Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK; CARES, Cambridge Centre for Advanced Research and Education in Singapore, Singapore, Singapore; CMCL, Cambridge, UK); Markus Kraft (https://orcid.org/0000-0002-4293-8924; Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK; CARES, Cambridge Centre for Advanced Research and Education in Singapore, Singapore, Singapore; CMCL, Cambridge, UK); URL: https://www.cambridge.org/core/product/identifier/S2632673625000127/type/journal_article; Keywords: dynamic knowledge graphs; metal–organic polyhedra; question-answering systems; retrieval-augmented generation |
| title | Natural language access point to digital metal–organic polyhedra chemistry in The World Avatar |
| topic | dynamic knowledge graphs metal–organic polyhedra question-answering systems retrieval-augmented generation |
| url | https://www.cambridge.org/core/product/identifier/S2632673625000127/type/journal_article |