Agent-based multimodal information extraction for nanomaterials

Abstract Automating structured data extraction from scientific literature is a critical challenge with broad implications across domains. We introduce nanoMINER, a multi-agent system combining large language models and multimodal analysis to extract essential information from scientific research art...

Bibliographic Details
Main Authors: R. Odobesku, K. Romanova, S. Mirzaeva, O. Zagorulko, R. Sim, R. Khakimullin, J. Razlivina, A. Dmitrenko, V. Vinogradov
Format: Article
Language: English
Published: Nature Portfolio 2025-06-01
Series: npj Computational Materials
Online Access: https://doi.org/10.1038/s41524-025-01674-7
collection DOAJ
description Abstract Automating structured data extraction from scientific literature is a critical challenge with broad implications across domains. We introduce nanoMINER, a multi-agent system combining large language models and multimodal analysis to extract essential information from scientific research articles on nanomaterials. This system processes documents end-to-end, utilizing tools such as YOLO for visual data extraction and GPT-4o for linking textual and visual information. At its core, the ReAct agent orchestrates specialized agents to ensure comprehensive data extraction. We demonstrate the efficacy of the system by automating the assembly of nanomaterial and nanozyme datasets previously manually curated by domain experts. NanoMINER achieves high precision in extracting nanomaterial properties like chemical formulas, crystal systems, and surface characteristics. For nanozymes, we obtain near-perfect precision (0.98) for kinetic parameters and essential features such as Cmin and Cmax. To benchmark the system performance, we also compare nanoMINER to several baseline LLMs, including the most recent multimodal GPT-4.1, and show consistently higher extraction precision and recall. Our approach is extensible to other domains of materials science and fields like biomedicine, advancing data-driven research methodologies and automated knowledge extraction.
id doaj-art-bfd235ff404a482f97c081f52c5db5b8
institution OA Journals
issn 2057-3960
spelling doaj-art-bfd235ff404a482f97c081f52c5db5b8 2025-08-20T02:38:14Z eng Nature Portfolio npj Computational Materials 2057-3960 2025-06-01 11111 10.1038/s41524-025-01674-7 Agent-based multimodal information extraction for nanomaterials
R. Odobesku (AI Talent Hub, ITMO University)
K. Romanova (AI Talent Hub, ITMO University)
S. Mirzaeva (Moscow State University)
O. Zagorulko (AI Talent Hub, ITMO University)
R. Sim (AI Talent Hub, ITMO University)
R. Khakimullin (AI Talent Hub, ITMO University)
J. Razlivina (Center for AI in Chemistry, ITMO University)
A. Dmitrenko (Center for AI in Chemistry, ITMO University)
V. Vinogradov (Center for AI in Chemistry, ITMO University)
https://doi.org/10.1038/s41524-025-01674-7