A methodology for designing accurate, modifiable and reproducible scientific graphics in environmental studies using GPT4Designer

Bibliographic Details
Main Authors: Jingsi Gao, Yubo Shi, Ruoyu Wang, Jianfeng Zhou
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-00300-2
Description
Summary: The integration of text-to-image generation capabilities within GPT-4 allows for the convenient creation of various graphics. However, the proficiency of GPT-4 in crafting challenging scientific visuals remains largely unexplored. In this study, we conduct systematic experiments that employ multiple prompt engineering techniques with various supplementary materials to generate complex scientific illustrations for environmental studies. The locally enhanced electric field treatment for water disinfection is used as an example to illustrate the general behavior of GPT-4 in graphic creation. From the experiments, we conclude that existing prompting methods struggle with accuracy, modifiability, and reproducibility in scientific image generation. Based on the findings and insights drawn from the extensive experimental results, we develop GPT4Designer, a framework intended to generate scientific images without tedious prompt modifications. Specifically, the GPT4Designer framework uses a simple but surprisingly effective “envision-first” strategy that combines detailed prompting with guided envisioning. This strategy yields images with consistent styles aligned with the initial envisioning, significantly improving modifiability. In addition, by refining the conceptualization phase, we achieve much better control over the output, resulting in both high accuracy and reproducibility. This advancement is not only crucial for environmental scientists seeking to quickly produce engaging and accurate visuals (e.g., in only one step), but also demonstrates the existence of a “chain-of-thought” in image generation, which can inspire further work on the creative application of text-to-image generation models and tools.
ISSN: 2045-2322