Text2Layout: Layout Generation From Text Representation Using Transformer
Recent advanced Text-to-Image methods still require considerable effort to specify all labels and detailed object layouts to obtain an accurately planned image. Layout-based synthesis is an alternative that lets users control detailed composition directly, avoiding the trial-and-error often required...
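The post-processing of duplicate bounding boxes mentioned in the abstract can be illustrated with a minimal sketch. This is a hypothetical IoU-based deduplication, not the authors' actual procedure; the function names `iou` and `dedup_boxes` and the threshold value are illustrative assumptions.

```python
def iou(a, b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def dedup_boxes(boxes, labels, iou_thresh=0.7):
    """Drop any box that overlaps an already-kept box of the same
    label above the IoU threshold (a simple NMS-like dedup)."""
    kept = []
    for box, label in zip(boxes, labels):
        if all(l != label or iou(box, b) < iou_thresh for b, l in kept):
            kept.append((box, label))
    return kept
```

A parallel decoder can emit near-identical boxes for the same object; a greedy same-label IoU filter like this is one common way to clean such output.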
Main Authors: | Haruka Takahashi, Shigeru Kuriyama |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2024-01-01 |
Series: | IEEE Access |
Subjects: | Text-to-image; layout generation; creation support; transformer |
Online Access: | https://ieeexplore.ieee.org/document/10663081/ |
_version_ | 1841533395097616384 |
---|---|
author | Haruka Takahashi Shigeru Kuriyama |
author_facet | Haruka Takahashi Shigeru Kuriyama |
author_sort | Haruka Takahashi |
collection | DOAJ |
description | Recent advanced Text-to-Image methods still require considerable effort to specify all labels and detailed object layouts to obtain an accurately planned image. Layout-based synthesis is an alternative that lets users control detailed composition directly, avoiding the trial-and-error often required in prompt-based editing. This paper proposes generating a layout from text instead of generating images directly. Our approach uses Transformer-based deep neural networks to synthesize scene representations of multiple objects. By focusing on layout information, we can produce an explainable layout of the objects the image includes. Our end-to-end approach uses parallel decoding, which differs from conventional layout synthesis based on sequential object predictions, and introduces post-processing of duplicate bounding boxes. We experimentally compare our method’s quality and computational cost against existing ones, demonstrating its effectiveness and efficiency in generating layouts from textual representations. Combined with Layout-to-Image, this approach has significant practical implications, enabling practical authoring tools that make image generation explainable and computable with relatively lightweight networks. |
format | Article |
id | doaj-art-313028909967407191f841374a47831e |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-313028909967407191f841374a47831e; 2025-01-16T00:02:14Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; vol. 12, pp. 136319–136328; doi: 10.1109/ACCESS.2024.3452957; 10663081; Text2Layout: Layout Generation From Text Representation Using Transformer; Haruka Takahashi (https://orcid.org/0009-0008-8490-0968), Shigeru Kuriyama (https://orcid.org/0000-0001-5551-8112), Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, Japan; https://ieeexplore.ieee.org/document/10663081/; Text-to-image; layout generation; creation support; transformer |
spellingShingle | Haruka Takahashi; Shigeru Kuriyama; Text2Layout: Layout Generation From Text Representation Using Transformer; IEEE Access; Text-to-image; layout generation; creation support; transformer |
title | Text2Layout: Layout Generation From Text Representation Using Transformer |
title_full | Text2Layout: Layout Generation From Text Representation Using Transformer |
title_fullStr | Text2Layout: Layout Generation From Text Representation Using Transformer |
title_full_unstemmed | Text2Layout: Layout Generation From Text Representation Using Transformer |
title_short | Text2Layout: Layout Generation From Text Representation Using Transformer |
title_sort | text2layout layout generation from text representation using transformer |
topic | Text-to-image; layout generation; creation support; transformer |
url | https://ieeexplore.ieee.org/document/10663081/ |
work_keys_str_mv | AT harukatakahashi text2layoutlayoutgenerationfromtextrepresentationusingtransformer AT shigerukuriyama text2layoutlayoutgenerationfromtextrepresentationusingtransformer |