Text2Layout: Layout Generation From Text Representation Using Transformer

Recent advanced Text-to-Image methods still require considerable effort to specify all object labels and detailed layouts needed to obtain an accurately planned image. Layout-based synthesis is an alternative that lets users control detailed composition directly, avoiding the trial-and-error often required in prompt-based editing. This paper proposes generating a layout from text instead of generating images directly. Our approach uses Transformer-based deep neural networks to synthesize scene representations of multiple objects. By focusing on layout information, we can produce an explainable layout of the objects the image includes. Our end-to-end approach uses parallel decoding, which differs from the sequential object prediction of conventional layout synthesis, and introduces post-processing of duplicate bounding boxes. We experimentally compare our method's quality and computational cost against existing ones, demonstrating its effectiveness and efficiency in generating layouts from textual representations. Combined with Layout-to-Image synthesis, this approach has significant practical implications, enabling practical authoring tools that make image generation explainable and computable with relatively lightweight networks.
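
To make the description above concrete, the following is a minimal, hypothetical sketch of a Transformer-based text-to-layout model that decodes all object slots in parallel. It is not the authors' implementation: the class name, dimensions, vocabulary size, learned object queries, and the (cx, cy, w, h) box parameterization are assumptions chosen for illustration only.

# Illustrative sketch (not the authors' code): a text-to-layout model with
# parallel decoding. A Transformer encoder reads a tokenized text prompt; a
# decoder with a fixed set of learned object queries predicts, in one pass,
# a class label and a normalized bounding box (cx, cy, w, h) for each query.
# A "no object" class lets unused queries stay empty.

import torch
import torch.nn as nn

class Text2LayoutSketch(nn.Module):
    def __init__(self, vocab_size=10000, num_classes=80, num_queries=20,
                 d_model=256, nhead=8, num_layers=4):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, d_model)
        self.pos_embed = nn.Embedding(512, d_model)         # text positions
        self.queries = nn.Embedding(num_queries, d_model)   # learned object queries
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True)
        self.class_head = nn.Linear(d_model, num_classes + 1)  # +1 = "no object"
        self.box_head = nn.Linear(d_model, 4)                   # (cx, cy, w, h)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded text description
        batch, seq_len = token_ids.shape
        positions = torch.arange(seq_len, device=token_ids.device)
        src = self.token_embed(token_ids) + self.pos_embed(positions)
        tgt = self.queries.weight.unsqueeze(0).expand(batch, -1, -1)
        # All object queries are decoded together (parallel decoding), unlike
        # autoregressive layout generators that emit objects one at a time.
        hs = self.transformer(src, tgt)
        return self.class_head(hs), self.box_head(hs).sigmoid()

if __name__ == "__main__":
    model = Text2LayoutSketch()
    fake_prompt = torch.randint(0, 10000, (2, 16))  # two dummy tokenized prompts
    logits, boxes = model(fake_prompt)
    print(logits.shape, boxes.shape)  # (2, 20, 81) and (2, 20, 4)

In such a setup every object slot is predicted in a single forward pass; near-duplicate predictions, which the abstract says are handled by post-processing, could then be filtered with an IoU-based deduplication step.
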

Bibliographic Details
Main Authors: Haruka Takahashi (ORCID: 0009-0008-8490-0968), Shigeru Kuriyama (ORCID: 0000-0001-5551-8112)
Affiliation: Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, Japan
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Access, vol. 12, pp. 136319–136328
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3452957
Collection: DOAJ
Subjects: Text-to-image; layout generation; creation support; transformer
Online Access: https://ieeexplore.ieee.org/document/10663081/