Leveraging two-dimensional pre-trained vision transformers for three-dimensional model generation via masked autoencoders
Abstract Although the Transformer architecture has established itself as the industry standard for jobs involving natural language processing, it still has few uses in computer vision. In vision, attention is used in conjunction with convolutional networks or to replace individual convolutional netw...
Saved in:
Main Authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Nature Portfolio
2025-01-01
|
Series: | Scientific Reports |
Subjects: | |
Online Access: | https://doi.org/10.1038/s41598-025-87376-y |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|