Teaching deep networks to see shape: Lessons from a simplified visual world.

Deep neural networks have been remarkably successful as models of the primate visual system. One crucial problem is that they fail to account for the strong shape-dependence of primate vision. Whereas humans base their judgements of category membership to a large extent on shape, deep networks rely...

Bibliographic Details
Main Authors: Christian Jarvers, Heiko Neumann
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-11-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012019
Collection: DOAJ
Description: Deep neural networks have been remarkably successful as models of the primate visual system. One crucial problem is that they fail to account for the strong shape-dependence of primate vision. Whereas humans base their judgements of category membership to a large extent on shape, deep networks rely much more strongly on other features such as color and texture. While this problem has been widely documented, the underlying reasons remain unclear. We design simple, artificial image datasets in which shape, color, and texture features can be used to predict the image class. By training networks from scratch to classify images with single features and feature combinations, we show that some network architectures are unable to learn to use shape features, whereas others are able to use shape in principle but are biased towards the other features. We show that the bias can be explained by the interactions between the weight updates for many images in mini-batch gradient descent. This suggests that different learning algorithms with sparser, more local weight changes are required to make networks more sensitive to shape and improve their capability to describe human vision.
ISSN: 1553-734X, 1553-7358
Citation: PLoS Computational Biology 20(11): e1012019 (2024-11-01)