On convex decision regions in deep network representations

Bibliographic Details
Main Authors: Lenka Tětková, Thea Brüsch, Teresa Dorszewski, Fabian Martin Mager, Rasmus Ørtoft Aagaard, Jonathan Foldager, Tommy Sonne Alstrøm, Lars Kai Hansen
Format: Article
Language: English
Published: Nature Portfolio, 2025-07-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-025-60809-y
Collection: DOAJ
Abstract: Current work on human-machine alignment aims at understanding machine-learned latent spaces and their relations to human representations. We study the convexity of concept regions in machine-learned latent spaces, inspired by Gärdenfors’ conceptual spaces. In cognitive science, convexity is found to support generalization, few-shot learning, and interpersonal alignment. We develop tools to measure convexity in sampled data and evaluate it across layers of state-of-the-art deep networks. We show that convexity is robust to relevant latent space transformations and, hence, meaningful as a quality of machine-learned latent spaces. We find pervasive approximate convexity across domains, including image, text, audio, human activity, and medical data. Fine-tuning generally increases convexity, and the level of convexity of class label regions in pretrained models predicts subsequent fine-tuning performance. Our framework allows investigation of layered latent representations and offers new insights into learning mechanisms, human-machine alignment, and potential improvements in model generalization.
ISSN: 2041-1723
Affiliation: Section for Cognitive Systems, DTU Compute, Technical University of Denmark (all authors)
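The "tools to measure convexity in sampled data" mentioned in the abstract are not specified in this record. As one illustrative sketch (not the authors' published method), approximate Euclidean convexity of a class region in a latent space can be estimated by interpolating between same-class samples and checking whether the interpolated points are still assigned to that class; the classifier here is a simple nearest-class-mean rule, an assumption for illustration only:

```python
import numpy as np

def convexity_score(X, y, n_pairs=200, n_steps=5, seed=None):
    """Estimate Euclidean convexity of class regions (illustrative sketch).

    For random pairs of same-class points, walk along the connecting
    segment and check whether each interpolated point is still assigned
    to that class by a nearest-class-mean classifier. Returns the
    fraction of interpolated points that keep the class label (1.0 would
    indicate perfectly convex regions under this classifier).
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # One mean vector per class, used as a nearest-centroid classifier.
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    hits, total = 0, 0
    for _ in range(n_pairs):
        c = rng.choice(classes)
        idx = np.flatnonzero(y == c)
        i, j = rng.choice(idx, size=2, replace=False)
        # Interior points of the segment between the two samples.
        for t in np.linspace(0.0, 1.0, n_steps + 2)[1:-1]:
            z = (1 - t) * X[i] + t * X[j]
            pred = classes[np.argmin(((means - z) ** 2).sum(axis=1))]
            hits += int(pred == c)
            total += 1
    return hits / total
```

For two well-separated Gaussian blobs the score approaches 1.0, since every segment between same-class points stays inside that class's region; entangled or crescent-shaped classes would score lower.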