Two antagonistic objectives for one multi-scale graph clustering framework

Abstract In the current state of knowledge, there is no consensus on an objective criterion for evaluating network communities as cohesive sets of nodes with the following two properties: $$\textbf{P}_{\textbf{DC}}:$$ Each community is Densely Connected; $$\textbf{P}_{\textbf{WC}}:$$ Communities are...

Full description

Saved in:
Bibliographic Details
Main Authors: Bruno Gaume, Ixandra Achitouv, David Chavalarias
Format: Article
Language:English
Published: Nature Portfolio 2025-04-01
Series:Scientific Reports
Online Access:https://doi.org/10.1038/s41598-025-90454-w
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1850146275990700032
author Bruno Gaume
Ixandra Achitouv
David Chavalarias
author_facet Bruno Gaume
Ixandra Achitouv
David Chavalarias
author_sort Bruno Gaume
collection DOAJ
description Abstract In the current state of knowledge, there is no consensus on an objective criterion for evaluating network communities as cohesive sets of nodes with the following two properties: $$\textbf{P}_{\textbf{DC}}:$$ Each community is Densely Connected; $$\textbf{P}_{\textbf{WC}}:$$ Communities are Weakly Connected to each other. This makes it difficult to conduct comparative studies between dozens of graph clustering methods proposed over more than 20 years. To fill this gap: We propose a graph clustering framework by faithfully formalizing $$\textbf{P}_{DC}$$ with precision and $$\textbf{P}_{WC}$$ with recall, which are two meaningful metrics, simple, well known and already widely used for many tasks in most sciences. The meaning of these metrics in the context of graph clustering is therefore easily interpretable by most users of real-world graphs. We show that for most graphs, these two metrics are antagonistic, i.e. there is no solution that simultaneously maximizes precision and recall. In other words, to select a clustering among the Pareto optimal solutions (clusterings such that no other clustering exist that both increases the precision and the recall) we must first make a subjective compromise, according to our needs between the two properties $$\textbf{P}_{DC}$$ and $$\textbf{P}_{WC}$$ . We then show how to use this framework to compare, even without ‘ground truth’, the performances of five hitherto incommensurable state-of-the-art clustering methods, as well as that of a new family of clustering methods inspired by our approach.
format Article
id doaj-art-3f506bac69b34fc58f4abf9803518436
institution OA Journals
issn 2045-2322
language English
publishDate 2025-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-3f506bac69b34fc58f4abf98035184362025-08-20T02:27:53ZengNature PortfolioScientific Reports2045-23222025-04-0115111710.1038/s41598-025-90454-wTwo antagonistic objectives for one multi-scale graph clustering frameworkBruno Gaume0Ixandra Achitouv1David Chavalarias2Cognition, Langues, Langage, Ergonomie (CLLE, UMR 5263), CNRSComplex Systems Institute of Paris île-de-France (ISC-PIF, UAR3611)Centre d’Analyse et de Mathématique Sociales (CAMS, UMR8557)Abstract In the current state of knowledge, there is no consensus on an objective criterion for evaluating network communities as cohesive sets of nodes with the following two properties: $$\textbf{P}_{\textbf{DC}}:$$ Each community is Densely Connected; $$\textbf{P}_{\textbf{WC}}:$$ Communities are Weakly Connected to each other. This makes it difficult to conduct comparative studies between dozens of graph clustering methods proposed over more than 20 years. To fill this gap: We propose a graph clustering framework by faithfully formalizing $$\textbf{P}_{DC}$$ with precision and $$\textbf{P}_{WC}$$ with recall, which are two meaningful metrics, simple, well known and already widely used for many tasks in most sciences. The meaning of these metrics in the context of graph clustering is therefore easily interpretable by most users of real-world graphs. We show that for most graphs, these two metrics are antagonistic, i.e. there is no solution that simultaneously maximizes precision and recall. In other words, to select a clustering among the Pareto optimal solutions (clusterings such that no other clustering exist that both increases the precision and the recall) we must first make a subjective compromise, according to our needs between the two properties $$\textbf{P}_{DC}$$ and $$\textbf{P}_{WC}$$ . We then show how to use this framework to compare, even without ‘ground truth’, the performances of five hitherto incommensurable state-of-the-art clustering methods, as well as that of a new family of clustering methods inspired by our approach.https://doi.org/10.1038/s41598-025-90454-w
spellingShingle Bruno Gaume
Ixandra Achitouv
David Chavalarias
Two antagonistic objectives for one multi-scale graph clustering framework
Scientific Reports
title Two antagonistic objectives for one multi-scale graph clustering framework
title_full Two antagonistic objectives for one multi-scale graph clustering framework
title_fullStr Two antagonistic objectives for one multi-scale graph clustering framework
title_full_unstemmed Two antagonistic objectives for one multi-scale graph clustering framework
title_short Two antagonistic objectives for one multi-scale graph clustering framework
title_sort two antagonistic objectives for one multi scale graph clustering framework
url https://doi.org/10.1038/s41598-025-90454-w
work_keys_str_mv AT brunogaume twoantagonisticobjectivesforonemultiscalegraphclusteringframework
AT ixandraachitouv twoantagonisticobjectivesforonemultiscalegraphclusteringframework
AT davidchavalarias twoantagonisticobjectivesforonemultiscalegraphclusteringframework