Two antagonistic objectives for one multi-scale graph clustering framework
Abstract In the current state of knowledge, there is no consensus on an objective criterion for evaluating network communities as cohesive sets of nodes with the following two properties: $$\textbf{P}_{\textbf{DC}}:$$ Each community is Densely Connected; $$\textbf{P}_{\textbf{WC}}:$$ Communities are...
Saved in:
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
Nature Portfolio
2025-04-01
|
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-90454-w |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Abstract In the current state of knowledge, there is no consensus on an objective criterion for evaluating network communities as cohesive sets of nodes with the following two properties: $$\textbf{P}_{\textbf{DC}}:$$ Each community is Densely Connected; $$\textbf{P}_{\textbf{WC}}:$$ Communities are Weakly Connected to each other. This makes it difficult to conduct comparative studies between dozens of graph clustering methods proposed over more than 20 years. To fill this gap: We propose a graph clustering framework by faithfully formalizing $$\textbf{P}_{DC}$$ with precision and $$\textbf{P}_{WC}$$ with recall, which are two meaningful metrics, simple, well known and already widely used for many tasks in most sciences. The meaning of these metrics in the context of graph clustering is therefore easily interpretable by most users of real-world graphs. We show that for most graphs, these two metrics are antagonistic, i.e. there is no solution that simultaneously maximizes precision and recall. In other words, to select a clustering among the Pareto optimal solutions (clusterings such that no other clustering exist that both increases the precision and the recall) we must first make a subjective compromise, according to our needs between the two properties $$\textbf{P}_{DC}$$ and $$\textbf{P}_{WC}$$ . We then show how to use this framework to compare, even without ‘ground truth’, the performances of five hitherto incommensurable state-of-the-art clustering methods, as well as that of a new family of clustering methods inspired by our approach. |
|---|---|
| ISSN: | 2045-2322 |