A versatile information retrieval framework for evaluating profile strength and similarity

Abstract Large-scale profiling assays capture a cell population’s state by measuring thousands of biological properties per cell or sample. However, evaluating profile strength and similarity remains challenging due to the high dimensionality and non-linear, heterogeneous nature of measurements. Her...

Bibliographic Details
Main Authors: Alexandr A. Kalinin, John Arevalo, Erik Serrano, Loan Vulliard, Hillary Tsang, Michael Bornholdt, Alán F. Muñoz, Suganya Sivagurunathan, Bartek Rajwa, Anne E. Carpenter, Gregory P. Way, Shantanu Singh
Format: Article
Language: English
Published: Nature Portfolio 2025-06-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-025-60306-2
author Alexandr A. Kalinin
John Arevalo
Erik Serrano
Loan Vulliard
Hillary Tsang
Michael Bornholdt
Alán F. Muñoz
Suganya Sivagurunathan
Bartek Rajwa
Anne E. Carpenter
Gregory P. Way
Shantanu Singh
collection DOAJ
description Abstract Large-scale profiling assays capture a cell population’s state by measuring thousands of biological properties per cell or sample. However, evaluating profile strength and similarity remains challenging due to the high dimensionality and non-linear, heterogeneous nature of measurements. Here, we develop a statistical framework using mean average precision (mAP) as a single, data-driven metric to address this challenge. We validate the mAP framework against established metrics through simulations and real-world data, revealing its ability to capture subtle and meaningful biological differences in cell state. Specifically, we use mAP to assess a sample’s phenotypic activity relative to controls, as well as the phenotypic consistency of groups of perturbations (or samples). We evaluate the framework across diverse datasets and on different profile types (image, protein, mRNA), perturbations (CRISPR, gene overexpression, small molecules), and resolutions (single-cell, bulk). The mAP framework, together with our open-source software package copairs, is useful for evaluating high-dimensional profiling data in biological research and drug discovery.
format Article
id doaj-art-459bc024d8e546809ac1fd32ecfa30c5
institution OA Journals
issn 2041-1723
language English
publishDate 2025-06-01
publisher Nature Portfolio
record_format Article
series Nature Communications
spelling doaj-art-459bc024d8e546809ac1fd32ecfa30c5; 2025-08-20T02:30:42Z; eng; Nature Portfolio; Nature Communications; ISSN 2041-1723; 2025-06-01; volume 16, issue 1, pages 1-17; DOI 10.1038/s41467-025-60306-2
Alexandr A. Kalinin: Imaging Platform, Broad Institute of MIT and Harvard
John Arevalo: Imaging Platform, Broad Institute of MIT and Harvard
Erik Serrano: Department of Biomedical Informatics, University of Colorado School of Medicine
Loan Vulliard: Systems Immunology and Single-Cell Biology, German Cancer Research Center (DKFZ)
Hillary Tsang: Imaging Platform, Broad Institute of MIT and Harvard
Michael Bornholdt: Imaging Platform, Broad Institute of MIT and Harvard
Alán F. Muñoz: Imaging Platform, Broad Institute of MIT and Harvard
Suganya Sivagurunathan: Imaging Platform, Broad Institute of MIT and Harvard
Bartek Rajwa: Bindley Bioscience Center, Purdue University
Anne E. Carpenter: Imaging Platform, Broad Institute of MIT and Harvard
Gregory P. Way: Department of Biomedical Informatics, University of Colorado School of Medicine
Shantanu Singh: Imaging Platform, Broad Institute of MIT and Harvard
title A versatile information retrieval framework for evaluating profile strength and similarity
url https://doi.org/10.1038/s41467-025-60306-2