Cross modality learning of cell painting and transcriptomics data improves mechanism of action clustering and bioactivity modelling

Abstract In drug discovery, different data modalities (chemical structure, cell biology, quantum mechanics, etc.) are abundant, and their integration can help with understanding aspects of chemistry, biology, and their interactions. Within cell biology, cell painting (CP) and transcriptomics RNA-Seq...

Bibliographic Details
Main Authors: Son V. Ha, Steffen Jaensch, Maciej M. Kańduła, Dorota Herman, Paul Czodrowski, Hugo Ceulemans
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-05914-0
collection DOAJ
description Abstract In drug discovery, different data modalities (chemical structure, cell biology, quantum mechanics, etc.) are abundant, and their integration can help with understanding aspects of chemistry, biology, and their interactions. Within cell biology, cell painting (CP) and transcriptomics RNA-Seq (TX) screens are powerful tools in early drug discovery, as they provide complementary views of the biological effect of compounds on a population of cells post-treatment. While multimodal learning of chemical structure and cell painting, or of different omics data, has been explored, a multimodal model of cell painting and bulk transcriptomics remains unexplored. In this work, we benchmark two representation learning methods: contrastive learning and a bimodal autoencoder. We use the setting of cross-modality learning, where representation learning is performed with two modalities (CP and TX), but only cell painting is available for generating embeddings of new compounds and for downstream tasks. This is because for new compounds we would only have CP data and not TX, due to the high data generation cost of the RNA-Seq screen. We show that, in the absence of TX features for new compounds, using learned embeddings such as those obtained from contrastive learning enhances the performance of CP features on tasks where TX features excel but CP features do not. Additionally, we observed that learned representations improve cluster quality when clustering CP replicates and different mechanisms of action (MoA), and improve performance on several subsets of bioactivity tasks grouped by protein target families.
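The cross-modality setting described in the abstract can be illustrated with a minimal sketch. The abstract does not state which contrastive objective the authors use, so the symmetric InfoNCE-style loss below is an assumption, and every name in it (`symmetric_contrastive_loss`, `embed_cp_only`, `cp_weights`, the temperature value) is hypothetical. The data are random toy arrays, not results from the paper. The sketch shows the two phases: paired CP and TX embeddings are pulled together during training, while embeddings for new compounds are computed from CP features alone.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def symmetric_contrastive_loss(cp_emb, tx_emb, temperature=0.1):
    """InfoNCE-style loss (assumed objective, not taken from the paper).

    Row i of cp_emb and row i of tx_emb are taken to describe the same
    compound; all other rows in the batch act as negatives.
    """
    cp = l2_normalize(cp_emb)
    tx = l2_normalize(tx_emb)
    logits = cp @ tx.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(cp))                  # matching pairs lie on the diagonal
    loss_cp_to_tx = -log_softmax(logits)[idx, idx].mean()
    loss_tx_to_cp = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_cp_to_tx + loss_tx_to_cp)

def embed_cp_only(cp_features, cp_weights):
    """Inference for new compounds: only the CP encoder is applied.

    A single linear map stands in for a trained CP encoder (hypothetical).
    """
    return l2_normalize(cp_features @ cp_weights)

# Toy data: both modalities are noisy views of a shared latent signal.
rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 4))
cp_emb = shared + 0.01 * rng.normal(size=(8, 4))
tx_emb = shared + 0.01 * rng.normal(size=(8, 4))

aligned_loss = symmetric_contrastive_loss(cp_emb, tx_emb)
shuffled_loss = symmetric_contrastive_loss(cp_emb, tx_emb[::-1])  # mismatched pairs

# New compounds have CP features only; no TX input is needed at this stage.
new_cp_features = rng.normal(size=(3, 10))
cp_weights = rng.normal(size=(10, 4))
new_embeddings = embed_cp_only(new_cp_features, cp_weights)
```

In this toy setup the loss is lower for correctly paired rows than for mismatched ones, which is the signal a trained CP encoder would exploit. The bimodal autoencoder mentioned in the abstract is a different approach and is not covered by this sketch.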
id doaj-art-70d1bd14d5c74aa2bd118dbbb00851ec
institution Kabale University
issn 2045-2322
Author Affiliations: Son V. Ha (Johnson & Johnson); Steffen Jaensch (Johnson & Johnson); Maciej M. Kańduła (Johnson & Johnson); Dorota Herman (Johnson & Johnson); Paul Czodrowski (Department of Chemistry, Johannes Gutenberg University Mainz); Hugo Ceulemans (Johnson & Johnson)
Citation: Scientific Reports, vol. 15, no. 1, pp. 1-11 (2025-07-01)