Cross modality learning of cell painting and transcriptomics data improves mechanism of action clustering and bioactivity modelling
Abstract In drug discovery, different data modalities (chemical structure, cell biology, quantum mechanics, etc.) are abundant, and their integration can help with understanding aspects of chemistry, biology, and their interactions. Within cell biology, cell painting (CP) and transcriptomics RNA-Seq...
| Main Authors: | Son V. Ha, Steffen Jaensch, Maciej M. Kańduła, Dorota Herman, Paul Czodrowski, Hugo Ceulemans |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-05914-0 |
| _version_ | 1849399908990189568 |
|---|---|
| author | Son V. Ha, Steffen Jaensch, Maciej M. Kańduła, Dorota Herman, Paul Czodrowski, Hugo Ceulemans |
| author_sort | Son V. Ha |
| collection | DOAJ |
| description | Abstract: In drug discovery, different data modalities (chemical structure, cell biology, quantum mechanics, etc.) are abundant, and their integration can help with understanding aspects of chemistry, biology, and their interactions. Within cell biology, cell painting (CP) and transcriptomics RNA-Seq (TX) screens are powerful tools in early drug discovery, as they provide complementary views of the biological effect of compounds on a population of cells post-treatment. While multimodal learning has been explored for chemical structure with cell painting and for different omics data, a multimodal model of cell painting and bulk transcriptomics is still unexplored. In this work, we benchmark two representation learning methods: contrastive learning and a bimodal autoencoder. We use the setting of cross-modality learning, where representation learning is performed with two modalities (CP and TX), but only cell painting is available for generating embeddings of new compounds and for downstream tasks. This is because, for new compounds, we would only have CP data and not TX, owing to the high data generation cost of the RNA-Seq screen. We show that, in the absence of TX features for new compounds, using learned embeddings such as those obtained from contrastive learning enhances the performance of CP features on tasks where TX features excel but CP features do not. Additionally, we observe that the learned representation improves cluster quality when clustering CP replicates and different mechanisms of action (MoA), and improves performance on several subsets of bioactivity tasks grouped by protein target family. |
| format | Article |
| id | doaj-art-70d1bd14d5c74aa2bd118dbbb00851ec |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-70d1bd14d5c74aa2bd118dbbb00851ec; 2025-08-20T03:38:13Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-07-01; 151111; 10.1038/s41598-025-05914-0; Cross modality learning of cell painting and transcriptomics data improves mechanism of action clustering and bioactivity modelling; Son V. Ha (Johnson & Johnson); Steffen Jaensch (Johnson & Johnson); Maciej M. Kańduła (Johnson & Johnson); Dorota Herman (Johnson & Johnson); Paul Czodrowski (Department of Chemistry, Johannes Gutenberg University Mainz); Hugo Ceulemans (Johnson & Johnson); https://doi.org/10.1038/s41598-025-05914-0 |
| title | Cross modality learning of cell painting and transcriptomics data improves mechanism of action clustering and bioactivity modelling |
| url | https://doi.org/10.1038/s41598-025-05914-0 |