A novel family of beta mixture models for the differential analysis of DNA methylation data: An application to prostate cancer.

Bibliographic Details
Main Authors: Koyel Majumdar, Romina Silva, Antoinette Sabrina Perry, Ronald William Watson, Andrea Rau, Florence Jaffrezic, Thomas Brendan Murphy, Isobel Claire Gormley
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0314014
author Koyel Majumdar
Romina Silva
Antoinette Sabrina Perry
Ronald William Watson
Andrea Rau
Florence Jaffrezic
Thomas Brendan Murphy
Isobel Claire Gormley
collection DOAJ
description Identifying differentially methylated cytosine-guanine dinucleotide (CpG) sites between benign and tumour samples can assist in understanding disease. However, differential analysis of bounded DNA methylation data often requires data transformation, reducing biological interpretability. To address this, a family of beta mixture models (BMMs) is proposed that (i) objectively infers methylation state thresholds and (ii) identifies differentially methylated CpG sites (DMCs) given untransformed, beta-valued methylation data. The BMMs achieve this through model-based clustering of CpG sites and by employing parameter constraints, facilitating application to different study settings. Inference proceeds via an expectation-maximisation algorithm, with an approximate maximisation step providing tractability and computational feasibility. Performance of the BMMs is assessed through thorough simulation studies, and the BMMs are used for differential analyses of DNA methylation data from a prostate cancer study. Intuitive and biologically interpretable methylation state thresholds are inferred and DMCs are identified, including those related to genes such as GSTP1, RASSF1 and RARB, known for their role in prostate cancer development. Gene ontology analysis of the DMCs revealed significant enrichment in cancer-related pathways, demonstrating the utility of BMMs to reveal biologically relevant insights. An R package, betaclust, facilitates widespread use of BMMs.
format Article
id doaj-art-355bc68b07954fb7aff87b9ec9317124
institution DOAJ
issn 1932-6203
language English
publishDate 2024-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
title A novel family of beta mixture models for the differential analysis of DNA methylation data: An application to prostate cancer.
url https://doi.org/10.1371/journal.pone.0314014