Bigpicc: a graph-based approach to identifying carcinogenic gene combinations from mutation data

Abstract Genome data from cancer patients represents relationships between the presence of a gene mutation and cancer occurrence in a patient. Different types of cancer in humans are thought to be caused by combinations of two to nine gene mutations. Identifying these combinations through traditional...
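
The abstract states that exhaustive search scales exponentially with combination size. As a rough illustration only (not the authors' code), the sketch below counts the candidate gene sets for combinations of two to nine mutations. The gene count `NUM_GENES` is a hypothetical round figure chosen for the example, and `count_combinations` is an illustrative helper name; neither comes from the record.

```python
from math import comb


def count_combinations(num_genes: int, hits: int) -> int:
    """Number of distinct gene sets of size `hits` drawn from `num_genes` genes."""
    return comb(num_genes, hits)


# Hypothetical gene count, assumed for illustration; not taken from the paper.
NUM_GENES = 20_000

if __name__ == "__main__":
    # Combination sizes two to nine, the range mentioned in the abstract.
    for hits in range(2, 10):
        total = count_combinations(NUM_GENES, hits)
        print(f"{hits}-hit candidate combinations: {total:.3e}")
```

When the number of genes is much larger than the combination size, the count grows roughly as the gene count raised to the power of the combination size, which is why each additional hit multiplies the search space by several orders of magnitude.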

Bibliographic Details
Main Authors: Vladyslav Oles, Sajal Dash, Ramu Anandakrishnan
Format: Article
Language: English
Published: BMC 2025-06-01
Series: BMC Bioinformatics
Subjects: Community detection; Binary classification; Driver mutations; TCGA
Online Access: https://doi.org/10.1186/s12859-025-06043-1
author Vladyslav Oles
Sajal Dash
Ramu Anandakrishnan
collection DOAJ
description Abstract Genome data from cancer patients represents relationships between the presence of a gene mutation and cancer occurrence in a patient. Different types of cancer in humans are thought to be caused by combinations of two to nine gene mutations. Identifying these combinations through traditional exhaustive search requires an amount of computation that scales exponentially with the combination size and in most cases is intractable even for cutting-edge supercomputers. We propose a parameter-free heuristic approach that leverages the intrinsic topology of gene-patient mutations to identify carcinogenic combinations. The biological relevance of the identified combinations is measured by using them to predict the presence of a tumor in previously unseen samples. The resulting classifiers for 16 cancer types perform on par with exhaustive search results and achieve an average of 80.1% sensitivity and 91.6% specificity for the best choice of hit range per cancer type. Our approach can find higher-hit carcinogenic combinations that would take years of computation to identify using exhaustive search.
format Article
id doaj-art-c2dba821632e47cbb669f1eebd86002b
institution OA Journals
issn 1471-2105
language English
publishDate 2025-06-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj-art-c2dba821632e47cbb669f1eebd86002b
2025-08-20T02:05:39Z
eng
BMC
BMC Bioinformatics
1471-2105
2025-06-01
volume 26, issue 1, pages 1-18
10.1186/s12859-025-06043-1
Bigpicc: a graph-based approach to identifying carcinogenic gene combinations from mutation data
Vladyslav Oles (Oak Ridge National Laboratory, National Center for Computational Sciences)
Sajal Dash (Oak Ridge National Laboratory, National Center for Computational Sciences)
Ramu Anandakrishnan (Virginia Tech, Edward Via College of Osteopathic Medicine)
https://doi.org/10.1186/s12859-025-06043-1
Community detection
Binary classification
Driver mutations
TCGA
title Bigpicc: a graph-based approach to identifying carcinogenic gene combinations from mutation data
topic Community detection
Binary classification
Driver mutations
TCGA
url https://doi.org/10.1186/s12859-025-06043-1