EM-DeepSD: A Deep Neural Network Model Based on Cell-Free DNA End-Motif Signal Decomposition for Cancer Diagnosis

Bibliographic Details
Main Authors: Zhi-Yang Zhao, Chang-Ling Huang, Tong-Min Wang, Shi-Hao Zhou, Lu Pei, Wen-Hui Jia, Wei-Hua Jia
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Diagnostics
Online Access: https://www.mdpi.com/2075-4418/15/9/1156
Description
Summary: <b>Background and Objectives:</b> Accurately discriminating between patients with and without cancer using their cell-free DNA (cfDNA) is crucial for early cancer diagnosis. The end-motifs of cfDNA are significant cancer biomarkers with strong diagnostic potential. This study proposes EM-DeepSD, a signal decomposition deep learning framework based on cfDNA end-motifs that aims to improve the accuracy of cancer diagnosis and to adapt to different sequencing modalities. <b>Materials and Methods:</b> This study included 146 patients diagnosed with cancer and 122 non-cancer controls. EM-DeepSD comprises three core modules. First, a signal decomposition module decomposes and reconstructs the input end-motif profiles, generating multiple regular subsequences that optimize the subsequent modeling process. A machine learning module and a deep learning module are then employed to improve the accuracy of cancer diagnosis. Based on the EM-DeepSD framework, we developed the EM-DeepSSA model and compared it with two existing benchmark methods across different cfDNA sequencing datasets. <b>Results:</b> In the internal validation set, EM-DeepSSA outperformed the two benchmark methods for cancer diagnosis (area under the curve (AUC), 0.920; adjusted <i>p</i> value < 0.05). EM-DeepSSA also performed best on two independent external testing sets generated by 5-hydroxymethylcytosine sequencing (5hmCS) and broad-range cell-free DNA sequencing (BR-cfDNA-Seq), respectively (test set-1: AUC = 0.933; test set-2: AUC = 0.956; adjusted <i>p</i> value < 0.05). <b>Conclusions:</b> We present a new framework that achieves high classification performance in cancer diagnosis and is applicable to different sequencing modalities.
ISSN: 2075-4418
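
The signal decomposition step described in the summary can be illustrated with a minimal sketch. The abstract does not name the decomposition algorithm or the motif length, so two assumptions are made here: that an end-motif profile is the frequency vector of the 256 possible 4-mer fragment-end motifs, and that the "SSA" in EM-DeepSSA refers to singular spectrum analysis. The function names `end_motif_profile` and `ssa_decompose`, the window length, and the synthetic input are hypothetical and are not taken from the authors' implementation.

```python
import itertools
import numpy as np

# Assumption: end-motifs are 4-mers, giving 256 motifs in lexicographic order.
MOTIFS = ["".join(p) for p in itertools.product("ACGT", repeat=4)]
MOTIF_INDEX = {m: i for i, m in enumerate(MOTIFS)}


def end_motif_profile(fragment_ends):
    """Return the 256-dimensional frequency vector of 4-mer end-motifs."""
    counts = np.zeros(len(MOTIFS))
    for motif in fragment_ends:
        idx = MOTIF_INDEX.get(motif.upper())
        if idx is not None:  # skip motifs with N or with the wrong length
            counts[idx] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def ssa_decompose(series, window, n_components):
    """Split a 1-D series into additive subsequences with basic SSA.

    Assumption: singular spectrum analysis stands in for the paper's
    decomposition module; the abstract does not specify the algorithm.
    """
    series = np.asarray(series, dtype=float)
    k = series.size - window + 1
    # Trajectory (Hankel) matrix: column j holds series[j : j + window].
    traj = np.column_stack([series[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(min(n_components, s.size)):
        elementary = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: each anti-diagonal maps back to one sample.
        flipped = elementary[::-1]
        comp = np.array(
            [flipped.diagonal(d).mean() for d in range(-window + 1, k)]
        )
        components.append(comp)
    return np.array(components)


if __name__ == "__main__":
    # Synthetic fragment ends, for illustration only (not real cfDNA data).
    rng = np.random.default_rng(0)
    ends = rng.choice(MOTIFS, size=10_000)
    profile = end_motif_profile(ends)
    subsequences = ssa_decompose(profile, window=8, n_components=3)
    print(subsequences.shape)
```

Each row of the returned array is one reconstructed subsequence with the same length as the input profile. When all components are kept, the rows sum back to the original profile, so a downstream model could in principle be given either the leading subsequences or the full set.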