A multiomics dataset of paired CT image and plasma cell-free DNA end motif for patients with pulmonary nodules

Bibliographic Details
Main Authors: Mengmeng Zhao, Gang Xue, Bingxi He, Jiajun Deng, Tingting Wang, Yifan Zhong, Shenghui Li, Yang Wang, Yiming He, Tao Chen, Jun Zhang, Ziyue Yan, Xinlei Hu, Liuning Guo, Wendong Qu, Yongxiang Song, Minglei Yang, Guofang Zhao, Bentong Yu, Minjie Ma, Lunxu Liu, Xiwen Sun, Deping Zhao, Dan Xie, Chang Chen, Yunlang She
Format: Article
Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-04912-1
Description
Summary: Diagnosing lung cancer at a curable stage offers the opportunity for a favorable prognosis. Emerging epigenomic analysis of plasma cell-free DNA (cfDNA), including 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) modifications, is a promising approach to identifying lung cancer. Moreover, integrating 5mC biomarkers with chest computed tomography (CT) image features could improve the diagnosis of lung cancer, exceeding the performance of models built on a single type of feature. However, the clinical applicability of integrated markers may be limited by the risk of overfitting due to small sample sizes. Hence, we prospectively collected peripheral blood samples and paired chest CT images from 2032 patients with indeterminate pulmonary nodules across 5 centers, and constructed a large-scale, multi-institutional, multiomics database that encompasses CT imaging data and plasma cfDNA fragmentomics in 5mC- and 5hmC-enriched regions. To the best of our knowledge, this is the first radio-epigenomic dataset and the largest in sample size. It provides multi-dimensional insights for the early diagnosis of lung cancer and facilitates individualized management of the disease.
ISSN:2052-4463